1x/ 2x/ 4x RPI 5 8gb Llama 7b/13b #17
serralva-ruben started this conversation in Results
Nice! 3 tokens/second for Llama 13B. What switch/router have you used?
PS: I ran each configuration only once.
1x RPI 5 8gb Llama 7b
🔶 G 420 ms I 419 ms T 1 ms S 0 kB R 0 kB Hello
🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB world
🔶 G 446 ms I 434 ms T 12 ms S 0 kB R 0 kB !
🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB The
🔶 G 406 ms I 405 ms T 0 ms S 0 kB R 0 kB weather
🔶 G 407 ms I 407 ms T 0 ms S 0 kB R 0 kB was
🔶 G 443 ms I 435 ms T 8 ms S 0 kB R 0 kB pleasant
🔶 G 440 ms I 432 ms T 8 ms S 0 kB R 0 kB in
🔶 G 448 ms I 435 ms T 12 ms S 0 kB R 0 kB the
🔶 G 445 ms I 436 ms T 8 ms S 0 kB R 0 kB morning
🔶 G 445 ms I 437 ms T 8 ms S 0 kB R 0 kB ,
🔶 G 438 ms I 430 ms T 8 ms S 0 kB R 0 kB and
🔶 G 446 ms I 438 ms T 8 ms S 0 kB R 0 kB the
🔶 G 447 ms I 434 ms T 12 ms S 0 kB R 0 kB sun
🔶 G 449 ms I 441 ms T 8 ms S 0 kB R 0 kB was
🔶 G 448 ms I 436 ms T 12 ms S 0 kB R 0 kB sh
Generated tokens: 16
Avg generation time: 438.38 ms
Avg inference time: 430.44 ms
Avg transfer time: 7.69 ms
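For anyone who wants to re-check the averages, here is a minimal sketch that parses the per-token lines and recomputes the means. It assumes G, I and T are the generation, inference and transfer times in ms (matching the "Avg ..." summary lines); the S and R columns are ignored. The names `parse_timings` and `averages` are made up for this example and are not part of the tool that produced the log. Only the first three lines of the run above are used as sample input.

```python
import re
from statistics import mean

# One per-token line looks like: "G 420 ms I 419 ms T 1 ms S 0 kB R 0 kB Hello".
# Assumption: G = generation, I = inference, T = transfer, all in milliseconds.
LINE_RE = re.compile(r"G\s+(\d+) ms I\s+(\d+) ms T\s+(\d+) ms")

def parse_timings(log_text):
    """Return a list of (generation_ms, inference_ms, transfer_ms) tuples."""
    return [tuple(int(x) for x in m.groups()) for m in LINE_RE.finditer(log_text)]

def averages(rows):
    """Return the mean generation, inference and transfer time over all rows."""
    gen, inf, xfer = zip(*rows)
    return mean(gen), mean(inf), mean(xfer)

# First three lines of the 1x RPI 5 8gb Llama 7b run, copied from the log.
sample = """\
🔶 G 420 ms I 419 ms T 1 ms S 0 kB R 0 kB Hello
🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB world
🔶 G 446 ms I 434 ms T 12 ms S 0 kB R 0 kB !
"""

rows = parse_timings(sample)
avg_gen, avg_inf, avg_xfer = averages(rows)
print(f"tokens: {len(rows)}")
print(f"avg generation: {avg_gen:.2f} ms, inference: {avg_inf:.2f} ms, transfer: {avg_xfer:.2f} ms")
```

Pasting a full 16-line run into `sample` should reproduce the "Avg generation time" figure for that run.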
2x RPI 5 8gb Llama 7b
🔶 G 291 ms I 257 ms T 34 ms S 1779278 kB R 522 kB Hello
🔶 G 263 ms I 228 ms T 35 ms S 590 kB R 522 kB world
🔶 G 301 ms I 257 ms T 44 ms S 590 kB R 522 kB ,
🔶 G 308 ms I 268 ms T 40 ms S 590 kB R 522 kB I
🔶 G 305 ms I 256 ms T 49 ms S 590 kB R 522 kB '
🔶 G 303 ms I 253 ms T 49 ms S 590 kB R 522 kB m
🔶 G 264 ms I 228 ms T 35 ms S 590 kB R 522 kB An
🔶 G 300 ms I 256 ms T 44 ms S 590 kB R 522 kB kit
🔶 G 305 ms I 257 ms T 48 ms S 590 kB R 522 kB D
🔶 G 304 ms I 257 ms T 47 ms S 590 kB R 522 kB w
🔶 G 343 ms I 301 ms T 41 ms S 590 kB R 522 kB ived
🔶 G 264 ms I 225 ms T 38 ms S 590 kB R 522 kB i
🔶 G 302 ms I 256 ms T 46 ms S 590 kB R 522 kB .
🔶 G 304 ms I 254 ms T 47 ms S 590 kB R 522 kB commits
🔶 G 266 ms I 234 ms T 32 ms S 590 kB R 522 kB
🔶 G 303 ms I 257 ms T 45 ms S 590 kB R 522 kB 4
Generated tokens: 16
Avg generation time: 295.38 ms
Avg inference time: 252.75 ms
Avg transfer time: 42.12 ms
4x RPI 5 8gb Llama 7b
🔶 G 182 ms I 143 ms T 39 ms S 2670078 kB R 784 kB Hello
🔶 G 180 ms I 135 ms T 45 ms S 2046 kB R 784 kB world
🔶 G 181 ms I 133 ms T 48 ms S 2046 kB R 784 kB !
🔶 G 181 ms I 132 ms T 49 ms S 2046 kB R 784 kB This
🔶 G 205 ms I 134 ms T 71 ms S 2046 kB R 784 kB is
🔶 G 223 ms I 168 ms T 54 ms S 2046 kB R 784 kB the
🔶 G 222 ms I 170 ms T 52 ms S 2046 kB R 784 kB brand
🔶 G 183 ms I 142 ms T 41 ms S 2046 kB R 784 kB new
🔶 G 223 ms I 169 ms T 53 ms S 2046 kB R 784 kB blog
🔶 G 223 ms I 169 ms T 54 ms S 2046 kB R 784 kB for
🔶 G 226 ms I 171 ms T 55 ms S 2046 kB R 784 kB the
🔶 G 222 ms I 173 ms T 49 ms S 2046 kB R 784 kB City
🔶 G 221 ms I 168 ms T 53 ms S 2046 kB R 784 kB of
🔶 G 220 ms I 166 ms T 53 ms S 2046 kB R 784 kB South
🔶 G 224 ms I 174 ms T 49 ms S 2046 kB R 784 kB F
🔶 G 216 ms I 171 ms T 45 ms S 2046 kB R 784 kB ult
Generated tokens: 16
Avg generation time: 208.25 ms
Avg inference time: 157.38 ms
Avg transfer time: 50.62 ms
2x RPI 5 8gb Llama 13b
🔶 G 476 ms I 437 ms T 37 ms S 3485724 kB R 818 kB Hello
🔶 G 459 ms I 416 ms T 43 ms S 924 kB R 818 kB world
🔶 G 498 ms I 445 ms T 53 ms S 924 kB R 818 kB !
🔶 G 497 ms I 441 ms T 55 ms S 924 kB R 818 kB I
🔶 G 499 ms I 441 ms T 58 ms S 924 kB R 818 kB '
🔶 G 458 ms I 411 ms T 47 ms S 924 kB R 818 kB m
🔶 G 495 ms I 440 ms T 54 ms S 924 kB R 818 kB Ch
🔶 G 496 ms I 440 ms T 56 ms S 924 kB R 818 kB ase
🔶 G 496 ms I 445 ms T 51 ms S 924 kB R 818 kB ,
🔶 G 498 ms I 449 ms T 48 ms S 924 kB R 818 kB the
🔶 G 495 ms I 442 ms T 52 ms S 924 kB R 818 kB Le
🔶 G 496 ms I 444 ms T 51 ms S 924 kB R 818 kB ad
🔶 G 497 ms I 442 ms T 55 ms S 924 kB R 818 kB Mark
🔶 G 533 ms I 444 ms T 89 ms S 924 kB R 818 kB eting
🔶 G 462 ms I 414 ms T 47 ms S 924 kB R 818 kB Special
🔶 G 493 ms I 439 ms T 54 ms S 924 kB R 818 kB ist
Generated tokens: 16
Avg generation time: 490.50 ms
Avg inference time: 436.88 ms
Avg transfer time: 53.12 ms
4x RPI 5 8gb Llama 13b
🔶 G 312 ms I 267 ms T 42 ms S 1036099 kB R 1227 kB Hello
🔶 G 330 ms I 238 ms T 92 ms S 3203 kB R 1227 kB world
🔶 G 331 ms I 270 ms T 61 ms S 3203 kB R 1227 kB ,
🔶 G 349 ms I 287 ms T 61 ms S 3203 kB R 1227 kB I
🔶 G 294 ms I 239 ms T 55 ms S 3203 kB R 1227 kB '
🔶 G 334 ms I 271 ms T 62 ms S 3203 kB R 1227 kB m
🔶 G 328 ms I 268 ms T 60 ms S 3203 kB R 1227 kB the
🔶 G 332 ms I 273 ms T 58 ms S 3203 kB R 1227 kB one
🔶 G 295 ms I 243 ms T 52 ms S 3203 kB R 1227 kB who
🔶 G 370 ms I 308 ms T 61 ms S 3203 kB R 1227 kB is
🔶 G 336 ms I 271 ms T 65 ms S 3203 kB R 1227 kB trying
🔶 G 336 ms I 275 ms T 60 ms S 3203 kB R 1227 kB to
🔶 G 335 ms I 272 ms T 63 ms S 3203 kB R 1227 kB win
🔶 G 336 ms I 268 ms T 68 ms S 3203 kB R 1227 kB your
🔶 G 337 ms I 273 ms T 63 ms S 3203 kB R 1227 kB heart
🔶 G 335 ms I 269 ms T 66 ms S 3203 kB R 1227 kB .
Generated tokens: 16
Avg generation time: 330.62 ms
Avg inference time: 268.25 ms
Avg transfer time: 61.81 ms
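To put the runs side by side, the average generation times above can be converted to tokens per second (1000 divided by ms per token) and to a speedup relative to a smaller setup. This is only a rough sketch: the numbers are copied from the "Avg generation time" lines, each configuration was run once, and the helper names are made up for this example.

```python
# Avg generation time per token in ms, copied from the results above.
# Keys are (model, number of devices).
avg_generation_ms = {
    ("7b", 1): 438.38,
    ("7b", 2): 295.38,
    ("7b", 4): 208.25,
    ("13b", 2): 490.50,
    ("13b", 4): 330.62,
}

def tokens_per_second(ms_per_token):
    """Convert a per-token latency in ms to throughput in tokens/second."""
    return 1000.0 / ms_per_token

def speedup(model, devices, baseline_devices):
    """Ratio of baseline generation time to the time with `devices` devices."""
    return avg_generation_ms[(model, baseline_devices)] / avg_generation_ms[(model, devices)]

for (model, n), ms in avg_generation_ms.items():
    print(f"{n}x RPI 5, Llama {model}: {tokens_per_second(ms):.2f} tokens/s")

print(f"7b, 4 devices vs 1: {speedup('7b', 4, 1):.2f}x")
print(f"13b, 4 devices vs 2: {speedup('13b', 4, 2):.2f}x")
```

There is no 1x run for 13b in the results, so the 13b speedup can only be computed against the 2x run.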