1x Raspberry Pi 4B 8 GB + 7x Raspberry Pi 4B 4 GB + Mercusys MS108G Switch [Llama 2 7B/13B and Llama 3 8B] #104

EntusiastaIApy · 2024-07-23T01:25:21Z

Weights: Q40
Buffer: Q80

Llama 2 7B:
Avg tokens/second: 3.06
Avg generation time: 327.31 ms
Avg inference time: 264.81 ms
Avg transfer time: 61.75 ms

Llama 2 13B:
Avg tokens/second: 1.73
Avg generation time: 579.25 ms
Avg inference time: 468.44 ms
Avg transfer time: 109.62 ms

Llama 3 8B:
Avg tokens/second: 1.87
Avg generation time: 534.06 ms
Avg inference time: 481.50 ms
Avg transfer time: 50.06 ms
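
For anyone who wants to cross-check these figures, here is a small Python sketch. It assumes (this is not confirmed in the post) that the reported times are per-token averages in milliseconds, so tokens/second should be roughly 1000 divided by the average generation time. The names `RESULTS`, `tokens_per_second`, and `transfer_share` are made up for illustration and are not part of any tool.

```python
# Averages copied from the post above. Assumption: all times are
# milliseconds per generated token.
RESULTS = {
    "Llama 2 7B": {
        "tokens_per_second": 3.06,
        "generation_ms": 327.31,
        "inference_ms": 264.81,
        "transfer_ms": 61.75,
    },
    "Llama 2 13B": {
        "tokens_per_second": 1.73,
        "generation_ms": 579.25,
        "inference_ms": 468.44,
        "transfer_ms": 109.62,
    },
    "Llama 3 8B": {
        "tokens_per_second": 1.87,
        "generation_ms": 534.06,
        "inference_ms": 481.50,
        "transfer_ms": 50.06,
    },
}


def tokens_per_second(generation_ms: float) -> float:
    """Convert an average per-token generation time (ms) to tokens/second."""
    return 1000.0 / generation_ms


def transfer_share(transfer_ms: float, generation_ms: float) -> float:
    """Fraction of the per-token generation time spent on network transfer."""
    return transfer_ms / generation_ms


if __name__ == "__main__":
    for name, r in RESULTS.items():
        derived = tokens_per_second(r["generation_ms"])
        share = transfer_share(r["transfer_ms"], r["generation_ms"])
        print(
            f"{name}: derived {derived:.2f} tok/s "
            f"(reported {r['tokens_per_second']:.2f}), "
            f"transfer share {share:.1%}"
        )
```

Under that assumption the derived throughput matches the reported tokens/second to within rounding for all three models, and the transfer share gives a rough idea of how much of each token's time goes to the network rather than compute.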