Raspberry Pi 4B 8GB x 8, Netgear GS308EPP PoE 123W #122
githuba9f5404 started this conversation in Results
Hardware
Configured as a cluster similar to this guide, with network boot for all nodes but the head node:
https://www.raspberrypi.com/tutorials/cluster-raspberry-pi-tutorial/
Head Node
1 x Raspberry Pi 4B 8GB hw rev 1.5
1 x Official Raspberry Pi IEEE 802.3at-2009 PoE hat
1 x SanDisk 512GB Extreme microSDXC UHS-I Memory Card
1 x Crucial BX500 1TB 3D NAND SATA 2.5-Inch Internal SSD
1 x 2.5-inch SATA SSD (6 Gb/s) to USB 3.0 adapter
Worker Nodes
7 x Raspberry Pi 4B 8GB hw rev 1.5
7 x Official Raspberry Pi IEEE 802.3at-2009 PoE hat
Additional Hardware
C4 Labs Cloudlet Case - Clear
Netgear GS308EPP PoE 123W
8 x 8 inch patch cables
WOWNOVA IPS USB Mini Screen PC CPU RAM HDD Data Monitor
Notes
The cluster needs a single power connection and is otherwise a fully self-contained unit. It is accessed via SSH over Wi-Fi on the head node.
Benchmarks
All runs used --steps 64 and --prompt "The Eiffel Tower is". Unless otherwise noted, CPUs were overclocked to a conservative 2000 MHz.
Llama 3.1 8B Instruct Q40
Run only on the full 8 nodes because of memory requirements (running with full context).
Stock 1 (1.8 GHz)
Generated tokens: 64
Avg tokens / second: 1.91
Avg generation time: 523.69 ms
Avg inference time: 502.83 ms
Avg transfer time: 18.34 ms
Overclock 1 (2.0 GHz)
Generated tokens: 64
Avg tokens / second: 2.09
Avg generation time: 479.27 ms
Avg inference time: 439.12 ms
Avg transfer time: 37.98 ms
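To put the stock and overclocked runs side by side, here is a small sketch that compares the relative clock increase with the relative throughput increase. The numbers are copied from the two runs above; the clock values assume the stock setting is 1800 MHz and the overclock is the 2000 MHz mentioned earlier. The helper name `relative_gain` is just for illustration.

```python
# Figures copied from the two 8-node Llama 3.1 8B Instruct Q40 runs above.
# Assumption: stock clock is 1800 MHz, overclock is 2000 MHz.
stock = {"clock_mhz": 1800, "tokens_per_s": 1.91}
overclocked = {"clock_mhz": 2000, "tokens_per_s": 2.09}


def relative_gain(new: float, old: float) -> float:
    """Return the fractional increase going from old to new."""
    return new / old - 1.0


clock_gain = relative_gain(overclocked["clock_mhz"], stock["clock_mhz"])
throughput_gain = relative_gain(overclocked["tokens_per_s"], stock["tokens_per_s"])

print(f"clock gain:      {clock_gain:.1%}")
print(f"throughput gain: {throughput_gain:.1%}")
```

With these inputs the throughput gain comes out slightly below the clock gain. That fits the transfer time not improving with the CPU clock, though a single run per setting is too little to draw firm conclusions.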
Llama 3 8B Q40
2 x Nodes
Generated tokens: 64
Avg tokens / second: 1.04
Avg generation time: 960.50 ms
Avg inference time: 915.23 ms
Avg transfer time: 43.11 ms
4 x Nodes
Generated tokens: 64
Avg tokens / second: 1.56
Avg generation time: 639.84 ms
Avg inference time: 617.83 ms
Avg transfer time: 19.88 ms
8 x Nodes
Generated tokens: 64
Avg tokens / second: 1.84
Avg generation time: 544.20 ms
Avg inference time: 470.78 ms
Avg transfer time: 71.19 ms
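The three timing figures look like they should roughly add up (generation ≈ inference + transfer). A quick sketch to check that against the Llama 3 8B numbers above, and to see how much of each token's time goes to network transfer. That the tool reports generation time as the sum of the other two plus a small overhead is an assumption here, not something confirmed from its source.

```python
# (nodes, avg generation ms, avg inference ms, avg transfer ms),
# copied from the Llama 3 8B Q40 runs above.
runs = [
    (2, 960.50, 915.23, 43.11),
    (4, 639.84, 617.83, 19.88),
    (8, 544.20, 470.78, 71.19),
]


def breakdown(generation_ms: float, inference_ms: float, transfer_ms: float):
    """Return (unaccounted time in ms, transfer share of generation time)."""
    residual_ms = generation_ms - (inference_ms + transfer_ms)
    transfer_share = transfer_ms / generation_ms
    return residual_ms, transfer_share


results = {nodes: breakdown(g, i, t) for nodes, g, i, t in runs}

for nodes, (residual_ms, transfer_share) in results.items():
    print(f"{nodes} nodes: residual {residual_ms:.2f} ms, "
          f"transfer share {transfer_share:.1%}")
```

For these runs the residual stays at a couple of milliseconds, while the transfer share is noticeably higher on 8 nodes than on 2 or 4.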
Llama 2 7B Q40
1 x Node
Generated tokens: 64
Avg tokens / second: 0.82
Avg generation time: 1213.45 ms
Avg inference time: 1206.73 ms
Avg transfer time: 6.08 ms
2 x Nodes
Generated tokens: 64
Avg tokens / second: 1.32
Avg generation time: 758.20 ms
Avg inference time: 663.80 ms
Avg transfer time: 93.83 ms
4 x Nodes
Generated tokens: 64
Avg tokens / second: 2.16
Avg generation time: 464.03 ms
Avg inference time: 447.08 ms
Avg transfer time: 16.22 ms
8 x Nodes
Generated tokens: 64
Avg tokens / second: 3.40
Avg generation time: 293.88 ms
Avg inference time: 260.34 ms
Avg transfer time: 32.97 ms
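Since Llama 2 7B is the only model here with a single-node baseline, it is the natural one for looking at scaling. A minimal sketch that computes speedup over one node and parallel efficiency (speedup divided by node count) from the tokens/second figures above; the function name `scaling` is illustrative only.

```python
# Avg tokens / second per node count, copied from the Llama 2 7B Q40 runs above.
tokens_per_s = {1: 0.82, 2: 1.32, 4: 2.16, 8: 3.40}


def scaling(throughput: dict[int, float]) -> dict[int, tuple[float, float]]:
    """Map node count -> (speedup vs 1 node, parallel efficiency)."""
    base = throughput[1]
    return {
        nodes: (tps / base, tps / base / nodes)
        for nodes, tps in throughput.items()
    }


table = scaling(tokens_per_s)

for nodes, (speedup, efficiency) in table.items():
    print(f"{nodes} nodes: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers, 8 nodes give a bit over 4x the single-node throughput, so roughly half of the ideal linear scaling.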
Llama 2 13B Q40
2 x Nodes
Generated tokens: 64
Avg tokens / second: 0.75
Avg generation time: 1335.83 ms
Avg inference time: 1217.81 ms
Avg transfer time: 117.23 ms
4 x Nodes
Generated tokens: 64
Avg tokens / second: 1.22
Avg generation time: 819.88 ms
Avg inference time: 659.30 ms
Avg transfer time: 159.95 ms
8 x Nodes
Generated tokens: 64
Avg tokens / second: 2.15
Avg generation time: 464.78 ms
Avg inference time: 413.16 ms
Avg transfer time: 51.03 ms
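As a sanity check on the reported throughput, the tokens/second figure appears to be the reciprocal of the average generation time. A short sketch that recomputes it for the Llama 2 13B runs above; treating tokens/second as 1000 / avg generation ms is an assumption about how the tool derives it.

```python
# nodes -> (avg generation ms, reported avg tokens / second),
# copied from the Llama 2 13B Q40 runs above.
runs = {
    2: (1335.83, 0.75),
    4: (819.88, 1.22),
    8: (464.78, 2.15),
}


def tokens_per_second(avg_generation_ms: float) -> float:
    """Convert an average per-token generation time in ms to tokens/second."""
    return 1000.0 / avg_generation_ms


derived = {nodes: tokens_per_second(gen_ms) for nodes, (gen_ms, _) in runs.items()}

for nodes, (gen_ms, reported) in runs.items():
    print(f"{nodes} nodes: derived {derived[nodes]:.2f} tok/s, reported {reported:.2f} tok/s")
```

The derived values agree with the reported ones to two decimal places, so either number can be used when comparing runs.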