Raspberry Pi 4B 8GB x 8, Netgear GS308EPP PoE 123W #122

githuba9f5404 · 2024-09-25T03:19:39Z

Hardware

Configured as a cluster similar to this guide, with network boot for all nodes but the head node:
https://www.raspberrypi.com/tutorials/cluster-raspberry-pi-tutorial/

Head Node

1 x Raspberry Pi 4B 8GB hw rev 1.5
1 x Official Raspberry Pi IEEE 802.3at-2009 PoE hat
1 x SanDisk 512GB Extreme microSDXC UHS-I Memory Card
1 x Crucial BX500 1TB 3D NAND SATA 2.5-Inch Internal SSD
1 x 2.5-inch SATA SSD (6 Gb/s) to USB 3.0 adapter

Worker Nodes

7 x Raspberry Pi 4B 8GB hw rev 1.5
7 x Official Raspberry Pi IEEE 802.3at-2009 PoE hat

Additional Hardware

C4 Labs Cloudlet Case - Clear
Netgear GS308EPP PoE 123W
8 x 8 inch patch cables
WOWNOVA IPS USB Mini Screen PC CPU RAM HDD Data Monitor

Notes

Single power connection; otherwise a fully self-contained unit. Accessed via SSH over WiFi on the head node.

Benchmarks

All runs use --steps 64 and --prompt "The Eiffel Tower is". Unless otherwise noted, CPUs were overclocked to a conservative 2000 MHz for all runs.

Using Llama 3.1 8B Instruct Q40

Only run on the full 8 nodes because of memory requirements (running with full context).

Stock 1 (1.8 GHz)

Generated tokens: 64
Avg tokens / second: 1.91
Avg generation time: 523.69 ms
Avg inference time: 502.83 ms
Avg transfer time: 18.34 ms

Overclock 1 (2.0 GHz)

Generated tokens: 64
Avg tokens / second: 2.09
Avg generation time: 479.27 ms
Avg inference time: 439.12 ms
Avg transfer time: 37.98 ms
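
As a sanity check on the two runs above, the reported tokens per second appears to be the reciprocal of the average generation time, and the overclock gain can be compared against the clock ratio. A minimal Python sketch; the function and variable names are illustrative and not taken from the benchmark tool:

```python
# Figures copied from the Llama 3.1 8B Instruct Q40 runs above (8 nodes).
STOCK_GEN_MS = 523.69      # avg generation time, stock clock (1.8 GHz)
OVERCLOCK_GEN_MS = 479.27  # avg generation time, overclocked (2.0 GHz)


def tokens_per_second(avg_generation_ms: float) -> float:
    """Convert an average per-token generation time in ms to tokens/s."""
    return 1000.0 / avg_generation_ms


stock_tps = tokens_per_second(STOCK_GEN_MS)
overclock_tps = tokens_per_second(OVERCLOCK_GEN_MS)

# Measured speedup versus the ratio of the two clock speeds.
measured_gain = STOCK_GEN_MS / OVERCLOCK_GEN_MS
clock_ratio = 2.0 / 1.8

print(f"stock:     {stock_tps:.2f} tokens/s")
print(f"overclock: {overclock_tps:.2f} tokens/s")
print(f"measured gain: {measured_gain:.3f}x, clock ratio: {clock_ratio:.3f}x")
```

On these numbers the overclocked run is roughly 9% faster for an 11% higher clock, though its transfer time is also higher, so part of the gap may come from the network rather than the CPU.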

Llama 3 8B Q40

2 x Nodes

Generated tokens: 64
Avg tokens / second: 1.04
Avg generation time: 960.50 ms
Avg inference time: 915.23 ms
Avg transfer time: 43.11 ms

4 x Nodes

Generated tokens: 64
Avg tokens / second: 1.56
Avg generation time: 639.84 ms
Avg inference time: 617.83 ms
Avg transfer time: 19.88 ms

8 x Nodes

Generated tokens: 64
Avg tokens / second: 1.84
Avg generation time: 544.20 ms
Avg inference time: 470.78 ms
Avg transfer time: 71.19 ms

Llama 2 7B Q40

1 x Node

Generated tokens: 64
Avg tokens / second: 0.82
Avg generation time: 1213.45 ms
Avg inference time: 1206.73 ms
Avg transfer time: 6.08 ms

2 x Nodes

Generated tokens: 64
Avg tokens / second: 1.32
Avg generation time: 758.20 ms
Avg inference time: 663.80 ms
Avg transfer time: 93.83 ms

4 x Nodes

Generated tokens: 64
Avg tokens / second: 2.16
Avg generation time: 464.03 ms
Avg inference time: 447.08 ms
Avg transfer time: 16.22 ms

8 x Nodes

Generated tokens: 64
Avg tokens / second: 3.40
Avg generation time: 293.88 ms
Avg inference time: 260.34 ms
Avg transfer time: 32.97 ms
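
Since Llama 2 7B is the only model here with a single-node baseline, it is the easiest one to use for estimating how well the cluster scales. The sketch below computes speedup and parallel efficiency from the average generation times above; the helper name `scaling_table` is made up for this example:

```python
# Avg generation time (ms) per node count, copied from the Llama 2 7B Q40 runs above.
GEN_MS_BY_NODES = {1: 1213.45, 2: 758.20, 4: 464.03, 8: 293.88}


def scaling_table(gen_ms_by_nodes: dict[int, float]) -> list[tuple[int, float, float]]:
    """Return (nodes, speedup, efficiency) relative to the smallest node count."""
    base_nodes = min(gen_ms_by_nodes)
    base_ms = gen_ms_by_nodes[base_nodes]
    rows = []
    for nodes in sorted(gen_ms_by_nodes):
        speedup = base_ms / gen_ms_by_nodes[nodes]
        # Efficiency: achieved speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


for nodes, speedup, efficiency in scaling_table(GEN_MS_BY_NODES):
    print(f"{nodes} nodes: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

With these figures, 8 nodes come out at a little over 4x the single-node speed, which is roughly half of ideal linear scaling.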

Llama 2 13B Q40

2 x Nodes

Generated tokens: 64
Avg tokens / second: 0.75
Avg generation time: 1335.83 ms
Avg inference time: 1217.81 ms
Avg transfer time: 117.23 ms

4 x Nodes

Generated tokens: 64
Avg tokens / second: 1.22
Avg generation time: 819.88 ms
Avg inference time: 659.30 ms
Avg transfer time: 159.95 ms

8 x Nodes

Generated tokens: 64
Avg tokens / second: 2.15
Avg generation time: 464.78 ms
Avg inference time: 413.16 ms
Avg transfer time: 51.03 ms
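
To see how much of each token's time goes to the network rather than compute, the transfer time can be expressed as a share of the generation time. A rough Python sketch using the Llama 2 13B figures above; `transfer_share` is an illustrative helper, and the assumption that generation time is approximately inference plus transfer is inferred from the numbers, not documented here:

```python
# (avg generation ms, avg inference ms, avg transfer ms) per node count,
# copied from the Llama 2 13B Q40 runs above.
RUNS = {
    2: (1335.83, 1217.81, 117.23),
    4: (819.88, 659.30, 159.95),
    8: (464.78, 413.16, 51.03),
}


def transfer_share(generation_ms: float, transfer_ms: float) -> float:
    """Fraction of the per-token generation time spent on network transfer."""
    return transfer_ms / generation_ms


for nodes, (gen_ms, inf_ms, xfer_ms) in sorted(RUNS.items()):
    share = transfer_share(gen_ms, xfer_ms)
    # Whatever is left after inference and transfer is unaccounted overhead.
    other_ms = gen_ms - inf_ms - xfer_ms
    print(f"{nodes} nodes: transfer {share:.1%} of generation, other {other_ms:.2f} ms")
```

The 4-node run shows the largest transfer share (close to 20%), while the 2- and 8-node runs sit around 9% and 11%, so these single 64-token runs may include some run-to-run variation.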
