
Llama 3.2 #55

Merged: 12 commits from llama/3.2 into main, Jan 23, 2025

Conversation

@laggui (Member) commented Jan 3, 2025

Added Llama 3.2 1b and 3b instruct models (text only).

  • New model configs & weights (uploaded to HF hub)
  • Refactored RoPE frequency scaling into a config (3.1 and 3.2 use different scaling params)
  • Updated burn dependencies
  • Switched cuda backend example to use f16 (fixed since 0.15)

Closes #46

@laggui (Member, Author) commented Jan 3, 2025

Dumping some samples for reference / progress tracking.

This is for the cuda backend only, since I cannot load a single model with wgpu (no memory pool big enough, even for tiny llama and 3.2 1b).

Llama 3.2 1b

Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> I'm happy to help you with that question. However, I must inform you that it's not possible for a human to eat a helicopter in one sitting.

Helicopters are large, complex machines that are not edible. They are not a food source and do not contain any nutrients that can be consumed by humans. In

65 tokens generated (4.6159 tokens/s)

Generation completed in 0m14s

Note: cannot load on wgpu (f32)

thread 'main' panicked at /home/laggui/.cargo/git/checkouts/cubecl-aa41a28b39b598f9/4372c41/crates/cubecl-runtime/src/memory_management/memory_manage.rs:287:13:
Unable to find valid pool partition point: No memory pool big enough to reserve 1576009728 bytes.

Much faster on tch though...

cargo run --release --features llama3,tch-gpu --example chat
Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> I'm happy to help you with that question. However, I must inform you that it's not possible for a human to eat a helicopter in one sitting.

Helicopters are large, complex machines that are not edible. They are not a food source and do not contain any nutrients that can be consumed by humans. In

65 tokens generated (38.9757 tokens/s)

Generation completed in 0m1s

Llama 3.2 3b

Cannot load on my machine (only 6GB VRAM, checkpoint is > 6GB)

Loading record...
thread 'main' panicked at /home/laggui/.cargo/git/checkouts/cubecl-aa41a28b39b598f9/4372c41/crates/cubecl-cuda/src/compute/storage.rs:131:87:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_OUT_OF_MEMORY, "out of memory")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But it works with tch (CPU & GPU, thanks to shared CPU/GPU memory)

Loading record...
Loaded in 6s
Processing prompt: How many helicopters can a human eat in one sitting?
> I think there may be a bit of a misconception here! Helicopters are complex machines made of metal, plastic, and other materials, and they are not edible. In fact, it's not possible for a human to eat a helicopter in one sitting or at all, for that matter.

Helicopters are designed to

65 tokens generated (3.9874 tokens/s)

Generation completed in 0m16s

Tiny Llama 1b

Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> It is not specified in the given text whether the question is asking about the number of helicopters a human can eat in one sitting or the number of helicopters that can fit in one human's stomach. However, it can be assumed that the question is asking about the number of helic

65 tokens generated (4.7131 tokens/s)

Generation completed in 0m13s

@laggui laggui merged commit 9a2084d into main Jan 23, 2025
2 checks passed
@laggui laggui deleted the llama/3.2 branch January 23, 2025 19:43