
Llama 3.2 #55

Merged: 12 commits from llama/3.2 into main, Jan 23, 2025

Conversation

@laggui (Member) commented Jan 3, 2025

Added Llama 3.2 1b and 3b instruct models (text only).

  • New model configs & weights (uploaded to HF hub)
  • Refactored RoPE frequency scaling into a config (3.1 and 3.2 use different scaling params)
  • Updated burn dependencies
  • Switched cuda backend example to use f16 (fixed since 0.15)

Closes #46

@laggui (Member, Author) commented Jan 3, 2025

Dumping some samples for reference / progress tracking.

This is for the cuda backend only, since I cannot load a single model with wgpu (no memory pool big enough, even for tiny llama and 3.2 1b).

Llama 3.2 1b

Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> I'm happy to help you with that question. However, I must inform you that it's not possible for a human to eat a helicopter in one sitting.

Helicopters are large, complex machines that are not edible. They are not a food source and do not contain any nutrients that can be consumed by humans. In

65 tokens generated (4.6159 tokens/s)

Generation completed in 0m14s

Note: cannot load on wgpu (f32)

thread 'main' panicked at /home/laggui/.cargo/git/checkouts/cubecl-aa41a28b39b598f9/4372c41/crates/cubecl-runtime/src/memory_management/memory_manage.rs:287:13:
Unable to find valid pool partition point: No memory pool big enough to reserve 1576009728 bytes.

Much faster on tch though...

cargo run --release --features llama3,tch-gpu --example chat
Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> I'm happy to help you with that question. However, I must inform you that it's not possible for a human to eat a helicopter in one sitting.

Helicopters are large, complex machines that are not edible. They are not a food source and do not contain any nutrients that can be consumed by humans. In

65 tokens generated (38.9757 tokens/s)

Generation completed in 0m1s

Llama 3.2 3b

Cannot load on my machine (only 6GB VRAM, checkpoint is > 6GB)

Loading record...
thread 'main' panicked at /home/laggui/.cargo/git/checkouts/cubecl-aa41a28b39b598f9/4372c41/crates/cubecl-cuda/src/compute/storage.rs:131:87:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_OUT_OF_MEMORY, "out of memory")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But it works with tch (CPU & GPU, thanks to shared CPU/GPU memory)

Loading record...
Loaded in 6s
Processing prompt: How many helicopters can a human eat in one sitting?
> I think there may be a bit of a misconception here! Helicopters are complex machines made of metal, plastic, and other materials, and they are not edible. In fact, it's not possible for a human to eat a helicopter in one sitting or at all, for that matter.

Helicopters are designed to

65 tokens generated (3.9874 tokens/s)

Generation completed in 0m16s

Tiny Llama 1b

Loading record...
Loaded in 1s
Processing prompt: How many helicopters can a human eat in one sitting?
> It is not specified in the given text whether the question is asking about the number of helicopters a human can eat in one sitting or the number of helicopters that can fit in one human's stomach. However, it can be assumed that the question is asking about the number of helic

65 tokens generated (4.7131 tokens/s)

Generation completed in 0m13s

@laggui laggui merged commit 9a2084d into main Jan 23, 2025
2 checks passed
@laggui laggui deleted the llama/3.2 branch January 23, 2025 19:43