BitNet_Llama_model_test_huggingface_GPU.ipynb #1

DewEfresh · 2023-10-25T14:57:12Z

I was testing in Colab, and when I ran "model.model.layers[0].mlp.gate_proj.weight" I received very different results from yours. You got:
Parameter containing:
tensor([[ 0.0032, -0.0339, 0.0150, ..., 0.0041, -0.0048, 0.0061],
[-0.0105, -0.0049, -0.0586, ..., -0.0092, 0.0188, -0.0084],
[-0.0383, -0.0109, 0.0031, ..., -0.0410, 0.0211, 0.0223],
...,
[ 0.0131, -0.0259, 0.0034, ..., 0.0233, -0.0281, -0.0131],
[ 0.0062, 0.0198, 0.0085, ..., 0.0129, -0.0205, 0.0050],
[ 0.0292, 0.0152, -0.0175, ..., 0.0256, 0.0276, 0.0082]],
device='cuda:0', dtype=torch.bfloat16, requires_grad=True)

I got:
tensor([[ 0.0007, 0.0007, 0.0007, ..., 0.0007, -0.0007, 0.0007],
[ 0.0007, 0.0007, 0.0007, ..., 0.0007, 0.0007, -0.0007],
[ 0.0007, 0.0007, -0.0007, ..., -0.0007, -0.0007, 0.0007],
...,
[ 0.0007, -0.0007, 0.0007, ..., -0.0007, -0.0007, 0.0007],
[ 0.0007, 0.0007, 0.0007, ..., -0.0007, -0.0007, -0.0007],
[ 0.0007, -0.0007, 0.0007, ..., 0.0007, 0.0007, 0.0007]],
device='cuda:0', dtype=torch.bfloat16)
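
The difference between the two printouts can be checked programmatically. The following is a minimal sketch, not code from the notebook: the helper names `magnitude_summary` and `looks_binarized` are hypothetical, and the sample values are copied from the visible entries of the first row of each tensor above. It reports whether every nonzero entry shares a single magnitude, as in the second printout.

```python
from collections import Counter


def magnitude_summary(values, ndigits=4):
    """Count how often each rounded absolute value occurs."""
    return Counter(round(abs(float(v)), ndigits) for v in values)


def looks_binarized(values, ndigits=4):
    """Return True when every nonzero entry shares one magnitude (e.g. +/-0.0007)."""
    magnitudes = {m for m in magnitude_summary(values, ndigits) if m != 0.0}
    return len(magnitudes) == 1


# Visible entries of the first printed row of each tensor above.
reference_row = [0.0032, -0.0339, 0.0150, 0.0041, -0.0048, 0.0061]
observed_row = [0.0007, 0.0007, 0.0007, 0.0007, -0.0007, 0.0007]

print(looks_binarized(reference_row))  # False
print(looks_binarized(observed_row))   # True
```

To run it on the actual layer, the weights can be flattened to a Python list first, for example with `model.model.layers[0].mlp.gate_proj.weight.detach().float().flatten().tolist()`. A single shared magnitude would be consistent with sign-only (binarized) values being stored in the parameter, but that interpretation is an assumption and is not confirmed in this thread.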

Beomi · 2023-10-25T14:59:06Z

Thanks for the notice!

Could you provide the Colab code you tried to run?
It would help me check the issue 😄

DewEfresh · 2023-10-25T16:54:42Z

https://colab.research.google.com/drive/1nvzhy_PCBZ_r6dlvQv3GfweJsGlZHrNJ?usp=sharing
