I was testing in Colab, and when I ran "model.model.layers[0].mlp.gate_proj.weight" I received very different results from yours. You got:
Parameter containing:
tensor([[ 0.0032, -0.0339, 0.0150, ..., 0.0041, -0.0048, 0.0061],
[-0.0105, -0.0049, -0.0586, ..., -0.0092, 0.0188, -0.0084],
[-0.0383, -0.0109, 0.0031, ..., -0.0410, 0.0211, 0.0223],
...,
[ 0.0131, -0.0259, 0.0034, ..., 0.0233, -0.0281, -0.0131],
[ 0.0062, 0.0198, 0.0085, ..., 0.0129, -0.0205, 0.0050],
[ 0.0292, 0.0152, -0.0175, ..., 0.0256, 0.0276, 0.0082]],
device='cuda:0', dtype=torch.bfloat16, requires_grad=True)
I got:
tensor([[ 0.0007, 0.0007, 0.0007, ..., 0.0007, -0.0007, 0.0007],
[ 0.0007, 0.0007, 0.0007, ..., 0.0007, 0.0007, -0.0007],
[ 0.0007, 0.0007, -0.0007, ..., -0.0007, -0.0007, 0.0007],
...,
[ 0.0007, -0.0007, 0.0007, ..., -0.0007, -0.0007, 0.0007],
[ 0.0007, 0.0007, 0.0007, ..., -0.0007, -0.0007, -0.0007],
[ 0.0007, -0.0007, 0.0007, ..., 0.0007, 0.0007, 0.0007]],
device='cuda:0', dtype=torch.bfloat16)
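To put a number on the difference between the two printouts, here is a minimal sketch that counts how many distinct magnitudes appear in a set of weights. The helper name `magnitude_summary` and the sample rows are illustrative only; the rows are copied from the visible entries of the first row of each printout, not from the full tensors. For a real tensor, something like `weight.float().flatten().tolist()` could be passed in instead.

```python
def magnitude_summary(values, ndigits=4):
    """Count how often each rounded absolute value occurs (hypothetical helper)."""
    counts = {}
    for v in values:
        key = round(abs(float(v)), ndigits)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Visible entries of the first row of each printout (illustrative sample only).
expected_row = [0.0032, -0.0339, 0.0150, 0.0041, -0.0048, 0.0061]
observed_row = [0.0007, 0.0007, 0.0007, 0.0007, -0.0007, 0.0007]

print("distinct magnitudes (expected):", len(magnitude_summary(expected_row)))
print("distinct magnitudes (observed):", len(magnitude_summary(observed_row)))
```

On these samples the expected row has varied magnitudes, while the observed row differs only in sign. The second printout also lacks the "Parameter containing:" header and `requires_grad=True`, so the two objects may not be of the same type; what causes that is not established here.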
Thanks for letting us know!
Could you share the Colab code you tried to run? It would help us look into the issue 😄
https://colab.research.google.com/drive/1nvzhy_PCBZ_r6dlvQv3GfweJsGlZHrNJ?usp=sharing