`grammar` example fails #797
Comments
macOS 18. The error only occurs with 1) "microsoft/Phi-3-mini-128k-instruct" and 2) a constrained grammar. There is no error when using "meta-llama/Llama-3.1-8B-Instruct" with a constrained grammar. I believe the error arises because vocab_size is set to 32064, but "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer.json" only has 32000 tokens and "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/added_tokens.json" has 11 tokens. I don't know where the missing 53 tokens are.
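For reference, the arithmetic this comment relies on, checked as a minimal Rust snippet (the numbers come from the comment above and the error in the log below; nothing here is mistral.rs code):

```rust
fn main() {
    // Numbers from the issue: model config vs. what the tokenizer actually knows.
    let config_vocab_size = 32_064; // Config { vocab_size: 32064, .. }
    let tokenizer_tokens = 32_000;  // entries in tokenizer.json
    let added_tokens = 11;          // entries in added_tokens.json

    let known = tokenizer_tokens + added_tokens; // 32_011, the rhs in the error
    assert_eq!(known, 32_011);
    assert_eq!(config_vocab_size - known, 53);   // the 53 unaccounted-for slots
}
```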
This is related: microsoft/phi-2/discussions/97, epfl-dlab/transformers-CFG#83. I suggest something like the sketch below:
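A minimal sketch of one possible fix along those lines, assuming the grammar's token mask is sized from the tokenizer rather than from the model config. `pad_grammar_mask` is a hypothetical helper, not an existing mistral.rs API: it pads the mask with `false` up to the model's `vocab_size` so it matches the logits shape.

```rust
/// Hypothetical helper: pad a per-token grammar mask out to the model's
/// `vocab_size` so it lines up with the logits tensor. The extra slots are
/// unused "padding" vocab entries, so they are simply never allowed.
fn pad_grammar_mask(mut mask: Vec<bool>, vocab_size: usize) -> Vec<bool> {
    debug_assert!(
        mask.len() <= vocab_size,
        "mask ({}) larger than vocab_size ({})",
        mask.len(),
        vocab_size
    );
    mask.resize(vocab_size, false); // padding tokens can never be sampled
    mask
}

fn main() {
    // 32_011 real tokens (tokenizer + added), padded to the config's 32_064.
    let mask = vec![true; 32_011];
    let padded = pad_grammar_mask(mask, 32_064);
    assert_eq!(padded.len(), 32_064); // now matches the logits shape
}
```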
Describe the bug
I got an error running the `grammar` example from this directory:
/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs/examples
My machine environment:
```
ProductName: macOS
ProductVersion: 14.4.1
```
```
❯ cargo run --example grammar --release
Compiling mistralrs-quant v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-quant)
Compiling mistralrs-core v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-core)
Compiling mistralrs-vision v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-vision)
Compiling mistralrs v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs)
    Finished `release` profile [optimized] target(s) in 30.75s
     Running `/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/target/release/examples/grammar`
2024-09-27T05:38:45.324420Z INFO hf_hub: Token file not found "/Users/yuta/.cache/huggingface/token"
2024-09-27T05:38:45.324590Z INFO mistralrs_core::utils::tokens: Could not load token at "/Users/yuta/.cache/huggingface/token", using no HF token.
2024-09-27T05:38:45.325083Z  INFO mistralrs_core::pipeline::normal: Loading `tokenizer.json` at `microsoft/Phi-3.5-mini-instruct`
2024-09-27T05:38:45.325540Z  INFO mistralrs_core::pipeline::normal: Loading `config.json` at `microsoft/Phi-3.5-mini-instruct`
2024-09-27T05:38:45.993416Z INFO mistralrs_core::pipeline::paths: Found model weight filenames ["model-00001-of-00002.safetensors", "model-00002-of-00002.safetensors"]
2024-09-27T05:38:46.198383Z  INFO mistralrs_core::pipeline::normal: Loading `generation_config.json` at `microsoft/Phi-3.5-mini-instruct`
2024-09-27T05:38:46.933785Z  INFO mistralrs_core::pipeline::normal: Loading `tokenizer_config.json` at `microsoft/Phi-3.5-mini-instruct`
2024-09-27T05:38:46.935057Z  INFO mistralrs_core::pipeline::normal: Loading model `microsoft/Phi-3.5-mini-instruct` on cpu.
2024-09-27T05:38:46.935316Z  INFO mistralrs_core::utils::log: Automatic loader type determined to be `phi3`
2024-09-27T05:38:46.935866Z INFO mistralrs_core::utils::normal: DType selected is F16.
2024-09-27T05:38:46.935898Z INFO mistralrs_core::pipeline::normal: Model config: Config { vocab_size: 32064, hidden_act: Silu, hidden_size: 3072, intermediate_size: 8192, num_hidden_layers: 32, num_attention_heads: 32, num_key_value_heads: 32, rms_norm_eps: 1e-5, rope_theta: 10000.0, bos_token_id: Some(1), eos_token_id: Some(32000), rope_scaling: Some(Classic { short_factor: [1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.069999933242798, 1.0999999046325684, 1.1099998950958252, 1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378, 1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083, 1.8499997854232788, 1.879999756813049, 1.90999972820282, 1.9399996995925903, 1.9899996519088743, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713, 2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533], long_factor: [1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601, 1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661, 3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432, 7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945, 24.46000099182129, 28.57000160217285, 30.420001983642575, 30.840002059936523, 32.590003967285156, 32.93000411987305, 42.32000350952149, 44.96000289916992, 50.34000396728515, 50.45000457763672, 57.55000305175781, 57.93000411987305, 58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203, 62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094, 63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688, 64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938, 64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789063], scaling_type: Su }), max_position_embeddings: 131072, use_flash_attn: false, sliding_window: Some(262144), original_max_position_embeddings: 4096, quantization_config: None }
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67/67 [00:04<00:00, 11.71it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:10<00:00, 7.58it/s]
2024-09-27T05:39:05.613379Z INFO mistralrs_core::pipeline::isq: Applying in-situ quantization into Some(Q4K) to 129 tensors.
2024-09-27T05:39:05.613615Z INFO mistralrs_core::pipeline::isq: Applying ISQ on 10 threads.
2024-09-27T05:39:12.267294Z INFO mistralrs_core::pipeline::isq: Applied in-situ quantization into Some(Q4K) to 129 tensors out of 129 total tensors. Took 6.65s
2024-09-27T05:39:12.311255Z  INFO mistralrs_core::pipeline::chat_template: bos_toks = "
", eos_toks = "<|endoftext|>", "<|end|>", "<|assistant|>", unk_tok =
Error: shape mismatch in add, lhs: [32064], rhs: [32011]
```