
example grammar is failed #797

Open
higumachan opened this issue Sep 27, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@higumachan

## Describe the bug

I got the error below when running the `grammar` example.

Directory: `/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs/examples`

My machine environment:

```
ProductName: macOS
ProductVersion: 14.4.1

Hardware Overview:

  Model Name: MacBook Pro
  Model Identifier: MacBookPro18,4
  Model Number: Z15H0016ZJ/A
  Chip: Apple M1 Max
  Total Number of Cores: 10 (8 performance and 2 efficiency)
  Memory: 64 GB
```

```
❯ cargo run --example grammar --release
Compiling mistralrs-quant v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-quant)
Compiling mistralrs-core v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-core)
Compiling mistralrs-vision v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-vision)
Compiling mistralrs v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs)
Finished release profile [optimized] target(s) in 30.75s
Running /Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/target/release/examples/grammar
2024-09-27T05:38:45.324420Z INFO hf_hub: Token file not found "/Users/yuta/.cache/huggingface/token"
2024-09-27T05:38:45.324590Z INFO mistralrs_core::utils::tokens: Could not load token at "/Users/yuta/.cache/huggingface/token", using no HF token.
2024-09-27T05:38:45.325083Z INFO mistralrs_core::pipeline::normal: Loading tokenizer.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:45.325540Z INFO mistralrs_core::pipeline::normal: Loading config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:45.993416Z INFO mistralrs_core::pipeline::paths: Found model weight filenames ["model-00001-of-00002.safetensors", "model-00002-of-00002.safetensors"]
2024-09-27T05:38:46.198383Z INFO mistralrs_core::pipeline::normal: Loading generation_config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:46.933785Z INFO mistralrs_core::pipeline::normal: Loading tokenizer_config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:46.935057Z INFO mistralrs_core::pipeline::normal: Loading model microsoft/Phi-3.5-mini-instruct on cpu.
2024-09-27T05:38:46.935316Z INFO mistralrs_core::utils::log: Automatic loader type determined to be phi3
2024-09-27T05:38:46.935866Z INFO mistralrs_core::utils::normal: DType selected is F16.
2024-09-27T05:38:46.935898Z INFO mistralrs_core::pipeline::normal: Model config: Config { vocab_size: 32064, hidden_act: Silu, hidden_size: 3072, intermediate_size: 8192, num_hidden_layers: 32, num_attention_heads: 32, num_key_value_heads: 32, rms_norm_eps: 1e-5, rope_theta: 10000.0, bos_token_id: Some(1), eos_token_id: Some(32000), rope_scaling: Some(Classic { short_factor: [1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.069999933242798, 1.0999999046325684, 1.1099998950958252, 1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378, 1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083, 1.8499997854232788, 1.879999756813049, 1.90999972820282, 1.9399996995925903, 1.9899996519088743, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713, 2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533], long_factor: [1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601, 1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661, 3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432, 7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945, 24.46000099182129, 28.57000160217285, 30.420001983642575, 30.840002059936523, 32.590003967285156, 32.93000411987305, 42.32000350952149, 44.96000289916992, 50.34000396728515, 50.45000457763672, 57.55000305175781, 57.93000411987305, 58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203, 62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094, 63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688, 64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938, 64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789063], scaling_type: Su }), max_position_embeddings: 131072, use_flash_attn: false, sliding_window: Some(262144), original_max_position_embeddings: 4096, quantization_config: None }
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67/67 [00:04<00:00, 11.71it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:10<00:00, 7.58it/s]
2024-09-27T05:39:05.613379Z INFO mistralrs_core::pipeline::isq: Applying in-situ quantization into Some(Q4K) to 129 tensors.
2024-09-27T05:39:05.613615Z INFO mistralrs_core::pipeline::isq: Applying ISQ on 10 threads.
2024-09-27T05:39:12.267294Z INFO mistralrs_core::pipeline::isq: Applied in-situ quantization into Some(Q4K) to 129 tensors out of 129 total tensors. Took 6.65s
2024-09-27T05:39:12.311255Z INFO mistralrs_core::pipeline::chat_template: bos_toks = "", eos_toks = "<|endoftext|>", "<|end|>", "<|assistant|>", unk_tok =
Error: shape mismatch in add, lhs: [32064], rhs: [32011]
```

## Latest commit or version

1eb9cae2a4ec89d7cf8a5fc8d9f57b82f2f747fa
@andrewlimmer

Mac OS 18
rev = "86f37fa803c40e9ee14c43e0028ad32f841ceb07"

The error only occurs with "microsoft/Phi-3-mini-128k-instruct" when a constrained grammar is used. There is no error when using "meta-llama/Llama-3.1-8B-Instruct" with a constrained grammar.

I believe the error arises because vocab_size is set to 32064, but "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer.json" only has 32000 tokens and "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/added_tokens.json" has 11 tokens. I don't know where the missing 53 tokens are.
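
For reference, the counts can be double-checked with the `tokenizers` crate. A minimal sketch, assuming a locally downloaded tokenizer.json from that repo; the file path and the 32064 figure (taken from config.json) are assumptions here:

```rust
// Compare the tokenizer's token count against the model's configured vocab_size.
use tokenizers::Tokenizer;

fn main() {
    let tok = Tokenizer::from_file("tokenizer.json").expect("load tokenizer.json");
    let base = tok.get_vocab_size(false);      // tokens in tokenizer.json (32000)
    let with_added = tok.get_vocab_size(true); // plus added_tokens.json (32011)
    let config_vocab_size = 32064;             // vocab_size from config.json
    println!("base: {base}, with added tokens: {with_added}");
    println!("ids with no tokenizer entry: {}", config_vocab_size - with_added); // 53
}
```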

(Screenshot attached.)

@haricot

haricot commented Oct 16, 2024

This is related: microsoft/phi-2/discussions/97 and epfl-dlab/transformers-CFG/pull/83.

I suggest something like:

```rust
pub(crate) fn build_tok_trie(tokenizer: Tokenizer, cfg_vocab_size: usize) -> TokTrie {
    // Pass the model's configured vocab_size through to the byte tokenizer
    // so the trie covers the same number of ids as the logits tensor.
    let bt = ByteTokenizer::from_tokenizer(tokenizer, cfg_vocab_size).unwrap();
    TokTrie::from(&bt.tokrx_info(), &bt.token_bytes())
}

impl ByteTokenizer {
    pub fn from_tokenizer(mut hft: Tokenizer, cfg_vocab_size: usize) -> Result<ByteTokenizer> {
        ...
        for tok_id in 0..vocab_size {
            ...
        }
        // Pad with empty byte sequences for ids that are present in the model's
        // vocab (config.json) but absent from the tokenizer files.
        if cfg_vocab_size > res.vocab_size {
            let vocab_size_diff = cfg_vocab_size - res.vocab_size;
            res.vocab_size = cfg_vocab_size;
            res.token_bytes
                .extend((0..vocab_size_diff).map(|_| Vec::new()));
        }
        Ok(res)
    }
}
```
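
The idea is that padding `token_bytes` with empty entries brings the trie up to the model's configured `vocab_size` (32064 here), so the grammar mask has the same length as the logits and the shape mismatch goes away; the padded ids never match any bytes and are simply never sampled.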
