
the current text generation call will exceed the model's predefined maximum length (4096) #151

Open
waleyW opened this issue Nov 27, 2023 · 4 comments

waleyW commented Nov 27, 2023

Thanks for providing this project.
I am using inference.py to run inference with the model Llama-2-13b-chat-longlora-32k-sft. I fed in a text of roughly 32,000 tokens and received the following warning:

This is a friendly reminder: the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.

Here is my command:

python inference.py
--base_model "Llama-2-13b-chat-longlora-32k-sft"
--question "Please summarize the 30 most important question and answer pairs based on this article."
--context_size 32768
--max_gen_len 32768
--flash_attn True
--material "part_1.txt"

I don't know how to get rid of this warning. When I run LongAlpaca-7B, I see the same warning.
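For reference, here is a minimal sketch of where the reminder comes from, assuming inference.py follows the usual LongLoRA loading pattern (the model path and sizes are taken from the command above; the rope_scaling step is my reading of such scripts, not something confirmed in this thread):

```python
import math
import transformers

# Assumed values, taken from the command above.
base_model = "Llama-2-13b-chat-longlora-32k-sft"
context_size = 32768

config = transformers.AutoConfig.from_pretrained(base_model)

# Llama-2 ships with max_position_embeddings = 4096. LongLoRA-style scripts
# typically extend the usable context with linear RoPE scaling but leave this
# field at 4096, which is exactly what generate()'s length check reads.
orig_ctx_len = getattr(config, "max_position_embeddings", None)
if orig_ctx_len and context_size > orig_ctx_len:
    scaling_factor = float(math.ceil(context_size / orig_ctx_len))
    config.rope_scaling = {"type": "linear", "factor": scaling_factor}

model = transformers.AutoModelForCausalLM.from_pretrained(base_model, config=config)

# generate() compares the requested length (prompt tokens + max_gen_len)
# against config.max_position_embeddings (still 4096) and emits the
# "friendly reminder" when the former is larger, even though RoPE scaling
# makes the longer context usable.
```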


cauchy221 commented Dec 2, 2023

Same question here.
[Edit] Found an explanation for this in #80.


waleyW commented Dec 5, 2023

> Same question here. [Edit] Found an explanation for this in #80.

I reset the config, but it still shows the warning: "the current text generation call will exceed the model's predefined maximum length (4096)".
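For what it's worth, the reminder appears to be driven by config.max_position_embeddings, so resetting other config fields will not silence it. Continuing from the sketch above (the 4096 is what I would expect for this checkpoint, not a verified value):

```python
# The reminder fires when (prompt tokens + max_gen_len) exceeds this value,
# so it persists as long as the field still reads 4096.
print(model.config.max_position_embeddings)  # expected: 4096
```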

@cauchy221

> > Same question here. [Edit] Found an explanation for this in #80.
>
> I reset the config, but it still shows the warning: "the current text generation call will exceed the model's predefined maximum length (4096)".

I did the same and it still shows this warning. But it's not an error, just a warning. From my experiments, the model correctly understands 32k-token input and generates output even with the warning showing up.
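If the message is noisy, one possible workaround (an assumption on my part, not something the repo prescribes) is to record the extended context in the config after fixing the scaling factor, so the length check passes:

```python
import transformers

base_model = "Llama-2-13b-chat-longlora-32k-sft"  # path from the command above

config = transformers.AutoConfig.from_pretrained(base_model)
config.rope_scaling = {"type": "linear", "factor": 8.0}  # 32768 / 4096
# Hypothetical workaround: bump max_position_embeddings to the extended
# context so generate()'s length check no longer warns. Set it only after
# fixing the scaling factor, which must be computed relative to the
# original 4096.
config.max_position_embeddings = 32768

model = transformers.AutoModelForCausalLM.from_pretrained(base_model, config=config)
```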


waleyW commented Dec 5, 2023

> > > Same question here. [Edit] Found an explanation for this in #80.
> >
> > I reset the config, but it still shows the warning: "the current text generation call will exceed the model's predefined maximum length (4096)".
>
> I did the same and it still shows this warning. But it's not an error, just a warning. From my experiments, the model correctly understands 32k-token input and generates output even with the warning showing up.

I tested with a long text, and it indeed behaves as you described. Thank you for your response.
