Requested tokens (XXXX) exceed context window of 2048 #30

Open
rop79 opened this issue Oct 17, 2024 · 15 comments

@rop79

rop79 commented Oct 17, 2024

There are two existing (closed) issues related to this, but neither offers a solution. I've tweaked the n_ctx value, but the error persists: 2048 tokens aren't enough. So, is this parameter ineffective, or am I missing something?

#18

#7

Traceback (most recent call last):
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\main.py", line 339, in <module>
    main()
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\main.py", line 254, in main
    data_texts = process_text_files(text_tuples, text_inference, silent=silent_mode, log_file=log_file)
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\text_data_processing.py", line 60, in process_text_files
    data = process_single_text_file(args, text_inference, silent=silent, log_file=log_file)
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\text_data_processing.py", line 37, in process_single_text_file
    foldername, filename, description = generate_text_metadata(text, file_path, progress, task_id, text_inference)
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\text_data_processing.py", line 71, in generate_text_metadata
    description = summarize_text_content(input_text, text_inference)
  File "c:\coding\Local-File-Organizer-gpu\Local-File-Organizer\text_data_processing.py", line 21, in summarize_text_content
    response = text_inference.create_completion(prompt)
  File "C:\Users\pflic\miniconda3\envs\local_file_organizer-gpu\Lib\site-packages\nexa\gguf\nexa_inference_text.py", line 234, in create_completion
    return self.model.create_completion(prompt=prompt, temperature=temperature, max_tokens=max_tokens, top_k=top_k, top_p=top_p, echo=echo, stream=stream, stop=stop, logprobs=logprobs, top_logprobs=top_logprobs)
  File "C:\Users\pflic\miniconda3\envs\local_file_organizer-gpu\Lib\site-packages\nexa\gguf\llama\llama.py", line 1748, in create_completion
    completion: Completion = next(completion_or_chunks)  # type: ignore
  File "C:\Users\pflic\miniconda3\envs\local_file_organizer-gpu\Lib\site-packages\nexa\gguf\llama\llama.py", line 1191, in _create_completion
    raise ValueError(
ValueError: Requested tokens (2302) exceed context window of 2048

@rop79 changed the title from "Requested tokens (XXXX exceed context window of 2048" to "Requested tokens (XXXX) exceed context window of 2048" on Oct 17, 2024
@QiuYannnn
Owner

Sorry for the late response. You can try modifying the context window to avoid the problem. link

@sindbadsailor

I am on a Mac M2 and it is working. I am mainly trying to use it to organize a PDF ebook library, but as soon as I put more than 10 PDF files in a folder I get the "Requested tokens (2162) exceed context window of 2048" error (the exact numbers vary). I have already tried your fix to modify the context window, but the problem persists.
Is it possible to set an n_ctx greater than 2048?
Any ideas?

Is it possible to set it so that only the directory structure gets changed but filenames remain the same?

Will there be epub support in the future?

@rop79
Author

rop79 commented Oct 18, 2024

I tried that too; unfortunately it doesn't work. I think the parameter isn't being passed correctly to the Nexa SDK. If I set it there directly, it works.

@sindbadsailor

How do you set it correctly in the Nexa SDK? What value did you set it to there?

@QiuYannnn
Owner

QiuYannnn commented Oct 21, 2024

As I said before, you can try modifying the context window via n_ctx to avoid the problem. link
[Screenshot 2024-10-21 at 14:50]
If that still doesn't work, I will fix it in the next version.

@rop79
Author

rop79 commented Oct 22, 2024

Sorry for the late feedback. I tried this before, and it seems that the parameter is not being passed to Nexa. However, when I change the value in nexa_inference_text.py on line 108, the change takes effect. The file is located in the virtual environment:
~\miniconda3\envs\local_file_organizer-gpu\Lib\site-packages\nexa\gguf\nexa_inference_text.py

@RonanDex

RonanDex commented Oct 24, 2024

Error Description: ValueError: Requested tokens (3857) exceed context window of 2048.
This error occurs during text inference, specifically when calling the text_inference.create_completion() function. It indicates that the requested number of tokens (3857) exceeds the model’s maximum context window (2048). This happens because the input text provided to the model is longer than the model’s processing capacity.

I checked the code, and max_new_tokens=3000, but I’m still getting an error indicating that the requested tokens exceed the limit.
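
Roughly speaking (a sketch, not the library's exact accounting), the prompt tokens and the tokens requested for generation both have to fit inside n_ctx, so raising max_new_tokens without also raising n_ctx cannot fix this and may even make the request larger. Illustrative numbers only:

    # Illustrative only; the exact check inside the Nexa SDK / llama.cpp may differ.
    def fits_in_context(prompt_tokens: int, max_new_tokens: int, n_ctx: int) -> bool:
        """True if the prompt plus the requested generation fits in the context window."""
        return prompt_tokens + max_new_tokens <= n_ctx

    # Hypothetical numbers: with n_ctx left at 2048, a long prompt overflows
    print(fits_in_context(prompt_tokens=2302, max_new_tokens=0,    n_ctx=2048))  # False
    # Raising max_new_tokens does not help; it only enlarges the request
    print(fits_in_context(prompt_tokens=857,  max_new_tokens=3000, n_ctx=2048))  # False
    # Raising n_ctx (e.g. to 4096) is what actually makes the request fit
    print(fits_in_context(prompt_tokens=857,  max_new_tokens=3000, n_ctx=4096))  # True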

@philpav

philpav commented Oct 25, 2024

Same here. I added n_ctx=4096 and still receive ValueError: Requested tokens (2997) exceed context window of 2048

@rop79
Author

rop79 commented Oct 26, 2024

Did you try changing the value directly in nexa_inference_text.py?

@sindbadsailor

What is the max value that n_ctx can be set to?

@ibagur

ibagur commented Oct 29, 2024

After having a look at the codebase of both the NexaVLMInference and NexaTextInference class definitions, I noticed that the parameter to pass in the call made from main.py is nctx instead of n_ctx. For some reason, these classes internally use the parameter as n_ctx, but they retrieve it from the arguments as nctx. In summary, there is no need to modify the module class in nexa_inference_text.py directly; just add the parameter value using nctx as follows:


            # Initialize the image inference model
            image_inference = NexaVLMInference(
                model_path=model_path,
                local_path=None,
                stop_words=[],
                temperature=0.3,
                max_new_tokens=4096,
                top_k=3,
                top_p=0.2,
                profiling=False,
                nctx=4096  # context window; note the parameter name is nctx, not n_ctx
            )

            # Initialize the text inference model
            text_inference = NexaTextInference(
                model_path=model_path_text,
                local_path=None,
                stop_words=[],
                temperature=0.5,
                max_new_tokens=4096,  # Adjust as needed
                top_k=3,
                top_p=0.3,
                profiling=False,
                nctx=10000  # context window; must not exceed the model's supported context length
            )

I have tested it and it works; the parameter is properly passed and set.

@ibagur

ibagur commented Oct 29, 2024

What is the max value that n_ctx can be set to?

That depends on the model you refer to. For the text model used here, llama3.2-3b, I think it is 128K, and for the multimodal model 'llava-v1.6-vicuna-7b' it is 8K. In any case, you can check each model's card.

@ibagur

ibagur commented Oct 29, 2024

I am on Mac M2 and it is working, I am mainly trying to use it to organize a pdf ebook library, but as soon as I put more than 10 pdf files in a folder I get the (numbers are varying) "Requested tokens (2162) exceed context window of 2048" error, i have already tried your fix to modify the context window, but the problem persists. is it possible to set a n_ctx greater than 2048 ? Any ideas?

Is it possible to set it so that only directory structure gets changed but filenames remain the same?

Will there be epub support in the future?

Take into account that for large PDFs such as books it might take a very long time if you choose option '1. By Content'. Maybe the tool could be optimised in that respect, so that it does not need to digest the whole document, in order to speed things up.
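
One possible direction (a hypothetical sketch with made-up names, not code from this repo): cap how much of the extracted text is handed to the model, so large PDFs neither overflow n_ctx nor take minutes per file. A character budget is a crude stand-in for a token budget (roughly 4 characters per token):

    # Hypothetical helper; the function and constant names are invented for illustration.
    MAX_CHARS = 3000  # ~750 tokens at the rough 4-chars-per-token estimate

    def truncate_for_prompt(text: str, max_chars: int = MAX_CHARS) -> str:
        """Keep only the beginning of a document for metadata generation."""
        if len(text) <= max_chars:
            return text
        # Cut on a whitespace boundary so the prompt does not end mid-word.
        return text[:max_chars].rsplit(None, 1)[0]

    # e.g. summarize_text_content(truncate_for_prompt(extracted_text), text_inference)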

@RobertWi

RobertWi commented Jan 5, 2025

nctx works and got things going. But on hundreds of markdown files, most not exceeding 50 lines, each .md file takes about 5 to 6 minutes to process. It seems the model is queried 10 to 15 times before it decides which category to assign. In my case it would take 66 hours to finish, so some optimization would help here.

@SBMatthew

I am on Mac M2 and it is working, I am mainly trying to use it to organize a pdf ebook library, but as soon as I put more than 10 pdf files in a folder I get the (numbers are varying) "Requested tokens (2162) exceed context window of 2048" error, i have already tried your fix to modify the context window, but the problem persists. is it possible to set a n_ctx greater than 2048 ? Any ideas?
Is it possible to set it so that only directory structure gets changed but filenames remain the same?
Will there be epub support in the future?

Take into account that for large PDF's such as books it might take very long time if you choose the option '1. By Content'. Maybe the tool could be optimised in that aspect, so it is not needed to digest the whole document in order to speed up things.

I've reviewed the code, and it appears to process the first three pages of any PDF. The issue lies in the fact that the CPU is handling the processing. It would be more efficient if we could convert the first three pages into images and use the GPU for text extraction, leveraging tools like EasyOCR.
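
A rough sketch of that idea (assuming pdf2image with poppler installed for page rendering and easyocr with a CUDA-capable torch build for GPU OCR; none of this is in the repo today):

    import numpy as np
    import easyocr
    from pdf2image import convert_from_path

    # gpu=True uses CUDA if the installed torch build supports it, otherwise falls back to CPU
    reader = easyocr.Reader(['en'], gpu=True)

    def ocr_first_pages(pdf_path: str, pages: int = 3) -> str:
        """Render the first few PDF pages to images and OCR them on the GPU."""
        images = convert_from_path(pdf_path, first_page=1, last_page=pages, dpi=200)
        chunks = []
        for img in images:
            # detail=0 returns plain strings instead of (bbox, text, confidence) tuples
            chunks.extend(reader.readtext(np.array(img), detail=0))
        return "\n".join(chunks)

    # text = ocr_first_pages("some_book.pdf")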
