
Error: Could not locate file (500 error) #944

Open
2 of 5 tasks
iamhenry opened this issue Sep 25, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@iamhenry

System Info

node: 22.7
nextjs: 14

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I'm trying to load https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct and I get the error: Error: Could not locate file: "https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct/resolve/main/onnx/decoder_model_merged_quantized.onnx".

Someone else seems to be running into the same issue: https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct/discussions/1

How do I fix this? I tried copying the download link directly, but it somehow appends .json to the URL, like so: https://huggingface.co/https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct/resolve/main/onnx/model_quantized.onnx/resolve/main/config.json

How do I fix this to point to the correct model?

I'd love to use the new Qwen2.5 0.5B and 1.5B.

Thank you in advance.


Code:

```ts
import { pipeline, env, Pipeline } from '@xenova/transformers';
import { NextResponse } from 'next/server'; // used below but missing from the original imports

// Set the Xenova model directory
env.localModelPath = '/tmp/models';

let generator: Pipeline | null = null;

export async function POST(req: Request) {
  let text;
  try {
    ({ text } = await req.json());
    console.log('Received text:', text);
  } catch (error) {
    console.error('Error parsing request body:', error);
    return NextResponse.json({ error: 'Invalid request body' }, { status: 400 });
  }

  try {
    if (!generator) {
      console.log('Initializing model...');
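      // NOTE: passing a full .onnx URL here is what triggers the
      // "Could not locate file" error discussed below.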
      generator = await pipeline('text-generation', 'https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct/resolve/main/onnx/model_quantized.onnx');
      console.log('Model initialized successfully');
    }

    console.log('Generating text...');
    const messages = [
      { role: 'system', content: 'You are a CBT and ACT AI therapist. Provide advice or another way to frame the situation from a therapist perspective. Ensure it\'s only 1 sentence! Omit unnecessary words that aren\'t related to the feedback or advice.' },
      { role: 'user', content: text }
    ];

    // Apply chat template
    const prompt = generator.tokenizer.apply_chat_template(messages, {
      tokenize: false,
      add_generation_prompt: true,
    });

    console.log('Constructed prompt:', prompt);

    console.log('Calling generator...');
    const output = await generator(prompt, {
      max_new_tokens: 128,
      do_sample: true,
      temperature: 0.7,
      return_full_text: false,
    });

    console.log('Raw result:', JSON.stringify(output, null, 2));

    if (!output || !Array.isArray(output) || output.length === 0 || !output[0].generated_text) {
      throw new Error(`Invalid response from model: ${JSON.stringify(output)}`);
    }

    const generatedText = output[0].generated_text.trim();
    console.log('Generated text:', generatedText);
    return NextResponse.json({ result: generatedText });
  } catch (error) {
    console.error('Error in POST request:', error);
    console.error('Error stack:', error instanceof Error ? error.stack : 'No stack available');
    console.error('Error details:', JSON.stringify(error, Object.getOwnPropertyNames(error)));
    return NextResponse.json({ error: `Server error: ${error instanceof Error ? error.message : String(error)}` }, { status: 500 });
  }
}
```

Reproduction

1. I use this URL to add it to Cursor IDE: https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct
2. It updates the code for me
3. I get a 500 error

It works when I use Qwen 1.5, but not the latest 2.5.
@iamhenry iamhenry added the bug Something isn't working label Sep 25, 2024
@iamhenry iamhenry changed the title Error: Could not locate file (500 error) for Qwen 2.5 Error: Could not locate file (500 error) Sep 25, 2024
@iamhenry
Author

I'm getting the same error for the new Llama 3.2 1B.

How do I resolve this issue and prevent it for future models?

Error: Could not locate file: "https://huggingface.co/onnx-community/Llama-3.2-1B/resolve/main/onnx/decoder_model_merged_quantized.onnx"

@iamhenry
Author

I downloaded the model manually and placed it in the onnx directory.

Now, after renaming the model to decoder_model_merged_quantized.onnx, I get this error: Something went wrong during model construction (most likely a missing operation)

@nicolasembleton

You need to specify the file name (and sometimes the folder as well, since the ONNX model files are not always in the onnx folder).

In the case of Qwen2.5, you need to specify the model as "model_quantized" (or whichever exact variant you want to use: check the Files tab on Hugging Face, open the onnx folder, pick the model you want, and use its name without the extension).

But then that particular model indeed doesn't work: it fails at .loadModel() when creating an inference session ("can't create session").
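
If it helps, here's a rough sketch of what that looks like in code (assuming a Transformers.js version that supports the model_file_name option; double-check against your version's docs):

```ts
import { pipeline } from '@xenova/transformers';

// Pass the repo id, not a full URL. Select the ONNX variant by file name
// (no ".onnx" extension); "model_quantized" lives under the repo's onnx/ folder.
const generator = await pipeline('text-generation', 'onnx-community/Qwen2.5-0.5B-Instruct', {
  model_file_name: 'model_quantized',
});
```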

@iamhenry
Author

@nicolasembleton thx.

So does that mean Qwen 2.5 or Llama 3.2 isn't supported yet by Transformers.js?

@flatsiedatsie
Contributor

flatsiedatsie commented Sep 27, 2024

Llama 3.2 1B seems to be supported; there appears to be a demo available, though its source code is missing.

A good starting point might be the example code for Phi 3.5:
https://github.com/huggingface/transformers.js-examples/blob/main/phi-3.5-webgpu/src/worker.js

For one thing, you don't need to provide the specific .onnx file. Transformers.js will grab the correct one based on the settings you provide when you create the pipeline / model.
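
For example, a minimal sketch with v3 (the dtype value here is just one plausible choice; pick one the repo actually ships):

```ts
import { pipeline } from '@huggingface/transformers';

// Give the pipeline the repo id; it resolves the correct ONNX file itself
// based on these options, so no hard-coded .onnx URL is needed.
const generator = await pipeline('text-generation', 'onnx-community/Qwen2.5-0.5B-Instruct', {
  dtype: 'q4',
});
```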

More examples:
https://github.com/xenova/transformers.js/tree/v3/examples
https://github.com/huggingface/transformers.js-examples/tree/main

@iamhenry
Author

thank you and great find.

Do you know if Llama 3.2 1B will work with Next.js? I'm new to dev, so I'm thinking it could be the pipeline not being compatible yet?

Perhaps the @xenova/transformers pipeline isn't supported yet but @huggingface/transformers is?

@flatsiedatsie
Contributor

> Perhaps the @xenova/transformers pipeline isn't supported yet but @huggingface/transformers is?

Do you mean version 2 and version 3 of Transformers.js? If so, you need to use Transformers.js v3.
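
Concretely, that means switching packages (v3 is published on npm as @huggingface/transformers, so roughly `npm install @huggingface/transformers`) and updating the import:

```ts
// Transformers.js v2 (old):
// import { pipeline } from '@xenova/transformers';

// Transformers.js v3:
import { pipeline } from '@huggingface/transformers';
```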

@iamhenry
Author

@flatsiedatsie how do I do that? Is there a step-by-step guide?

Also, are all the features from v2 still functional?

I would hate to install it and cause breaking changes.

Thanks in advance!

@flatsiedatsie
Contributor

V3 is much MUCH faster because it adds support for WebGPU.

Read through the documentation: https://huggingface.co/docs/transformers.js/index

And there are tons of examples (linked above). Always search these issues too; there's a lot of knowledge in them.
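
For instance, a minimal WebGPU sketch (assumes a browser with WebGPU support; the device and dtype values are illustrative):

```ts
import { pipeline } from '@huggingface/transformers';

// Run inference on the GPU via WebGPU instead of WASM on the CPU.
const generator = await pipeline('text-generation', 'onnx-community/Qwen2.5-0.5B-Instruct', {
  device: 'webgpu',
  dtype: 'q4f16', // a common WebGPU quantization; check what the repo provides
});
```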

@iamhenry
Author

Appreciate your response, thx
