getModelJSON fails with a bundler (see reproduction) #366
Comments
Hi there 👋 Thanks for making the repro 👍 I believe the problem is the following:
Could you try the following:
This will bypass the local file check and download from the HF hub instead. |
Hi there, I'm facing this exact issue at the moment. I thought about disallowing local models and I am glad to see it suggested here, but it does not work for me: I keep getting the error. |
I dug a bit into the generated Parcel code. The source map support is kind of broken (making it hard to inspect values and set breakpoints), but from a bit of testing I figured out that this works at least:
import { pipeline, env } from '@xenova/transformers';
env.allowLocalModels = false;
env.useBrowserCache = false;
Entire test script (script.js):
import { pipeline, env } from '@xenova/transformers';
env.allowLocalModels = false;
env.useBrowserCache = false;
console.log('✅ Pipeline function is correctly loaded: \n', pipeline);
console.log('❌ But attempting to create a pipeline fails: ');
// Allocate a pipeline for sentiment-analysis
(async function () {
  const pipe = await pipeline('sentiment-analysis');
  console.log("pipe", pipe);
})(); |
Thanks, I came to the same conclusion: it only works if you force the download from the remote URL. I could not get to the bottom of it. I was hopeful that another solution existed because, for the project I am working on, this workaround means that users would download the model over and over again. |
This really helped me! I know it's only tangentially related to the original post, but I ran into the same JSON parsing error when trying to build the client-side example from Hugging Face's Next.js tutorial (https://huggingface.co/docs/transformers.js/tutorials/next#client-side-inference) and adding the line |
I don’t think it is only tangentially related. I think it is the same bug everywhere: before downloading the tokenizer from HF, transformers.js checks whether it exists locally or in the browser cache by following a specific path for each option. If the |
@pappitti I see what you mean. Here's more context and details I found for my particular case: Attempting to run in development mode the client-side example provided in Hugging Face's Nextjs tutorial:
|
Also hitting this, but it seems V3 will fix it. Workaround for now:
seems to be fixing it |
Also hit this issue and spent some time debugging. Several things contributing to this issue:
|
I'm also hitting this. From my debugging, I think cache.match(name) is getting data from an abnormal URL. This is the data that cache.match(name) returns: I don't know exactly why, but I confirmed that the url is " ". In addition, I am fairly confident that this is a problem specific to the Next.js dev environment: the build result works normally in a Hugging Face Space. |
Quick follow-up for the record, as it seems others have been encountering the issue:
|
Those using |
@masylum Yes. I'm using it in a React app, and it re-downloads models on each HMR reload. The production build seems okay from what I can tell. |
Same |
Are you all saying that browser caching actually works in a production build? I obviously did not think about trying, given that it crashed all the time in dev. |
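A minimal sketch of how the workaround above could be limited to development, assuming (as some commenters report, without confirmation) that browser caching only misbehaves on the dev server. The helper name `envSettingsFor` and the `mode` argument are hypothetical; `allowLocalModels` and `useBrowserCache` are the `env` flags already used earlier in this thread.

```javascript
// Hypothetical helper, not part of transformers.js.
// Returns the env flags to apply for a given build mode.
function envSettingsFor(mode) {
  const isDev = mode !== "production";
  return {
    // Always skip the local file lookup, as suggested earlier in the thread.
    allowLocalModels: false,
    // Assumption: the cache only returns bad data on the dev server,
    // so keep it enabled for production builds to avoid re-downloads.
    useBrowserCache: !isDev,
  };
}

// Usage sketch (assumes your bundler exposes the mode as NODE_ENV):
// import { env } from '@xenova/transformers';
// Object.assign(env, envSettingsFor(process.env.NODE_ENV));
```

If production caching turns out to be broken in your setup as well, returning `useBrowserCache: false` unconditionally brings you back to the original workaround.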
The problem here is the appalling ignorance of build tools like It all boils down to that
You can test that yourself in this example: https://github.com/xenova/transformers.js/tree/main/examples/text-to-speech-client
Even non-programmers know about the famous 404 when a page doesn't exist. What are they thinking? Only god knows, but what you should know: drop tools that at best slow you down and at worst introduce bugs in good code. Solution?
Of course it requires some extra work, but in the long run you can thank yourself for not wasting time on non-issues, and for the freedom of not having to open dozens of terminals and mess around with shell commands just to use a simple HTML file. Programming should be simple, fast and working; compare that to over-engineered, slow and non-working. Hence this issue exists. |
for that one project, it was indeed vite... |
Usually in single-page applications, all 404 requests go to your root index.html so that your SPA routes can deal with them. This is how Vite behaves when you're running an SPA. I'm working with Vue, but it should generally be the case with React or any other frontend framework. So when your models are not found, the request falls through to the default index.html, and the status code will be 200 with the HTML that your SPA returns. You can add a Vite plugin to override this behaviour for the files concerned. Here is a simple config that should work. I haven't researched better ways to do this; I'm just setting up a simple prototype to test out transformers.js. Also remember that this is for the dev server: when deploying to production, you'll have to translate this concept into whatever setup you use.
import { defineConfig } from "vite";
import vue from "@vitejs/plugin-vue";
import fs from "fs";
import path from "path";
// https://vitejs.dev/config/
export default defineConfig({
  plugins: [
    vue(),
    {
      name: "out-json-404",
      configureServer(server) {
        server.middlewares.use((req, res, next) => {
          // check if the request is for a json file or onnx file
          if (req.url.endsWith(".json") || req.url.endsWith(".onnx")) {
            // check if the file exists
            const filePath = path.join(server.config.root, req.url);
            // if it doesn't exist, return a 404
            if (!fs.existsSync(filePath)) {
              res.statusCode = 404;
              res.end(JSON.stringify({ error: "JSON file not found" }));
              return;
            }
          }
          // continue as normal otherwise
          next();
        });
      },
    },
  ],
}); |
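To illustrate the failure mode described in the comment above, here is a minimal sketch of a client-side guard that treats an HTML fallback page as a missing file instead of trying to parse it as JSON. Both function names are hypothetical and are not part of transformers.js; the sketch only assumes that the dev server answers a missing model file with status 200 and an HTML body.

```javascript
// Hypothetical helpers, not part of transformers.js.

// Returns true when a "successful" response is really the SPA's index.html.
function isSpaFallback(status, contentType, body) {
  if (status !== 200) return false; // a real error status is not a fallback
  const type = (contentType || "").toLowerCase();
  if (type.includes("text/html")) return true;
  // Some servers omit the header, so sniff the body as a last resort.
  return body.trimStart().startsWith("<");
}

// Parses a model JSON response, or returns null when the file is missing.
function parseModelJson(status, contentType, body) {
  if (status !== 200) return null; // e.g. a genuine 404
  if (isSpaFallback(status, contentType, body)) return null; // disguised 404
  return JSON.parse(body);
}
```

Without a guard like this, passing the fallback HTML to `JSON.parse` throws a syntax error on the leading `<`, which matches the JSON parsing error reported in this thread.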
@amk-dev It would be great if this library allowed providing models manually. I would love the ability to load ONNX files with |
Updated to v3; Now I'm getting the following error: @parcel/core: Failed to resolve '#onnxruntime-webgpu' from './node_modules/@huggingface/transformers/src/backends/onnx.js' bug-repro-transformersjs/node_modules/@huggingface/transformers/src/backends/onnx.js:29:27
|
Most likely a parcel issue! I was able to repro the issue with parcel, but indeed Vite works fine: I've filed a bug against the parcel repo: |
Describe the bug
Hi there, I'm trying to use remote models in a basic web app that relies on a bundler (parcel).
getModelJSON doesn't seem to load models correctly. I put together a minimal reproduction, see below. (Note: no issue observed when not using a bundler, e.g. when using CDN transformers instead.)
This may be a duplicate of, or related to, #364.
Thanks!
How to reproduce
Bug repro repo: https://github.com/maudnals/bug-repro-transformersjs/tree/main
Steps to reproduce:
yarn install
yarn parcel index.html
See script.js. We're simply calling await pipeline('sentiment-analysis')
Open http://localhost:1234 in your browser, or whichever URL parcel gives you
Expected behavior
The model should load properly.
Logs/screenshots
Environment
"@xenova/transformers": "^2.6.2"
Additional context
When adding a breakpoint in getModelJSON, the model path seems fine (openai/...). I wonder if the issue lies in path rewriting somewhere along the bundling steps?