Optional subfolder if model repository contains one ONNX model behind a subfolder #2008
Comments
You mean
@tomaarsen does the suggestion work for you?
@IlyasMoutawwakil
No, it sets it as
I think that might be fine as well. I'm also curious about a potential unexpected crash for users. Imagine a scenario where a user is loading an ONNX model from a remote repository on HF. This repository has a Would you consider replacing the error with a warning instead, if a file named
Hello!
The Quirk
I've noticed some interesting behaviour, and I think there's a chance that it's unintended. Let's start with this snippet:
Perhaps surprisingly, perhaps not, this fails:
This file indeed does not exist; there is only a `model.onnx` in an `onnx` subfolder: https://huggingface.co/BAAI/bge-small-en-v1.5/tree/main
When the `file_name` is not specified, such as in the above snippet, the `from_pretrained` call will try to infer it:
optimum/optimum/onnxruntime/modeling_ort.py
Lines 509 to 529 in 8cb6832
In our case, we take the else branch (as the model is remote):
optimum/optimum/onnxruntime/modeling_ort.py
Lines 513 to 519 in 8cb6832
Here, `repo_files` is:
which leads to an `onnx_files` of:
This bypasses the `if len(...) == 0` and `if len(...) > 1` errors, and sets `file_name` as `onnx_files[0].name`, i.e. `"model.onnx"`.
This then fails when actually loading the model, because there is no `"model.onnx"` in the root of the repository, whereas we can be quite sure that the user intended to load this ONNX model. Instead, we currently require that the user specifies either `subfolder="onnx"` or `file_name="onnx/model.onnx"`.
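The matching behaviour described above can be reproduced without optimum. This is a minimal sketch, assuming a hypothetical file listing (the real `repo_files` value is not reproduced here) and using `PurePosixPath` rather than `Path` so that it behaves the same on every platform. It shows that a relative pattern such as `"*.onnx"` is matched from the right, so a file inside a subfolder still matches, and that `.name` then drops the subfolder:

```python
from pathlib import PurePosixPath

# Hypothetical repository listing, for illustration only.
repo_files = ["config.json", "tokenizer.json", "onnx/model.onnx"]

# Same pattern that is used when no subfolder is given.
pattern = "*.onnx"

# A relative pattern is matched from the right-hand side of the path,
# so "onnx/model.onnx" matches "*.onnx" even though it is not in the root.
onnx_files = [p for p in map(PurePosixPath, repo_files) if p.match(pattern)]

# .name keeps only the final component, so the "onnx" subfolder is lost.
file_name = onnx_files[0].name

print(onnx_files)  # [PurePosixPath('onnx/model.onnx')]
print(file_name)   # model.onnx
```

Exactly one file matches, so neither length check fires, and the inferred `file_name` points at a file that does not exist in the repository root.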
Potential Fixes
Fix A
This would work in the normal cases as well as when the only ONNX file is in a subfolder. The `relative_to` means that it'll also work if a `subfolder` was provided. There might still be some missed edge cases.
The downside is that this results in the following warning:
Fix B
```diff
 if file_name is None:
     if model_path.is_dir():
         onnx_files = list(model_path.glob("*.onnx"))
     else:
         repo_files, _ = TasksManager.get_model_files(
             model_id, revision=revision, cache_dir=cache_dir, token=token
         )
         repo_files = map(Path, repo_files)
         pattern = "*.onnx" if subfolder == "" else f"{subfolder}/*.onnx"
         onnx_files = [p for p in repo_files if p.match(pattern)]

     if len(onnx_files) == 0:
         raise FileNotFoundError(f"Could not find any ONNX model file in {model_path}")
     elif len(onnx_files) > 1:
         raise RuntimeError(
             f"Too many ONNX model files were found in {model_path}, specify which one to load by using the "
             "file_name argument."
         )
     else:
         file_name = onnx_files[0].name
+        subfolder = onnx_files[0].parent.as_posix()
```
This overrides/sets the subfolder so that we load e.g. `model.onnx` from whatever subfolder it exists in. There might still be some missed edge cases.
Will you consider a fix for this quirk?
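For illustration, the idea behind Fix B can be sketched as a standalone function. This is not optimum code: `infer_onnx_location` is a hypothetical helper, the file listings are made up, and mapping a parent of `"."` back to an empty subfolder is my own assumption about how a root-level file should be handled.

```python
from pathlib import PurePosixPath


def infer_onnx_location(repo_files, subfolder=""):
    """Hypothetical helper: return (subfolder, file_name) for the single ONNX file."""
    pattern = "*.onnx" if subfolder == "" else f"{subfolder}/*.onnx"
    onnx_files = [p for p in map(PurePosixPath, repo_files) if p.match(pattern)]

    if len(onnx_files) == 0:
        raise FileNotFoundError("Could not find any ONNX model file")
    if len(onnx_files) > 1:
        raise RuntimeError("Too many ONNX model files were found, specify file_name")

    match = onnx_files[0]
    parent = match.parent.as_posix()
    # A file in the repository root has parent ".", which is mapped back to
    # the empty default subfolder (an assumption made for this sketch).
    return ("" if parent == "." else parent, match.name)


print(infer_onnx_location(["config.json", "onnx/model.onnx"]))  # ('onnx', 'model.onnx')
print(infer_onnx_location(["config.json", "model.onnx"]))       # ('', 'model.onnx')
```

Returning the subfolder together with the file name keeps the two values consistent, which is what the added `subfolder = ...` line in the diff aims for.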