
PyO3: Add optional candle.onnx module #1282

Merged · 15 commits · Nov 8, 2023

Conversation

@LLukas22 (Contributor) commented Nov 6, 2023

This PR adds a lightweight wrapper for the candle-onnx crate.

Mainly, an ONNXModel class is added, which allows loading ONNX models, reading some metadata, and running inference on the models.

The inputs and outputs are exposed so you can easily check what the model expects as input and produces as output.
Each input and output description is wrapped in an ONNXTensorDescription instance, which exposes the expected DType and Shape.

```python
from candle.onnx import ONNXModel

model = ONNXModel("PATH/TO/MODEL")  # placeholder path

print(model.inputs)
print(model.outputs)

# The weights can be accessed via `initializers`
print(model.initializers())
```
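The entries of `inputs` and `outputs` are the ONNXTensorDescription instances mentioned above. A minimal sketch of inspecting them, assuming `inputs` is a dict keyed by input name and that the descriptions expose `dtype` and `shape` properties (the exact property names aren't shown in this PR body, so treat them as assumptions):

```python
# Sketch: inspect the expected dtype and shape of each model input.
# Assumes `model.inputs` maps input names to ONNXTensorDescription
# objects and that `dtype`/`shape` are the property names (assumption).
for name, description in model.inputs.items():
    print(f"{name}: dtype={description.dtype}, shape={description.shape}")
```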

To run inference, a dict containing the input tensors has to be passed to the run method.
An example of running RoBERTa:

```python
import candle
from candle.onnx import ONNXModel
from tokenizers import Tokenizer

model = ONNXModel("PATH/TO/MODEL")  # placeholder path
tokenizer: Tokenizer = Tokenizer.from_pretrained("roberta-base")

sentence = "Hello, my dog is cute"
tokenized = tokenizer.encode(sentence)

result = model.run({"input_ids": candle.tensor(tokenized.ids).to(candle.i64)})
```
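The run method presumably returns the output tensors as a dict keyed by output name, mirroring the input dict. A hedged sketch of consuming the result under that assumption:

```python
# Sketch: iterate over the outputs returned by `run`.
# Assumes the result is a dict mapping output names to candle tensors
# whose shape can be read via the `shape` property (assumption).
for name, tensor in result.items():
    print(f"{name}: shape={tensor.shape}")
```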

Similar to the candle-onnx crate, the candle.onnx module is optional and gated behind the onnx feature flag, meaning the project has to be built with `maturin develop -r --features onnx` to enable it. This should probably be the default behaviour for the CI/CD pipeline.
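Because the module only exists in builds compiled with the onnx feature, downstream code can probe for it at import time; a minimal sketch:

```python
# Minimal sketch: check at runtime whether this candle build ships the
# optional onnx module (i.e. was built with `--features onnx`).
try:
    from candle.onnx import ONNXModel
    HAS_ONNX = True
except ImportError:
    HAS_ONNX = False

if not HAS_ONNX:
    print("This candle build was compiled without the `onnx` feature.")
```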

@LaurentMazare (Collaborator)

This looks pretty good, happy to merge it if you want (it's marked as draft currently, just remove the tag when you feel ready).

@LLukas22 (Contributor, Author) commented Nov 6, 2023

Yeah, I will update the CI tomorrow to build the wheels with the onnx feature flag by default. The only problem I have is that the CI seems to fail with some sort of API rate limit in the setup-protoc action. We probably have to provide the secrets.GITHUB_TOKEN there to prevent this.

@LaurentMazare (Collaborator)

I've just disabled the protoc bits on the CI for the time being - agreed that we should restore them at some point.

@LLukas22 (Contributor, Author) commented Nov 7, 2023

Alright, I enabled the onnx build for the Windows and macOS wheels. It's currently very difficult to build with onnx support for all the different Linux platforms, as maturin uses manylinux containers to build the wheels, which means we would need to install protoc into each container separately.

Since we only need protoc once, to compile onnx.proto3 into onnx.rs, maybe we could change the build or include behaviour to first look in some sort of cache directory before re-building onnx.proto3. This would allow us to build onnx.rs once on an x64 system and then mount it into the different containers, letting us build the wheels without installing protoc into each container. But that's probably something for a different PR.

Other than that, this should be ready for a review.

@LLukas22 marked this pull request as ready for review on November 7, 2023 12:34
@LaurentMazare merged commit f3a4f3d into huggingface:main on Nov 8, 2023
21 of 23 checks passed
@LaurentMazare (Collaborator)

Great, thanks. I think at this stage the pyo3 api offers lots of possibilities and we're mostly lacking tutorial/getting-started material to get actual users to start using it (it would be nice to get some feedback on where to push this further). Do you think you could try to advertise this a bit, e.g. writing a blog post on whichever platform, posting on reddit (maybe on r/rust, and on localllama if it's to advertise the quantized bits) and maybe on some other social platforms?

@LLukas22 (Contributor, Author) commented Nov 8, 2023

> Great, thanks. I think at this stage the pyo3 api offers lots of possibilities and we're mostly lacking tutorial/getting-started material to get actual users to start using it (it would be nice to get some feedback on where to push this further).

I also think it would be nice to get some users on board to get feedback on the api and check if/how it should be expanded. Regarding the tutorials/getting-started materials, I thought about adding a chapter to the candle book with the basics and maybe some "How to build/port a model" section.

> Do you think you could try to advertise this a bit, e.g. writing a blog post on whichever platform, posting on reddit (maybe on r/rust, and on localllama if it's to advertise the quantized bits) and maybe on some other social platforms?

To be frank, I suck at advertising these kinds of things. And we should probably upload the newest wheels to PyPI (and add some sort of "How to build" section) before posting anywhere about this.
