huggingface / optimum-quanto Public

Fork 58
Star 803


Issues: huggingface/optimum-quanto


19 Open 105 Closed


Issues list

Will module output not be quantized when the model is directly trained after Calibration?

#336 opened Oct 11, 2024 by tusiqi1

LayerNorm with None weight throws exception

#335 opened Oct 8, 2024 by doctorpangloss

Corrupted outputs with Marlin int4 kernels as parallelization increases

Labels: bug, help wanted

#332 opened Oct 6, 2024 by dacorvo

optimum-quanto 0.25 requires ninja but 'pip check flux' reports 'ninja-1.11.1.1 is not supported on this platform'

#331 opened Oct 5, 2024 by Davros666

issues with non-contiguous Tensor

#327 opened Oct 2, 2024 by bghira

mps low-bit kernels from torchao

#322 opened Sep 28, 2024 by bghira

Accuracy issue when using torch._int_mm on AMD CPUs

#319 opened Sep 26, 2024 by dacorvo

Does AWQ is officially supported now?

Labels: Stale

#313 opened Sep 20, 2024 by lifelongeeek

qint4 failed for diffusers: QBitsTensor cannot be changed

#312 opened Sep 19, 2024 by liyihao1230

Potential Gradient Error when Reloading Frozen Weights in qmodule.py _load_from_state_dict

#293 opened Aug 24, 2024 by cjfghk5697

Support for FP8 Matmuls

#275 opened Aug 9, 2024 by maktukmak

Support for new diffuser: flux1.schnell

#272 opened Aug 7, 2024 by KoppAlexander

Packages created on the CI are missing cpp and cuda extension files

#254 opened Jul 23, 2024 by dacorvo

Pixart sigma example crash on CUDA arch >= 80 with int4 weights

#248 opened Jul 18, 2024 by dacorvo

qint4 failing with PixArt Transformer

#228 opened Jul 3, 2024 by sayakpaul

Inference from a reload quantized open clip model (by .load_state_dict) resulted in IndexError

#217 opened Jun 24, 2024 by kechan

Verify extension behaviour in google Colab

#206 opened May 31, 2024 by dacorvo

Switch to ruff native formatter

Labels: good first issue, help wanted

#186 opened Apr 22, 2024 by dacorvo

QTensor cannot be created from inside a dynamo graph

#46 opened Dec 11, 2023 by dacorvo
