
Error in fp8 quantization: Invalid scale factor : 1.70e+06, make sure the scale is not larger than : 6.55e+04 #1907

Open
yyChen233 opened this issue Jul 9, 2024 · 0 comments

When I use this config to quantize a YOLOv3 model to fp8:
```yaml
version: 1.0

model:                                    # mandatory. used to specify model-specific information.
  name: yolo_v3
  framework: pytorch                      # mandatory. possible values are tensorflow, mxnet, pytorch, pytorch_ipex, onnxrt_integerops and onnxrt_qlinearops.

quantization:
  approach: post_training_static_quant    # no need for fp8_e5m2
  precision: fp8_e4m3                     # allowed precision is fp8_e5m2, fp8_e4m3, fp8_e3m4
  calibration:
    #batchnorm_sampling_size: 3000        # only needed for models w/ BatchNorm
    sampling_size: 104

tuning:
  accuracy_criterion:
    relative: 0.01                        # optional. default criterion is relative (the other is absolute); this example allows a relative accuracy loss of 1%.
  exit_policy:
    max_trials: 50
    #timeout: 180                         # optional. tuning timeout (seconds). default value is 0, which means early stop; combine with max_trials to decide when to exit.
  random_seed: 1234                       # optional. random seed for deterministic tuning.
```
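For reference, this YAML is the whole driver; the quantization itself runs through Neural Compressor's config-driven API. A minimal sketch of the launch code, assuming the legacy `neural_compressor.experimental` entry point, with `model`, `calib_loader`, and `eval_func` as illustrative placeholders for the yolo_v3 example's objects (not necessarily the example's exact names):

```python
# Minimal sketch: how a YAML like the one above drives PTQ in Neural
# Compressor's legacy experimental API. `model`, `calib_loader`, and
# `eval_func` are hypothetical placeholders for the example's objects.
from neural_compressor.experimental import Quantization, common

quantizer = Quantization("conf.yaml")      # the fp8_e4m3 config above
quantizer.model = common.Model(model)      # FP32 YOLOv3 torch.nn.Module
quantizer.calib_dataloader = calib_loader  # feeds the 104 calibration samples
quantizer.eval_func = eval_func            # accuracy fn; drives accuracy_criterion
q_model = quantizer.fit()                  # tunes until exit_policy is satisfied
```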

I got this output:

```
2024-07-09 17:27:54 [INFO] Save tuning history to /mnt/d/LM/neural-compressor/examples/pytorch/object_detection/yolo_v3/quantization/ptq/eager/nc_workspace/2024-07-09_17-27-50/./history.snapshot.
2024-07-09 17:27:54 [INFO] FP32 baseline is: [Accuracy: 0.7232, Duration (seconds): 3.5848]
Error: Invalid scale factor : 1.70e+06, make sure the scale is not larger than : 6.55e+04
```
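For what it's worth, the 6.55e+04 bound is the fp16 maximum (65504), which suggests the computed per-tensor scale has to stay representable in fp16. A minimal sketch of the arithmetic, assuming the common `scale = fp8_max / amax` convention with an E4M3 maximum of 448; the exact convention inside the fp8-emulation backend may differ (e.g. `amax / fp8_max`):

```python
import torch

E4M3_MAX = 448.0    # largest finite fp8 e4m3 value
FP16_MAX = 65504.0  # 6.55e+04, the bound in the error message

def e4m3_scale(t: torch.Tensor) -> float:
    """Per-tensor scale mapping |t|.max() onto the e4m3 range (assumed convention)."""
    amax = t.abs().max().item()
    return E4M3_MAX / amax

# A tensor with tiny magnitudes yields a scale that overflows fp16:
t = torch.full((4,), 2.63e-4)
scale = e4m3_scale(t)    # 448 / 2.63e-4 ≈ 1.70e+06
print(scale > FP16_MAX)  # True -> "Invalid scale factor" territory
```

If that reading is right, the thing to look for would be a tensor whose calibration amax is extremely small (on the order of 2.6e-4 here).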

So how can I handle this problem? Thank you!
