2024-05-30 Add FP8 PTQ #1877
base: develop
Thanks for your contribution!
Please add usage documentation and a usage example.
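@@ -21,6 +21,7 @@
- `EMDObserver`: collects the maximum absolute value and derives the quantization scale by minimizing the EMD error
- `HistObserver`: collects tensor values into a histogram and computes the quantization scale from a percentile
- `KLObserver`: computes the quantization scale by minimizing the Kullback-Leibler divergence between the floating-point value distribution and the quantized value distribution
- `AbsmaxObserver`: collects the maximum absolute value as the quantization scale, according to the tensor dimensions of the target weight; `quant_bits` selects the quantized data type, and FP8 is supported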
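To make the `AbsmaxObserver` entry concrete, here is a minimal pure-Python sketch of absmax scaling into an FP8 range. It is not PaddleSlim code: `FP8_MAX` and `absmax_scale` are hypothetical names, and dividing the collected maximum by the format's largest finite value is one common convention that is assumed here, whereas the entry above only says the maximum absolute value itself is collected.

```python
# Largest finite magnitudes of the two FP8 formats (per the OCP FP8 definition).
FP8_MAX = {"float8_e4m3": 448.0, "float8_e5m2": 57344.0}


def absmax_scale(values, fmt="float8_e4m3"):
    """Hypothetical helper: per-tensor scale so that values / scale fits the FP8 range."""
    absmax = max(abs(v) for v in values)
    if absmax == 0.0:
        return 1.0  # all-zero tensor: any scale works, avoid dividing by zero
    return absmax / FP8_MAX[fmt]


weights = [0.5, -2.0, 1.25, 0.0]
scale = absmax_scale(weights)
# After scaling, the largest magnitude lands on the FP8 maximum.
scaled = [w / scale for w in weights]
```

The sketch only rescales; rounding the scaled values onto the FP8 grid is left out.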
Is `AbsmaxObserver` the only observer that supports FP8?
@@ -60,8 +61,8 @@ model = mobilenet_v1()
q_config = QuantConfig(activation=None, weight=None)

# define act_quanter and weight_quanter
act_quanter = MSEObserver()
weight_quanter = MSEObserver()
activation = AbsmaxObserver(quant_bits = (4,3)) # quant_bits = (4,3) and quant_bits = (5,2) for float8_e4m3 and float8_e5m2 formats quantization.
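For reference, a small sketch of what the two `quant_bits` tuples in the comment above imply numerically. It assumes the tuple means (exponent bits, mantissa bits) and that the formats follow the OCP FP8 conventions; `fp8_max_finite` is a hypothetical helper, not part of PaddleSlim.

```python
def fp8_max_finite(exp_bits, man_bits):
    """Largest finite value of an FP8 format, assuming OCP FP8 conventions."""
    if (exp_bits, man_bits) not in {(4, 3), (5, 2)}:
        raise ValueError("only (4, 3) and (5, 2) are handled in this sketch")
    bias = 2 ** (exp_bits - 1) - 1
    if (exp_bits, man_bits) == (4, 3):
        # E4M3 has no infinities; only exponent=1111, mantissa=111 encodes NaN.
        max_exp_field = 2 ** exp_bits - 1
        max_mantissa = 2 ** man_bits - 2
    else:
        # E5M2 is IEEE-like; the all-ones exponent is reserved for inf/NaN.
        max_exp_field = 2 ** exp_bits - 2
        max_mantissa = 2 ** man_bits - 1
    return 2.0 ** (max_exp_field - bias) * (1 + max_mantissa / 2 ** man_bits)


e4m3_max = fp8_max_finite(4, 3)
e5m2_max = fp8_max_finite(5, 2)
```

E4M3 trades range for an extra mantissa bit, so it is the usual choice when precision matters more than dynamic range.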
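Please do not modify the existing example directly. Add a new example that demonstrates FP8 quantization, and add documentation for it.
It would also help to post the FP8 quantization experiment results for the different observers.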
Modify the Uniform Observer to support FP8 quantization