
Deepseek3.2 W4afp8 convert fail #662

@whybeyoung

Description


When I run:

```shell
# Do per-tensor fp8 calibration
torchrun --nproc-per-node 8 --master_port=12346 ptq2.py \
  --model_path /work/models/v32-mid \
  --config /work/models/TensorRT-Model-Optimizer/modelopt/DeepSeek-V3.2-Exp/inference/config_671B_v3.2.json \
  --quant_cfg FP8_DEFAULT_CFG \
  --output_path ds_v32_fp8_per_tensor_calibration
```

the conversion fails with the error shown in the attached screenshot:

[Image: error screenshot]

Environment: Hopper H20, 96 GB.
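For context on what the calibration step computes: per-tensor FP8 calibration derives a single scale per tensor from its observed absolute maximum. Below is a minimal, self-contained sketch of that idea in plain Python; it is illustrative only and does not reflect the actual implementation in `ptq2.py` or TensorRT-Model-Optimizer.

```python
# Minimal sketch of per-tensor FP8 (E4M3) scale calibration.
# Illustrative only -- not the ModelOpt implementation.
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def per_tensor_scale(values):
    """One scale for the whole tensor: amax / fp8_max."""
    amax = max(abs(v) for v in values)
    return amax / FP8_E4M3_MAX


def fake_quant(values, scale):
    """Simulated quantize/dequantize round trip using the per-tensor scale."""
    return [round(v / scale) * scale for v in values]


weights = [0.5, -1.25, 3.0, -448.0]
s = per_tensor_scale(weights)
print(s)            # scale chosen so the largest value maps onto the FP8 range
print(fake_quant(weights, s))
```

A single scale per tensor (as opposed to per-channel or per-block) is what distinguishes the `FP8_DEFAULT_CFG` per-tensor path invoked by the command above.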
