Add EXAONE 4.0 model support for Inference V2 #7853
tohtana merged 5 commits into deepspeedai:master from
Conversation
Force-pushed from bd52e9d to 400d05a
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 400d05a36a
```python
"""
tokens = hidden_states.shape[0]
local_n_heads = self.n_heads // max(self.tp_size, 1)
local_n_heads_kv = self.n_heads_kv // max(self.tp_size, 1)
```
As EXAONE4 has uneven Q/KV heads (GQA), I think this can produce incorrect results. Shouldn't we use these?

- `self.n_heads_q_local` instead of `self.n_heads // self.tp_size`
- `self.n_heads_kv_local` instead of `self.n_heads_kv // self.tp_size`
@tohtana Thanks for the review! I've updated the code to use `n_heads_q_local` and `n_heads_kv_local`. I'll validate the model with coherent text generation and share the results.
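For context, here is a minimal standalone sketch (not the actual DeepSpeed code; head counts are illustrative, not taken from the EXAONE 4.0 config) of why the per-rank KV head count needs special handling under GQA: query heads shard evenly across tensor-parallel ranks, but naive floor division breaks for KV heads once the TP degree exceeds the KV head count.

```python
def local_heads(n_heads_q: int, n_heads_kv: int, tp_size: int) -> tuple[int, int]:
    """Per-rank head counts for grouped-query attention under tensor parallelism.

    Query heads are sharded across ranks. KV heads are sharded too, but when
    tp_size exceeds n_heads_kv, each rank must hold a (replicated) KV head
    rather than the zero that naive floor division would yield.
    """
    assert n_heads_q % tp_size == 0, "query heads must divide evenly across ranks"
    q_local = n_heads_q // tp_size
    kv_local = max(n_heads_kv // tp_size, 1)  # replicate KV heads if tp_size > n_heads_kv
    return q_local, kv_local

print(local_heads(64, 8, 8))   # (8, 1): 8 query heads and 1 KV head per rank
print(local_heads(64, 8, 16))  # (4, 1): naive 8 // 16 would give 0 KV heads
```

This is the failure mode the `n_heads_kv_local` accessor guards against, and it is why treating Q and KV head counts symmetrically is unsafe for GQA models.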
Signed-off-by: Bias92 <pewpewplay315@gmail.com>
Use n_heads_q_local and n_heads_kv_local for GQA compatibility
Signed-off-by: Bias92 <pewpewplay315@gmail.com>
Force-pushed from fced31a to 8ece3c1

Summary
Add support for LG AI Research's EXAONE 4.0 model family in DeepSpeed Inference V2.
Closes #7453
Changes
- `deepspeed/inference/v2/model_implementations/exaone4/container.py`: Transformer and non-transformer parameter containers
- `model.py`: Inference model with post-norm architecture and QK-Norm support
- `policy.py`: Inference V2 policy
- `engine_factory.py` and `__init__.py`

Key architectural differences from Mistral/Llama
- … (`layer_types` config)

Supported models
Requires `transformers >= 4.54.0`.

Related
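As a rough illustration of the QK-Norm mentioned in the changes above (a simplified sketch, not the DeepSpeed kernel; function names and the omission of the learned scale are my own): each query and key head is RMS-normalized before the attention logit is computed, which makes the logit insensitive to the magnitude of the raw projections.

```python
import math

def rms_norm(x: list[float], eps: float = 1e-6) -> list[float]:
    """RMSNorm over one head's feature vector (learned scale omitted for brevity)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def qk_norm_score(q: list[float], k: list[float]) -> float:
    """Attention logit for one (query, key) pair with QK-Norm applied first."""
    qn, kn = rms_norm(q), rms_norm(k)
    scale = 1.0 / math.sqrt(len(q))  # standard 1/sqrt(d_head) scaling
    return scale * sum(a * b for a, b in zip(qn, kn))

# The normalization makes the logit (nearly) invariant to rescaling q or k.
q, k = [0.5, -1.0, 2.0, 0.1], [1.0, 0.3, -0.7, 0.9]
print(abs(qk_norm_score(q, k) - qk_norm_score([10 * v for v in q], k)) < 1e-5)  # True
```

In practice this normalization is applied per attention head inside the model's forward pass, before the scaled dot-product attention.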