[Feature Request] Enable KV head replication in NeMo TRTLLM exporter #290

@snowmanwwg

Request from https://requests-navigator.atlassian.net/jira/servicedesk/projects/NFS/queues/custom/237/NFS-320
Currently, a tensor-parallel size (TP) larger than the number of KV heads (TP > kv_head) is not supported. We need to implement KV head replication to support it.
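
To make the request concrete, here is a minimal sketch of what KV head replication could look like. It is not the NeMo or TensorRT-LLM implementation. The function names (`replicate_kv_heads`, `shard_for_rank`) and the weight layout (a list of rows, with each KV head occupying `head_dim` consecutive rows) are assumptions made for illustration only. The idea is that when TP exceeds the number of KV heads, each head is copied `tp_size // num_kv_heads` times so that every rank receives one full head.

```python
def replicate_kv_heads(kv_rows, num_kv_heads, tp_size):
    """Replicate KV heads so the rows can be split evenly across tp_size ranks.

    kv_rows: list of weight rows, head-major (hypothetical layout:
             head 0 rows first, then head 1 rows, and so on).
    """
    if len(kv_rows) % num_kv_heads != 0:
        raise ValueError("row count must be a multiple of num_kv_heads")
    if tp_size <= num_kv_heads:
        # No replication needed; the usual even split applies.
        return list(kv_rows)
    if tp_size % num_kv_heads != 0:
        raise ValueError("tp_size must be a multiple of num_kv_heads")

    head_dim = len(kv_rows) // num_kv_heads
    reps = tp_size // num_kv_heads

    replicated = []
    for h in range(num_kv_heads):
        head = kv_rows[h * head_dim:(h + 1) * head_dim]
        # Copies of one head are kept adjacent, so consecutive ranks share it.
        for _ in range(reps):
            replicated.extend(head)
    return replicated


def shard_for_rank(rows, tp_size, rank):
    """Return the contiguous slice of rows owned by one tensor-parallel rank."""
    per_rank = len(rows) // tp_size
    return rows[rank * per_rank:(rank + 1) * per_rank]


# Example: 2 KV heads, head_dim = 3, TP = 4.
kv = [[h * 10 + d] for h in range(2) for d in range(3)]
replicated = replicate_kv_heads(kv, num_kv_heads=2, tp_size=4)
shards = [shard_for_rank(replicated, 4, r) for r in range(4)]
```

In this example, ranks 0 and 1 each hold a copy of KV head 0, and ranks 2 and 3 each hold a copy of KV head 1. Keeping the copies adjacent matches a contiguous split of query heads across ranks, assuming query heads are grouped by the KV head they share.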

"Users are blocked because the NeMo -> TRT-LLM converter does not support KV head replication yet. When converting their model checkpoint, they hit https://github.com/NVIDIA/NeMo/blob/main/nemo/export/trt_llm/converter/utils.py#L487-L491"
