Request from https://requests-navigator.atlassian.net/jira/servicedesk/projects/NFS/queues/custom/237/NFS-320

Currently, a tensor-parallel size (TP) larger than the number of KV heads (TP > kv_head) is not supported. We need to implement KV head replication to support it.

From the request: users are blocked because the nemo -> trtllm converter does not support KV head replication yet. When converting their model checkpoint, they hit https://github.com/NVIDIA/NeMo/blob/main/nemo/export/trt_llm/converter/utils.py#L487-L491
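
To make the intended behavior concrete, here is a minimal sketch of KV head replication. It is not the converter's actual code: the function name `replicate_kv_heads` is hypothetical, each KV head is represented by an opaque per-head object (for example a weight slice), and it assumes that TP is an exact multiple of the number of KV heads and that adjacent ranks should share the same head.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def replicate_kv_heads(kv_heads: Sequence[T], tp_size: int) -> List[T]:
    """Return the KV heads to shard across `tp_size` ranks (hypothetical helper).

    When tp_size exceeds the number of KV heads, each head is repeated so that
    every rank ends up with exactly one head.
    """
    num_kv_heads = len(kv_heads)
    if num_kv_heads == 0:
        raise ValueError("at least one KV head is required")
    if tp_size <= num_kv_heads:
        # No replication needed: the heads can be split across ranks as usual.
        return list(kv_heads)
    if tp_size % num_kv_heads != 0:
        # Assumption of this sketch: only exact multiples are handled.
        raise ValueError(
            f"tp_size ({tp_size}) must be a multiple of num_kv_heads ({num_kv_heads})"
        )
    reps = tp_size // num_kv_heads
    # Repeat each head `reps` times in place, so consecutive ranks share a head.
    return [head for head in kv_heads for _ in range(reps)]


# Example: 2 KV heads, TP = 4. Rank i receives sharded[i].
sharded = replicate_kv_heads(["kv0", "kv1"], tp_size=4)
```

With 2 KV heads and TP = 4, ranks 0 and 1 receive a copy of head 0 and ranks 2 and 3 receive a copy of head 1. Repeating heads in place (rather than tiling the whole list) keeps each copy next to the contiguous group of query heads that attends to it, assuming query heads are sharded contiguously across ranks.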