Conversation

@tango4j tango4j commented Sep 3, 2025

What does this PR do ?

This PR adds bf16 precision training and inference for Sortformer diarizer models.
Hardware: bf16 operations are natively supported starting with the Ampere architecture (e.g., A100).

Collection: ASR/speaker_task

Changelog

NeMo/nemo/collections/asr/losses/bce_loss.py
NeMo/examples/speaker_tasks/diarization/neural_diarizer/e2e_diarize_speech.py

Usage

Although the model weights are FP32, the e2e_diarize_speech.py script automatically casts them to bf16 and then runs inference in bf16.

python $BASEPATH/neural_diarizer/e2e_diarize_speech.py \
    precision="bf16" \
    model_path=/path/to/diar_sortformer_4spk_v1.nemo \
    batch_size=1 \
    dataset_manifest=/path/to/diarization_manifest.json

For training, specify the following configuration:

trainer.precision="bf16"
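As a rough illustration of what the script's FP32-to-bf16 conversion amounts to, here is a minimal PyTorch sketch with a toy module (not the actual Sortformer code; the layer and shapes are illustrative):

```python
import torch

# Toy stand-in for a diarizer model; weights start in FP32.
model = torch.nn.Linear(8, 4)

# Cast the FP32 weights to bf16, mirroring what the Usage section
# describes the inference script doing automatically.
model = model.to(torch.bfloat16)

# Inputs must match the compute dtype.
x = torch.randn(2, 8, dtype=torch.bfloat16)

with torch.inference_mode():
    y = model(x)

print(y.dtype)  # torch.bfloat16
```

On Ampere and newer GPUs this kind of cast halves activation memory and uses the bf16 tensor cores; on older hardware bf16 falls back to slower emulated kernels.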

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contain specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

KunalDhawan previously approved these changes Sep 3, 2025

@KunalDhawan KunalDhawan left a comment

LGTM, thanks Taejin!

@@ -82,6 +82,7 @@ class DiarizationConfig:
no_der: bool = False
out_rttm_dir: Optional[str] = None
save_preds_tensors: bool = False
precision: str = "bf16" # 32, bf16
Let's also add the bf16-mixed option and maybe add a small comment about possible gains expected with bf16 training/inference?
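To sketch the suggestion, one hypothetical way to interpret the three option strings (function and key names are illustrative, not taken from the NeMo codebase):

```python
def resolve_precision(precision: str) -> dict:
    """Hypothetical mapping from a precision string to casting behavior.

    "32"         -- keep FP32 weights, no autocast
    "bf16"       -- cast weights to bf16, compute fully in bf16
    "bf16-mixed" -- keep FP32 weights, autocast matmuls to bf16
    """
    table = {
        "32": {"weights_dtype": "float32", "autocast": False},
        "bf16": {"weights_dtype": "bfloat16", "autocast": False},
        "bf16-mixed": {"weights_dtype": "float32", "autocast": True},
    }
    if precision not in table:
        raise ValueError(f"unsupported precision: {precision!r}")
    return table[precision]
```

The "bf16-mixed" case keeps an FP32 master copy of the weights (useful for training stability) while still getting most of the bf16 speed-up on matmul-heavy layers.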

@ipmedenn ipmedenn left a comment
LGTM!
