ASR model predicting blank or nothing during training stage #14583

@Houss3m

Describe the bug

I tried to train Conformer-CTC with the default config and the default training script on my custom data (about 1.5k hours). According to the logs, the model predicted dummy text at first; after about 300 steps it started predicting nothing, only the blank character. The model didn't learn anything: the training loss plateaued at around 300 and did not improve at all, and the model kept predicting nothing even after 10 epochs.
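For readers unfamiliar with why an all-blank output shows up as an empty `predicted:` line, here is a minimal sketch of greedy CTC decoding in plain Python. It is not NeMo's decoder; the token ids are made up, and placing the blank at index 1024 (one past a 1024-token vocabulary) is an assumption for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank_id):
    """Collapse consecutive repeats, then drop blanks (greedy CTC decoding)."""
    collapsed = [token for token, _ in groupby(frame_ids)]
    return [token for token in collapsed if token != blank_id]


def blank_fraction(frame_ids, blank_id):
    """Share of frames whose best-scoring id is the blank."""
    if not frame_ids:
        return 0.0
    return sum(1 for token in frame_ids if token == blank_id) / len(frame_ids)


# Hypothetical per-frame argmax ids; BLANK = 1024 is an assumption.
BLANK = 1024
healthy_frames = [1024, 7, 7, 1024, 1024, 12, 12, 12, 1024, 7]
collapsed_frames = [1024] * 10

print(ctc_greedy_decode(healthy_frames, BLANK))    # [7, 12, 7]
print(ctc_greedy_decode(collapsed_frames, BLANK))  # []
print(blank_fraction(collapsed_frames, BLANK))     # 1.0
```

Tracking the blank fraction of the per-frame argmax over training steps would show whether the model moves to all-blank output abruptly (around step 300 here) or gradually.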

I used a tokenizer with a vocabulary size of 1024. At first I suspected that the tokenizer was not working well, so I picked some text from the training set and ran it through the tokenizer; it works fine.

Then I thought there might be something wrong with my data, so I tried LibriSpeech 960h and got the same behaviour. I also tested different models and configs (TDT, CTC, hybrid) and still got the same behaviour.
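One data-side check that complements the tokenizer test is to scan the manifest for entries that CTC cannot align: empty transcripts, or transcripts with more tokens than encoder frames. The sketch below assumes the usual NeMo JSON-lines manifest keys (`audio_filepath`, `duration`, `text`) and assumes 0.04 s per encoder frame (a 10 ms feature stride with 4x subsampling); both should be checked against the actual config. `count_tokens` is a hypothetical callable standing in for the real tokenizer, and the sample lines are invented.

```python
import json


def find_suspect_entries(manifest_lines, count_tokens, frame_seconds=0.04):
    """Yield (line_number, reason) for entries with no text or too few frames.

    frame_seconds=0.04 is an assumption, not a value read from the config.
    """
    for line_number, line in enumerate(manifest_lines, start=1):
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        text = entry.get("text", "").strip()
        if not text:
            yield line_number, "empty text"
            continue
        n_frames = int(entry["duration"] / frame_seconds)
        n_tokens = count_tokens(text)
        # CTC needs at least one frame per target token.
        if n_tokens > n_frames:
            yield line_number, f"{n_tokens} tokens > {n_frames} frames"


# Invented sample entries; whitespace splitting stands in for the BPE tokenizer.
sample_manifest = [
    '{"audio_filepath": "a.wav", "duration": 2.0, "text": "hello world"}',
    '{"audio_filepath": "b.wav", "duration": 0.1, "text": "one two three four"}',
    '{"audio_filepath": "c.wav", "duration": 1.0, "text": ""}',
]

for line_number, reason in find_suspect_entries(sample_manifest, lambda t: len(t.split())):
    print(line_number, reason)
```

With a real tokenizer, `count_tokens` would return the number of token ids for the text. Since the same behaviour appeared on LibriSpeech, this check is more likely to rule the data out than to find the cause.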

Does this have something to do with the PyTorch/NeMo versions I am using, or am I doing something wrong?
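Since the version question cannot be answered without the exact versions, a small standard-library sketch for collecting them is given here. The distribution names (`torch`, `lightning`, `pytorch-lightning`, `nemo_toolkit`) are assumptions about how the packages are registered in this environment; any name that is not found is reported as such rather than raising.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, if any."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Distribution names below are assumptions; adjust them to the environment.
wanted = ["torch", "lightning", "pytorch-lightning", "nemo_toolkit"]
for name, version in installed_versions(wanted).items():
    print(f"{name}: {version}")
```

Pasting this output into the environment section of the issue gives the version details needed to compare against a known-good setup.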

Steps/Code to reproduce bug
This is how I ran my training:

python {NEMO}/examples/asr/asr_ctc/speech_to_text_ctc_bpe.py \
    --config-path={NEMO}/examples/asr/conf/conformer/ \
    --config-name=conformer_ctc_bpe \
    \
    exp_manager.name="trainingconformerctc" \
    exp_manager.resume_if_exists=true \
    exp_manager.resume_ignore_no_checkpoint=true \
    exp_manager.exp_dir={NEMO}/experiments/conformerctcls960 \
    model.tokenizer.dir=$TOKENIZER \
    model.train_ds.is_tarred=true \
    model.train_ds.tarred_audio_filepaths=$TRAIN_FILEPATHS \
    model.train_ds.manifest_filepath=$TRAIN_MANIFEST \
    model.validation_ds.manifest_filepath=$VAL_MANIFEST \
    trainer.devices=1 \
    trainer.accelerator=gpu

Expected behavior
Is this CTC collapse? Has anyone had the same issue, and if so, how did you fix it?
I expected the model to predict at least some text after 5 epochs, but after 10 epochs the model is still predicting blank.

Example from the training log:

[NeMo I 2025-08-26 15:00:15 wer:330] reference:listening to every sound and it was not until the anchor was weighed the sails hoisted and the vessel began to draw away from carthage that he went into his cabin on the sixth day after leaving carthage the ship entered the port of corinth
[NeMo I 2025-08-26 15:00:15 wer:331] predicted:
[NeMo I 2025-08-26 15:00:15 wer:329] 

Environment overview (please complete the following information)

  • Environment location: GPU-H200-141GB
  • Method of NeMo install: I cloned the original repository, and installed the necessary libraries.

Additional context

(Three images attached to the original issue; not reproduced here.)

Labels

bug (Something isn't working)