Hello there,
I followed the tutorial for fine-tuning multilingual NMT models. But in the last step, where the tutorial says to run `megatron_nmt_training.py`, it gives an error about a key (`tokenizer`) not being found in the pretrained model's config file.
I ran this command:
```
HYDRA_FULL_ERROR=1 \
python /opt/NeMo/examples/nlp/machine_translation/megatron_nmt_training.py \
  trainer.precision=32 \
  trainer.devices=1 \
  trainer.max_epochs=5 \
  trainer.max_steps=200000 \
  trainer.val_check_interval=5000 \
  trainer.log_every_n_steps=5000 \
  model.multilingual=True \
  model.pretrained_model_path=workspace/model/pretrained_ckpt/megatronnmt_any_en_500m.nemo \
  model.micro_batch_size=1 \
  model.global_batch_size=2 \
  model.encoder_tokenizer.library=sentencepiece \
  model.decoder_tokenizer.library=sentencepiece \
  model.encoder_tokenizer.model=workspace/tokenizer/spm_64k_all_32_langs_plus_en_nomoses.model \
  model.decoder_tokenizer.model=workspace/tokenizer/spm_64k_all_32_langs_plus_en_nomoses.model \
  model.src_language=['es, pt'] \
  model.tgt_language=en \
  model.train_ds.src_file_name=workspace/data/train_src_files \
  model.train_ds.tgt_file_name=workspace/data/train_tgt_files \
  model.test_ds.src_file_name=workspace/data/en_es_final_es_test_filepath \
  model.test_ds.tgt_file_name=workspace/data/en_es_final_en_test_filepath \
  model.validation_ds.src_file_name=workspace/data/val_src_files \
  model.validation_ds.tgt_file_name=workspace/data/val_tgt_files \
  model.optim.lr=0.00001 \
  model.train_ds.concat_sampling_probabilities=['0.1, 0.1'] \
  ++model.pretrained_language_list=None \
  +model.optim.sched.warmup_steps=500 \
  ~model.optim.sched.warmup_ratio \
  exp_manager.resume_if_exists=True \
  exp_manager.resume_ignore_no_checkpoint=True \
  exp_manager.create_checkpoint_callback=True \
  exp_manager.checkpoint_callback_params.monitor=val_sacreBLEU_avg \
  exp_manager.checkpoint_callback_params.mode=max \
  exp_manager.checkpoint_callback_params.save_top_k=5 \
  +exp_manager.checkpoint_callback_params.save_best_model=true
```
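One thing I noticed while re-reading the command (my own observation, not something the tutorial says): quoting the list as `['es, pt']` produces a single element `"es, pt"` rather than two language codes. A quick sketch with PyYAML, whose flow-list parsing behaves the same way for this example:

```python
# Assumption (mine): a list override written as ['es, pt'] parses as one
# quoted string, not two codes. PyYAML flow lists show the same behavior.
import yaml

quoted = yaml.safe_load("['es, pt']")   # whole list collapses to one string
plain = yaml.safe_load("[es, pt]")      # two separate scalars

print(quoted)  # ['es, pt']   -> a single element
print(plain)   # ['es', 'pt'] -> two language codes
```

If that is unintended, `model.src_language=[es,pt]` may be what the tutorial meant; the same question applies to `concat_sampling_probabilities=['0.1, 0.1']`.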
and it gives this error:
```
Traceback (most recent call last):
  File "/opt/NeMo/examples/nlp/machine_translation/megatron_nmt_training.py", line 113, in main
    pretrained_cfg.encoder_tokenizer = pretrained_cfg.tokenizer
omegaconf.errors.ConfigAttributeError: Missing key tokenizer
    full_key: tokenizer
    object_type=dict
```
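To see which tokenizer keys the checkpoint's config actually has, one can read the config packed inside it. A sketch under my assumptions: a `.nemo` file is a tar archive containing a `model_config.yaml`, and PyYAML is available. The dummy archive below (with hypothetical contents) only makes the snippet runnable end to end; point `read_nemo_config` at the real `megatronnmt_any_en_500m.nemo` instead.

```python
# Sketch: inspect the config inside a .nemo checkpoint to check whether it
# has the top-level `tokenizer` key that megatron_nmt_training.py reads.
import io
import os
import tarfile
import tempfile

import yaml

def read_nemo_config(path):
    """A .nemo file is a tar archive; the model config is model_config.yaml."""
    with tarfile.open(path, "r:*") as tar:
        for member in tar.getmembers():
            if member.name.endswith("model_config.yaml"):
                return yaml.safe_load(tar.extractfile(member).read())
    raise FileNotFoundError("model_config.yaml not found inside archive")

# Build a tiny stand-in archive (hypothetical contents) so this runs as-is.
cfg_bytes = yaml.safe_dump(
    {"encoder_tokenizer": {"library": "sentencepiece"},
     "decoder_tokenizer": {"library": "sentencepiece"}}
).encode()
nemo_path = os.path.join(tempfile.mkdtemp(), "dummy.nemo")
with tarfile.open(nemo_path, "w") as tar:
    info = tarfile.TarInfo("model_config.yaml")
    info.size = len(cfg_bytes)
    tar.addfile(info, io.BytesIO(cfg_bytes))

cfg = read_nemo_config(nemo_path)
# A config with only encoder/decoder tokenizer sections (like this one) would
# trip the `pretrained_cfg.tokenizer` access in the traceback above.
print("tokenizer" in cfg)
print(sorted(cfg.keys()))
```

If the real checkpoint's config likewise has `encoder_tokenizer`/`decoder_tokenizer` sections but no plain `tokenizer` key, that would explain the `Missing key tokenizer` error.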