Description
System Info
I use the Linux platform and my Python environment matches requirements.txt.
During multi-GPU training of the PRM, the process always hangs at some point in the evaluation loop. After debugging, I found that it is caused by the provided "preprocess_logits_for_metrics" function in prm/code/finetune_qwen.py. Strangely, evaluation works fine in single-GPU training.
After some testing, I found that this only happens when the two GPUs are preprocessing data samples with different numbers of steps; as shown below, the final logits differ in shape across GPUs after we take out the step tags.
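For illustration, here is a minimal sketch of the kind of function I mean (not the exact code in finetune_qwen.py; `step_tag_id` is a placeholder token id). Because the function keeps only the logits at step-tag positions, the size of the returned tensor depends on how many step tags each rank's batch contains:

```python
import torch

def preprocess_logits_for_metrics(logits, labels, step_tag_id=12902):
    # labels: (batch, seq_len); positions equal to step_tag_id mark the steps.
    mask = labels == step_tag_id
    # Each rank can see a different number of step tags, so `selected`
    # ends up with a different first dimension on each GPU.
    selected = logits[mask]  # (num_step_tags_in_batch, vocab_size)
    return selected
```

When the Trainer later gathers these tensors across ranks during distributed evaluation, the mismatched shapes make the collective hang.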

For now, the bug can be worked around by adding padding code in Transformers' Trainer and a few small tricks, but I wonder whether it could be fixed permanently, or perhaps by setting some parameters.
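As an alternative to patching the Trainer itself, one option might be to pad inside preprocess_logits_for_metrics. A rough sketch under the assumption that MAX_STEPS (a made-up upper bound) is never exceeded; compute_metrics would then have to drop the padded rows again:

```python
import torch.nn.functional as F

MAX_STEPS = 64  # assumption: no eval batch contains more step tags than this

def preprocess_logits_for_metrics(logits, labels, step_tag_id=12902):
    mask = labels == step_tag_id
    selected = logits[mask]  # (num_step_tags_in_batch, vocab_size)
    pad_rows = MAX_STEPS - selected.shape[0]
    # Pad the step dimension with a sentinel value so every rank returns
    # (MAX_STEPS, vocab_size); compute_metrics must mask these rows out.
    return F.pad(selected, (0, 0, 0, pad_rows), value=-100.0)
```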
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the codebase (such as scripts/, ...)
- My own task or dataset (give details below)
Reproduction
Simply run finetune_qwen.py with a multi-GPU setting.
Expected behavior
The evaluation loop should complete without hanging in the multi-GPU setting.


