An officially supported task in the examples folder
My own task or dataset (give details below)
Reproduction
This issue was initially reported in the Hugging Face transformers repo here: huggingface/transformers#29348
I can probably put together a fix for trl when I have some more free time if y'all are interested, since I understand the behaviour now.
Current Behaviour
The base Hugging Face Trainer calls hf_deepspeed_config.trainer_config_finalize(args, model, num_training_steps) to replace the "auto" values of total_num_steps and warmup_num_steps with their calculated values during the inner training loop (once total_num_steps is known). However, in DPOTrainer, if total_num_steps is set to "auto", the trainer crashes when deepspeed.initialize is called while wrapping the ref model at self.ref_model = self._prepare_deepspeed(self.ref_model), because that happens in __init__, before the "auto" values have been resolved.
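To make the failure mode concrete, here is a small self-contained sketch (the helper name finalize_scheduler_params is illustrative, not TRL's or transformers' actual API) showing why comparing an unresolved "auto" crashes, and how a trainer_config_finalize-style pass would resolve the values first:

```python
# Illustrative sketch only: the helper below mimics what
# trainer_config_finalize-style resolution would do; it is not TRL code.
import copy

def finalize_scheduler_params(ds_config, num_training_steps, num_warmup_steps):
    """Replace "auto" scheduler values with concrete step counts."""
    config = copy.deepcopy(ds_config)
    sched = config.get("scheduler", {}).get("params", {})
    if sched.get("total_num_steps") == "auto":
        sched["total_num_steps"] = num_training_steps
    if sched.get("warmup_num_steps") == "auto":
        sched["warmup_num_steps"] = num_warmup_steps
    return config

ds_config = {
    "scheduler": {
        "type": "WarmupDecayLR",
        "params": {"total_num_steps": "auto", "warmup_num_steps": "auto"},
    }
}

# DeepSpeed's scheduler constructor effectively evaluates
# `total_num_steps < warmup_num_steps`, which raises TypeError while
# both values are still the string "auto":
params = ds_config["scheduler"]["params"]
try:
    params["total_num_steps"] < 10
except TypeError as e:
    print(type(e).__name__)  # TypeError

# After resolution, the comparison is well defined:
fixed = finalize_scheduler_params(ds_config, num_training_steps=1000,
                                  num_warmup_steps=100)
print(fixed["scheduler"]["params"])
```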
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOTrainer, DPOConfig # Make sure you have this module
from datasets import load_dataset
# Load your LLaMA 2 model and tokenizer
model_name = "/home/b3schnei/pretrained/Llama-2-7b" # Change this to the specific LLaMA 2 model you want to use
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Load your reference model (if applicable)
ref_model = AutoModelForCausalLM.from_pretrained(model_name)
# Define training arguments
training_args = DPOConfig(
learning_rate=2e-4,
num_train_epochs=1,
per_device_train_batch_size=4,
output_dir='./results',
logging_steps=10,
remove_unused_columns=False,
max_length=1024,
max_prompt_length=512,
fp16=True,
deepspeed="/home/b3schnei/transformers_debug/debug/29348/ds_config.json" # Ensure you have this configuration file
)
train_dataset = load_dataset("json", data_files="debug/29348/dpo.json", split="train")
# Initialize the DPOTrainer
dpo_trainer = DPOTrainer(
model=model,
ref_model=ref_model,
train_dataset=train_dataset,
tokenizer=tokenizer,
args=training_args,
)
# Start training
if __name__ == "__main__":
dpo_trainer.train()
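The referenced ds_config.json is not included above; an assumed minimal ZeRO-3 config that would reproduce the crash (the key ingredient is a scheduler whose step counts are still "auto") might look like:

```json
{
  "zero_optimization": { "stage": 3 },
  "fp16": { "enabled": "auto" },
  "scheduler": {
    "type": "WarmupDecayLR",
    "params": {
      "warmup_num_steps": "auto",
      "total_num_steps": "auto"
    }
  },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```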
Crash log
[2024-10-02 01:18:15,497] [INFO] [utils.py:789:see_memory_usage] CPU Virtual Memory: used = 121.5 GB, percent = 12.1%
[2024-10-02 01:18:15,497] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedZeroOptimizer_Stage3
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 3489, in <module>
[rank0]: main()
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 3482, in main
[rank0]: globals = debugger.run(setup['file'], None, None, is_module)
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 2510, in run
[rank0]: return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 2517, in _exec
[rank0]: globals = pydevd_runpy.run_path(file, globals, '__main__')
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
[rank0]: return _run_module_code(code, init_globals, run_name,
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
[rank0]: _run_code(code, mod_globals, init_globals,
[rank0]: File "/home/b3schnei/.vscode-server/extensions/ms-python.debugpy-2024.10.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
[rank0]: exec(code, run_globals)
[rank0]: File "/home/b3schnei/transformers_debug/debug/29348/reproduce.py", line 34, in <module>
[rank0]: dpo_trainer = DPOTrainer(
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
[rank0]: return f(*args, **kwargs)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 883, in __init__
[rank0]: self.ref_model = self._prepare_deepspeed(self.ref_model)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 924, in _prepare_deepspeed
[rank0]: model, *_ = deepspeed.initialize(model=model, config=config_kwargs)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/deepspeed/__init__.py", line 181, in initialize
[rank0]: engine = DeepSpeedEngine(args=args,
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 307, in __init__
[rank0]: self._configure_lr_scheduler(lr_scheduler)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 907, in _configure_lr_scheduler
[rank0]: lr_scheduler = self._scheduler_from_config(self.optimizer)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 962, in _scheduler_from_config
[rank0]: instantiated_scheduler = scheduler(optimizer, **scheduler_params)
[rank0]: File "/home/b3schnei/anaconda3/envs/test_transformers/lib/python3.10/site-packages/deepspeed/runtime/lr_schedules.py", line 758, in __init__
[rank0]: if self.total_num_steps < self.warmup_num_steps:
[rank0]: TypeError: '<' not supported between instances of 'str' and 'int'
Expected behavior
I expect DPOTrainer to initialize under ZeRO-3 when ds_config values are set to "auto", just like transformers' Trainer does.
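Until a proper fix lands, one possible workaround direction (a sketch only, not TRL's actual _prepare_deepspeed logic) is to strip the optimizer and scheduler sections from the config used for the reference model, since the ref model is only run for inference and never takes an optimizer step:

```python
# Hypothetical workaround sketch, not an official TRL patch: the reference
# model needs no LR scheduler, so dropping the training-only sections
# sidesteps the "auto" comparison inside DeepSpeed's scheduler constructor.
def strip_training_sections(ds_config):
    """Return a copy of the config without optimizer/scheduler entries."""
    return {k: v for k, v in ds_config.items()
            if k not in ("optimizer", "scheduler")}

ref_config = {
    "zero_optimization": {"stage": 3},
    "scheduler": {
        "type": "WarmupDecayLR",
        "params": {"total_num_steps": "auto", "warmup_num_steps": "auto"},
    },
}
print(strip_training_sections(ref_config))  # {'zero_optimization': {'stage': 3}}
```

The resulting config could then be passed to deepspeed.initialize for the ref model without triggering the scheduler construction path shown in the traceback.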