This folder contains the code for SPO training and inference.
- Pull the Docker image:
```bash
sudo docker pull rockeycoss/spo:v1
```
- Run the Docker container and enter it:
```bash
sudo docker run --gpus all -it --ipc=host rockeycoss/spo:v1 /bin/bash
```
- Clone the repository:
```bash
git clone https://github.com/RockeyCoss/SPO
cd ./SPO/spo_training_and_inference
```
- Log in to wandb:
```bash
wandb login {Your wandb key}
```
- (Optional) To customize the location for saving models downloaded from Hugging Face, you can use the following command:
```bash
export HUGGING_FACE_CACHE_DIR=/path/to/your/cache/dir
```
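How the scripts consume this variable is not shown here; below is a minimal sketch of the likely pattern, assuming the variable is read at runtime and forwarded to the standard `cache_dir` argument of `from_pretrained`. Everything except the variable name is illustrative:
```python
import os

from diffusers import StableDiffusionPipeline

# Use the custom cache directory if the user exported it;
# otherwise fall back to the default Hugging Face cache.
cache_dir = os.environ.get('HUGGING_FACE_CACHE_DIR')

# cache_dir controls where downloaded weights are stored.
pipe = StableDiffusionPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5',
    cache_dir=cache_dir,
)
```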
SDXL inference (setting PYTHONPATH=$(pwd) lets the script import modules from the repository root):
```bash
PYTHONPATH=$(pwd) python inference_scripts/inference_spo_sdxl.py
```
SD v1.5 inference:
```bash
PYTHONPATH=$(pwd) python inference_scripts/inference_spo_sd-v1-5.py
```
The fine-tuning scripts below assume four 80GB A100 GPUs, as described in the paper; for a different hardware setup, adjust the accelerate config in accelerate_cfg/ accordingly.
Before fine-tuning, download the checkpoints of the step-aware preference models:
```bash
sudo apt update
sudo apt install wget
mkdir model_ckpts
cd model_ckpts
wget https://huggingface.co/SPO-Diffusion-Models/Step-Aware_Preference_Models/resolve/main/sd-v1-5_step-aware_preference_model.bin
wget https://huggingface.co/SPO-Diffusion-Models/Step-Aware_Preference_Models/resolve/main/sdxl_step-aware_preference_model.bin
cd ..
```
To fine-tune SD v1.5, you can use the following command:
```bash
PYTHONPATH=$(pwd) accelerate launch --config_file accelerate_cfg/1m4g_fp16.yaml train_scripts/train_spo.py --config configs/spo_sd-v1-5_4k-prompts_num-sam-4_10ep_bs10.py
```
To fine-tune SDXL, you can use the following command:
```bash
PYTHONPATH=$(pwd) accelerate launch --config_file accelerate_cfg/1m4g_fp16.yaml train_scripts/train_spo_sdxl.py --config configs/spo_sdxl_4k-prompts_num-sam-2_3-is_10ep_bs2_gradacc2.py
```
To fine-tune using step-aware preference model checkpoints you have trained with the code in `step_aware_preference_model`, update the `config.preference_model_func_cfg.ckpt_path` setting in the config file to point to your checkpoint. For example, you can modify this setting in the SDXL fine-tuning config, as sketched below.
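A minimal sketch of that edit, assuming an ml_collections-style config object with the structure named above (the exact layout of the repository's config files may differ, and the checkpoint path is a placeholder):
```python
# Sketch of the relevant part of the SDXL fine-tuning config
# (assumes an ml_collections ConfigDict; the actual structure may differ).
import ml_collections

config = ml_collections.ConfigDict()
config.preference_model_func_cfg = ml_collections.ConfigDict()
# Replace the placeholder path with your own trained checkpoint.
config.preference_model_func_cfg.ckpt_path = 'model_ckpts/my_sdxl_step-aware_preference_model.bin'
```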
Released checkpoints (hosted on Hugging Face):
- SPO-SDXL_4k-prompts_10-epochs_LoRA
- SPO-SD-v1-5_4k-prompts_10-epochs
- SPO-SD-v1-5_4k-prompts_10-epochs_LoRA
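For reference, a sketch of loading the SDXL LoRA checkpoint with diffusers. The Hugging Face repository id below is an assumption pieced together from the SPO-Diffusion-Models organization seen in the download URLs above and the checkpoint name in the list, so verify it before use:
```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base model, then apply the SPO LoRA weights on top of it.
pipe = StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
).to('cuda')

# Assumed repo id: the SPO-Diffusion-Models organization plus the
# checkpoint name listed above.
pipe.load_lora_weights('SPO-Diffusion-Models/SPO-SDXL_4k-prompts_10-epochs_LoRA')

image = pipe(prompt='a starry night over a calm lake').images[0]
image.save('spo_sdxl_sample.png')
```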
If you find this code useful in your research, please consider citing:
```bibtex
@article{liang2024step,
  title={Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization},
  author={Liang, Zhanhao and Yuan, Yuhui and Gu, Shuyang and Chen, Bohan and Hang, Tiankai and Cheng, Mingxi and Li, Ji and Zheng, Liang},
  journal={arXiv preprint arXiv:2406.04314},
  year={2024}
}
```