This folder contains the code for training the step-aware preference model. The codebase is based on PickScore.
- Pull the Docker Image
sudo docker pull rockeycoss/spo:v1
- Run the Docker Container and Enter It
sudo docker run --gpus all -it --ipc=host rockeycoss/spo:v1 /bin/bash
- Clone the Repository
git clone https://github.com/RockeyCoss/SPO
cd ./SPO/step_aware_preference_model
- Install Dependencies
pip uninstall peft -y
pip install -r requirements.txt
- Login to Weights & Biases (wandb)
wandb login {Your wandb key}
- (Optional) To change where models downloaded from Hugging Face are saved, set the following environment variable:
export HUGGING_FACE_CACHE_DIR=/path/to/your/cache/dir
The training data is the Pick-a-Pic v1 dataset, which can be downloaded in Python with:
from datasets import load_dataset
dataset = load_dataset("yuvalkirstain/pickapic_v1", num_proc=64)
For more details, please visit the PickScore GitHub repository.
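If a custom cache directory is set, the same location can also be passed explicitly when downloading the dataset. The following is a minimal sketch, not part of the repository: the helper names `resolve_cache_dir` and `load_pickapic` are hypothetical, and it assumes that reusing `HUGGING_FACE_CACHE_DIR` for the dataset cache is what you want. Whether the training scripts themselves read this variable for datasets is not confirmed here.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_cache_dir(env_var: str = "HUGGING_FACE_CACHE_DIR") -> Optional[str]:
    """Return the cache directory named by env_var, or None if it is unset.

    The directory is created if it does not exist yet.
    """
    value = os.environ.get(env_var, "").strip()
    if not value:
        return None
    path = Path(value).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    return str(path)


def load_pickapic(num_proc: int = 8):
    """Download Pick-a-Pic v1 into the resolved cache directory.

    Assumption: `datasets` is installed (it is imported lazily here so the
    helper above can be used without it). With cache_dir=None, the library
    falls back to its default cache location.
    """
    from datasets import load_dataset

    return load_dataset(
        "yuvalkirstain/pickapic_v1",
        cache_dir=resolve_cache_dir(),
        num_proc=num_proc,
    )
```

Calling `load_pickapic()` after the `export` line above would place the dataset under the same directory as the downloaded models. Lower `num_proc` if the machine has fewer CPU cores than the 64 workers used in the original command.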
The following scripts assume the use of four 80GB A100 GPUs for training, as described in the paper.
To train the step-aware preference model for SD v1.5, please use the following command:
bash run_commands/train_spm_sd15.sh
To train the step-aware preference model for SDXL, please use the following command:
bash run_commands/train_spm_sdxl.sh
The final checkpoints, i.e., work_dirs/sdv15_spm/final_ckpt.bin and work_dirs/sdxl_spm/final_ckpt.bin, can be used for SPO training. Please refer to this for more details.
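Before starting SPO training, it can help to confirm that both checkpoints were actually written. The sketch below is an illustration only: the helper `missing_checkpoints` is hypothetical, and it assumes the paths above are relative to the directory from which the training scripts were launched.

```python
from pathlib import Path
from typing import List, Union

# Checkpoint paths as given in this README, relative to the launch directory.
EXPECTED_CHECKPOINTS = (
    "work_dirs/sdv15_spm/final_ckpt.bin",
    "work_dirs/sdxl_spm/final_ckpt.bin",
)


def missing_checkpoints(root: Union[str, Path] = ".") -> List[Path]:
    """Return the expected checkpoint paths that do not exist under root."""
    root = Path(root)
    return [root / rel for rel in EXPECTED_CHECKPOINTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("all final checkpoints found")
```

If only one of the two models was trained, the other path will be reported as missing, which is expected.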
If you find this code useful in your research, please consider citing:
@article{liang2024step,
title={Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization},
author={Liang, Zhanhao and Yuan, Yuhui and Gu, Shuyang and Chen, Bohan and Hang, Tiankai and Cheng, Mingxi and Li, Ji and Zheng, Liang},
journal={arXiv preprint arXiv:2406.04314},
year={2024}
}