
nusnlp/RationAlign

The repository for the paper "Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment"


Create the initial training data

  1. Download the training data (Table 1 of the paper) from the official website of each dataset.
  2. Run generate_testing_data.py to produce the initial training data (see the sketch below).
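
For example, assuming the downloaded corpora are placed where generate_testing_data.py expects them (the script's input paths are not documented here, so check the script before running):

```bash
# Place the Table 1 datasets (downloaded from their official websites) where
# generate_testing_data.py expects them, then build the initial training data.
python generate_testing_data.py
```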

Get the rationale

  1. Run call_openai.py to generate the rationales.
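
The script presumably needs OpenAI API credentials; OPENAI_API_KEY below is the conventional environment variable, not something this README specifies, so check how call_openai.py actually reads the key:

```bash
export OPENAI_API_KEY="sk-..."   # assumption: the script reads the key from the environment
python call_openai.py            # query the OpenAI API for rationales
```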

Build the SFT model

  1. Go to the finetune folder and install the dependencies: pip install -r requirements.txt
  2. Build the SFT model by running finetune.without.mpo.sh
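
Put together, assuming you start from the repository root:

```bash
cd finetune
pip install -r requirements.txt   # install the fine-tuning dependencies
bash finetune.without.mpo.sh      # train the SFT model
```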

Build the reward model

  1. Split the training data into 5 folds.
  2. Go to the finetune folder and run finetune.without.mpo.sh on each fold.
  3. Run export_hf_checkpoint.py to convert the LoRA weights into Hugging Face format.
  4. Go to the vllm_inference/inference_scripts folder.
  5. Run the prediction on the held-out set using prompt.wa.mix.sh.
  6. Calculate the scores using the task-specific evaluation metrics.
  7. Go to the trl/examples/scripts/scripts folder.
  8. Build the reward model by running train.exp.sh (see the sketch below).
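
Putting the eight steps together, a sketch under some assumptions: there are five folds, each fine-tuning run is pointed at its fold's split by configuring the script (the fold mechanism is not specified in this README), and export_hf_checkpoint.py lives in the finetune folder:

```bash
# Steps 1-2: train one SFT model per fold.
cd finetune
for fold in 0 1 2 3 4; do
    # configure finetune.without.mpo.sh for fold ${fold}'s split, then:
    bash finetune.without.mpo.sh
done

# Step 3: merge each fold's LoRA weights into Hugging Face format.
python export_hf_checkpoint.py

# Steps 4-5: predict on each fold's held-out set.
cd ../vllm_inference/inference_scripts
bash prompt.wa.mix.sh

# Steps 6-8: score the predictions with the task-specific metrics, then
# train the reward model on those scores.
cd ../../trl/examples/scripts/scripts
bash train.exp.sh
```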

Build the SFT+Self-Alignment model

  1. Convert the SFT model's LoRA weights into Hugging Face format using export_hf_checkpoint.py.
  2. Go to the trl/examples/scripts/scripts folder and run inference.sh to obtain the reward values and build the preference data.
  3. Go to the finetune folder and run finetune.with.mapo.sh to obtain the final writing assistance (WA) model (see the sketch below).
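
As a sketch, assuming the repository root as the working directory and that export_hf_checkpoint.py sits in the finetune folder (both assumptions):

```bash
# Step 1: merge the SFT model's LoRA weights into Hugging Face format.
cd finetune
python export_hf_checkpoint.py

# Step 2: compute reward values and build the preference data.
cd ../trl/examples/scripts/scripts
bash inference.sh

# Step 3: train the final WA model on the preference data.
cd ../../../../finetune
bash finetune.with.mapo.sh
```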

Inference

  1. Go to the vllm_inference/inference_scripts folder.
  2. Run run_test.accept.input.exp.new.format.sh.
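
From the repository root:

```bash
cd vllm_inference/inference_scripts
bash run_test.accept.input.exp.new.format.sh   # generate predictions with the trained WA model
```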

Trained Model and Training Data

To be uploaded soon.

Citation

If you find our paper or code useful, please cite:

@inproceedings{cao-etal-2025-rationalize,
    title = "Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment",
    author = "Cao, Hannan and
      Ye, Hai and
      Ng, Hwee Tou",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    year = "2025",
}
