- Please download the training data (Table 1 of the paper) from the official website
- Run generate_testing_data.py to obtain the training data
- Run call_openai.py
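
A minimal sketch of this preparation stage; the absence of command-line arguments and the use of the OPENAI_API_KEY environment variable are assumptions, so check each script before running:

```bash
# Hedged sketch; both scripts may take arguments (paths, model names) not shown here.
python generate_testing_data.py   # format the downloaded data into training examples
export OPENAI_API_KEY=...         # assumed: call_openai.py reads the key from the environment
python call_openai.py             # query the OpenAI API as described in the paper
```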
- Go to the finetune folder and install the dependencies:

```bash
pip install -r requirements.txt
```
- Build the SFT model by running finetune.without.mpo.sh
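
The SFT step might look like the following; running the script from inside the finetune folder with no extra arguments is an assumption:

```bash
cd finetune
bash finetune.without.mpo.sh   # trains the SFT model with LoRA adapters
```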
- Split the training data into 5 folds
- Go to the finetune folder and run finetune.without.mpo.sh on each fold
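
A sketch of the 5-fold split and per-fold training, assuming the training data is one example per line in a file named train.jsonl (hypothetical name) and that finetune.without.mpo.sh can be pointed at a per-fold training file:

```bash
# Split into 5 folds (GNU coreutils); train.jsonl and fold_00 ... fold_04 are assumed names.
split -d -n l/5 train.jsonl fold_

cd finetune
for i in 00 01 02 03 04; do
  # concatenate the other 4 folds for training; fold_${i} is held out for this run
  ls ../fold_* | grep -v "fold_${i}" | xargs cat > "train_wo_${i}.jsonl"
  bash finetune.without.mpo.sh   # assumed: point the script at train_wo_${i}.jsonl
done
```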
- Run export_hf_checkpoint.py to convert the LoRA weights into Hugging Face format
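
The conversion step, assuming export_hf_checkpoint.py picks up the base-model and LoRA paths from its own configuration or arguments:

```bash
python export_hf_checkpoint.py   # merge each fold's LoRA weights into a full Hugging Face checkpoint
```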
- Go to the vllm_inference/inference_scripts folder
- Run prediction on the held-out set using prompt.wa.mix.sh
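
A hedged invocation for the held-out prediction; running the script in place with the merged checkpoint already configured is an assumption:

```bash
cd vllm_inference/inference_scripts
bash prompt.wa.mix.sh   # generate predictions on each fold's held-out set with vLLM
```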
- Calculate the score using the task-specific evaluation metrics
- Go to the trl/examples/scripts/scripts folder
- Build the reward model by running train.exp.sh
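
Reward-model training might look like this; the working directory and the absence of extra arguments are assumptions:

```bash
cd trl/examples/scripts/scripts
bash train.exp.sh   # train the reward model
```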
- Convert the SFT model's LoRA weights into Hugging Face format using export_hf_checkpoint.py
- Go to the trl/examples/scripts/scripts folder and run inference.sh to obtain the reward values and build the preference data
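
A sketch of building the preference data; the location of export_hf_checkpoint.py and the assumption that inference.sh itself writes out the preference data should be checked against the scripts:

```bash
python export_hf_checkpoint.py   # merge the SFT LoRA weights (run from wherever the script lives)
cd trl/examples/scripts/scripts
bash inference.sh                # score candidate outputs with the reward model and build the preference data
```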
- Go to the finetune folder and run finetune.with.mapo.sh to obtain the final WA model
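
The final alignment step, again assuming the script is run from the finetune folder with its data paths already configured:

```bash
cd finetune
bash finetune.with.mapo.sh   # uses the preference data to obtain the final WA model
```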
- Go to the vllm_inference/inference_scripts folder
- Run run_test.accept.input.exp.new.format.sh
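
Final inference might be invoked as follows; no additional arguments are assumed:

```bash
cd vllm_inference/inference_scripts
bash run_test.accept.input.exp.new.format.sh   # run the final WA model on the test set
```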
To be uploaded soon.
If you found our paper or code useful, please cite as:
```bibtex
@inproceedings{cao-etal-2021-grammatical-error,
    title = "Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment",
    author = "Cao, Hannan and
      Ye, Hai and
      Ng, Hwee Tou",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    year = "2025",
}
```