Refusal Leaderboard Revamp #3636

Open
wants to merge 23 commits into
base: main

@derixu derixu commented Nov 30, 2024

@CodingWithTim

Why are these changes needed?

Related issue number (if applicable)

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

derixu (Author) commented Nov 30, 2024

Current changes:

  • Moved chat_completion_openai to utils.py
  • Added api_config function to utils.py, which sets API_MAX_RETRY, API_RETRY_SLEEP, API_ERROR_OUTPUT based on config file before running any labeler (needed because of moving chat_completion_openai to utils.py)
  • Created a HuggingFaceRefusalClassifier class to label refusals; using a class lets us create one labeling model for an entire run.
  • Added a CategoryRefusalFineTuned class to category.py, which formats conversations into lists of query/resp prompts and has an attribute that holds the labeling model
  • Added a conditional block to get_answer that specifically handles the refusal classifier
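
For readers skimming the thread, here is a minimal sketch of what an api_config-style setup could look like. It is not the code in this PR: the key names (max_retry, retry_sleep, error_output) mirror the config file posted later in this conversation, and call_with_retry is a hypothetical helper added only to show where the globals would be read.

```python
import time

# Module-level settings, overwritten by api_config() before any labeler runs.
API_MAX_RETRY = 1
API_RETRY_SLEEP = 0
API_ERROR_OUTPUT = "$ERROR$"


def api_config(config_dict):
    """Copy retry settings from an already-parsed config dict into globals.

    The key names are an assumption based on the YAML config in this thread.
    """
    global API_MAX_RETRY, API_RETRY_SLEEP, API_ERROR_OUTPUT
    API_MAX_RETRY = config_dict["max_retry"]
    API_RETRY_SLEEP = config_dict["retry_sleep"]
    API_ERROR_OUTPUT = config_dict["error_output"]


def call_with_retry(request_fn):
    """Hypothetical helper: try request_fn up to API_MAX_RETRY times.

    Returns API_ERROR_OUTPUT if every attempt raises.
    """
    for attempt in range(API_MAX_RETRY):
        try:
            return request_fn()
        except Exception as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
            # Sleep only if another attempt is coming.
            if attempt + 1 < API_MAX_RETRY:
                time.sleep(API_RETRY_SLEEP)
    return API_ERROR_OUTPUT
```

Because the settings are globals, api_config has to be called once before the first request; any function in the same module that reads them afterwards picks up the configured values.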

Overall the implementation is a bit ugly because Hugging Face inference doesn't mesh well with the current labeling process, but I verified that it works: 100 battles are labeled in 35 sec on CPU. Because it doesn't take full advantage of Hugging Face multiprocessing, it is 2x slower than the previous labeling on the same CPU :\
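
One possible way to amortize model calls is to flatten every query/resp pair across all battles and score them in batches with a single model instance. The sketch below is an illustration under stated assumptions, not the PR's implementation: predict_batch is a stand-in for the real classifier, the prompt format is made up, and treating a conversation as a refusal when any exchange is flagged is an assumption.

```python
from typing import Callable, Dict, List, Sequence


def to_pairs(conversation: Sequence[Dict[str, str]]) -> List[str]:
    """Turn alternating user/assistant turns into one prompt per exchange.

    The prompt layout here is hypothetical.
    """
    pairs = []
    for i in range(0, len(conversation) - 1, 2):
        query = conversation[i]["content"]
        resp = conversation[i + 1]["content"]
        pairs.append(f"[QUERY]: {query}\n[RESPONSE]: {resp}")
    return pairs


class BatchedRefusalLabeler:
    """Holds one scoring function for the whole run and labels in batches."""

    def __init__(self, predict_batch: Callable[[List[str]], List[bool]],
                 batch_size: int = 32):
        # predict_batch stands in for the real model; it maps a list of
        # prompts to a list of refusal flags of the same length.
        self.predict_batch = predict_batch
        self.batch_size = batch_size

    def label(self, conversations: Sequence[Sequence[Dict[str, str]]]) -> List[bool]:
        # Flatten all exchanges and remember which conversation owns each one.
        prompts, owners = [], []
        for idx, conv in enumerate(conversations):
            for prompt in to_pairs(conv):
                prompts.append(prompt)
                owners.append(idx)

        labels = [False] * len(conversations)
        for start in range(0, len(prompts), self.batch_size):
            end = start + self.batch_size
            preds = self.predict_batch(prompts[start:end])
            for owner, pred in zip(owners[start:end], preds):
                # Assumption: any flagged exchange marks the conversation.
                labels[owner] = labels[owner] or bool(pred)
        return labels
```

Passing the scoring function in as a parameter keeps the model load outside the labeling loop, which is the same goal as creating one labeling model per run.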

@derixu derixu marked this pull request as ready for review December 1, 2024 04:22
@CodingWithTim (Collaborator)

@derixu Could you fix the formatting check error?

fastchat/serve/monitor/classify/category.py (outdated, resolved)
fastchat/serve/monitor/classify/label.py (outdated, resolved)
fastchat/serve/monitor/classify/category.py (outdated, resolved)
@CodingWithTim (Collaborator)

Overall, great PR! Very clean code. A few fixes and I will poke around more here and there and we should be good to go!

@CodingWithTim CodingWithTim self-assigned this Dec 26, 2024
@CodingWithTim (Collaborator)

@derixu Thanks for the new changes! I found a very weird bug. It shows up when I:

  1. Begin running the labeler.
  2. Stop (Ctrl+C) the labeling process halfway through the refusal labeling.
  3. Resume labeling and let the refusal labeling finish, at which point it begins labeling the other categories.
  4. Stop (Ctrl+C) halfway through that labeling process.
  5. Then finish running and merging.

The final output JSON file is missing category tags in some rows (for example, "creative_writing" is missing).

I am using the first 200 rows of the battles_yifan_2000_sample.json battles, which I gave you before. I am not sure what is going on...
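
A quick way to spot the affected rows could be a check like the one below. It is only a sketch: it assumes each output row stores its labels in a dict under a "category_tag" field keyed by task name, which may not match the actual output schema, so tag_field is a parameter.

```python
from typing import Any, Dict, Iterable, List


def find_missing_tags(rows: Iterable[Dict[str, Any]],
                      expected_tasks: List[str],
                      tag_field: str = "category_tag") -> Dict[int, List[str]]:
    """Return {row index: [missing task names]} for rows lacking any tag.

    tag_field is an assumed field name; adjust it to the real output schema.
    """
    missing = {}
    for idx, row in enumerate(rows):
        tags = row.get(tag_field) or {}
        absent = [task for task in expected_tasks if task not in tags]
        if absent:
            missing[idx] = absent
    return missing
```

Running it on the merged output with the task_name list from the config would show whether the gaps cluster around the rows labeled just before each interruption.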

Here is my config file:

# Yaml config file for category classification

input_file: battles_yifan_200_sample.json # json
cache_file: null # json
output_file: output.jsonl # json line

convert_to_json: True

task_name:
  - criteria_v0.1
  - if_v0.1
  - math_v0.1
  - creative_writing_v0.1
  - refusal_v0.2

model_name: gpt-4o-mini
name: gpt-4o-mini
endpoints: null
parallel: 64
temperature: 0.0
max_token: 512

max_retry: 2
retry_sleep: 10
error_output: $ERROR$

Btw, I added some small fixes:

  1. Switched tqdm.tqdm to tqdm.
  2. Added a few print statements to explain what each tqdm progress bar is tracking.

@CodingWithTim (Collaborator)

I ran a couple of tests; the classifier code should be good to go now.

@CodingWithTim (Collaborator)

@infwinston This PR is good to go! Thanks for the great work, @derixu!!
