[feat] Make responder config flexible
hschuff committed Jun 26, 2024
1 parent b5c29bb commit b2c7c16
Showing 3 changed files with 24 additions and 24 deletions.
README.md: 4 changes (2 additions & 2 deletions)

@@ -63,10 +63,10 @@ aim up
Let's get started with generating dialogues using the `roleplay` action. The process is simple: just provide the name of the configuration file containing the action, followed by the action name itself. For the `roleplay` action, we'll initiate it by using the Mixtral 8x7B model as the inquirer. 🎇

```bash
-roleplay action_config=roleplay +action_config/task/model_inquirer=mixtral action_config.task.model_inquirer.api_token="YOUR_TOKEN"
+roleplay action_config=roleplay +action_config/task/model_inquirer=mixtral +action_config/task/model_responder=llama action_config.task.model_inquirer.api_token="YOUR_TOKEN"
```

-The `action_config` parameter specifies which configuration file to use to run the action. After that, we specify the configuration file for the inquirer with the `model_inquirer` argument.
+The `action_config` parameter specifies which configuration file to use to run the action. Afterward, we define the configuration file for the inquirer using the `model_inquirer` argument and set the configuration for the responder with the `model_responder` argument.
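
Beyond selecting a responder config by name, individual responder fields can presumably be overridden with the same dotted Hydra syntax used for the inquirer token. The sketch below is hypothetical and assumes the `generate.max_new_tokens` key defined in the new `llama.yaml`:

```bash
# Hypothetical: override a single responder field via Hydra's dotted
# override syntax, mirroring the inquirer api_token override above.
roleplay action_config=roleplay \
  +action_config/task/model_inquirer=mixtral \
  +action_config/task/model_responder=llama \
  action_config.task.model_responder.generate.max_new_tokens=2000 \
  action_config.task.model_inquirer.api_token="YOUR_TOKEN"
```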

To execute the command on a Slurm cluster, configure the `roleplay/config/main.yaml` file with the corresponding fields, and then use the same command to run the job. For more details on how to edit the configuration files, please refer to the upcoming sections.

configs/action_config/roleplay.yaml: 24 changes (2 additions & 22 deletions)

@@ -9,28 +9,8 @@ action_config:
num_turns: 12

model_inquirer: ???
+model_responder: ???

-model_responder:
-  type:
-    _target_: roleplay.models.pipeline_model.PipelineModel
-  name: "models--llama-2-hf/13B-Chat"
-  cache_dir: ""
-  dtype: torch.float16
-  non_coherent_max_n: 5
-  non_coherent_r: 2
-  api_token: null
-  generate:
-    max_new_tokens: 4000
-  conv_template:
-    first_turn_input: "[INST] <<SYS>>
-      You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
-      If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
-      <</SYS>>
-      <OBJECTIVE> [/INST]"
-    n_th_turn_input: "[INST] <USER_MSG> [/INST]"
-    model_output: "<MODEL_ANSWER>"
dataset:
  instruction:
    - "You want to know how fast you run different distances. You use a stopwatch to measure the time it takes you to complete a 50-meter, 100-meter, and 200-meter race. You want to know how you can calculate your speed for each race. Based on that, you also want to calculate how many calories you burned during each race."
@@ -218,4 +198,4 @@ action_config:
hydra:
  sweeper:
    params:
-      seed: 73,321,8479
+      seed: 73,321,8479
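
Both models are now set to `???`, OmegaConf's marker for a mandatory value, so a run fails fast unless an inquirer and a responder config are supplied. The `hydra.sweeper.params` block defines a seed sweep; a hypothetical launch, assuming `roleplay` forwards Hydra's standard `--multirun` flag:

```bash
# Hypothetical: --multirun asks Hydra's sweeper to expand the configured
# seed list (73, 321, 8479) into one dialogue-generation run per value.
roleplay --multirun action_config=roleplay \
  +action_config/task/model_inquirer=mixtral \
  +action_config/task/model_responder=llama \
  action_config.task.model_inquirer.api_token="YOUR_TOKEN"
```
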
configs/action_config/task/model_responder/llama.yaml: 20 changes (20 additions & 0 deletions)
@@ -0,0 +1,20 @@
+type:
+  _target_: roleplay.models.pipeline_model.PipelineModel
+name: "models--llama-2-hf/13B-Chat"
+cache_dir: ""
+dtype: torch.float16
+non_coherent_max_n: 5
+non_coherent_r: 2
+api_token: null
+generate:
+  max_new_tokens: 4000
+conv_template:
+  first_turn_input: "[INST] <<SYS>>
+    You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
+    If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
+    <</SYS>>
+    <OBJECTIVE> [/INST]"
+  n_th_turn_input: "[INST] <USER_MSG> [/INST]"
+  model_output: "<MODEL_ANSWER>"

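With the responder resolved from the `model_responder` config group, new backends can be added without touching `roleplay.yaml`. A hypothetical sketch (the `my_model.yaml` file and its edits are illustrative, not part of this commit):

```bash
# Hypothetical: register an alternative responder by adding a config file
# to the model_responder group, then select it by file name as with llama.
cp configs/action_config/task/model_responder/llama.yaml \
   configs/action_config/task/model_responder/my_model.yaml
# ...edit name, dtype, or conv_template in my_model.yaml as needed...
roleplay action_config=roleplay \
  +action_config/task/model_inquirer=mixtral \
  +action_config/task/model_responder=my_model \
  action_config.task.model_inquirer.api_token="YOUR_TOKEN"
```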