We may want to mask intermediate dialogue turns, e.g. if they arise due to the model making a mistake. I propose supporting this by adding a `train` field to the messages in a dataset instance. An instance can now look like:
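A minimal sketch of such an instance (the exact schema here is my assumption, not a final format):

```json
{
  "dataset": "example_dataset",
  "messages": [
    {"role": "user", "content": "What is 2 + 2?", "train": false},
    {"role": "assistant", "content": "2 + 2 = 5.", "train": false},
    {"role": "user", "content": "That's wrong, try again.", "train": false},
    {"role": "assistant", "content": "Apologies, 2 + 2 = 4.", "train": true}
  ]
}
```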
When `train` is set, it overrides our original logic for deciding which turns to train on. When it is not present, we fall back to our old logic (train on all `assistant` turns only). We can then additionally add a basic flag to the dataset mixer for cases where we want some higher-level behaviour applied automatically (e.g., `train_on_final_turn_only`).
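To make the precedence concrete, here is a rough sketch of the mask computation (function and flag names are assumptions, not an existing API):

```python
def get_trainable_turn_mask(messages, train_on_final_turn_only=False):
    """Return a list of booleans, one per message, marking turns to train on.

    Per-message `train` fields, when present, take precedence; otherwise we
    fall back to the default of training on all assistant turns (or only the
    final assistant turn if the mixer flag is set).
    """
    last_assistant_idx = max(
        (i for i, m in enumerate(messages) if m["role"] == "assistant"),
        default=None,
    )
    mask = []
    for i, msg in enumerate(messages):
        if "train" in msg:
            # Explicit per-turn override from the dataset.
            mask.append(bool(msg["train"]))
        elif train_on_final_turn_only:
            mask.append(i == last_assistant_idx)
        else:
            # Old default: train on every assistant turn.
            mask.append(msg["role"] == "assistant")
    return mask
```

Turns masked out this way would then have their labels set to the ignore index (e.g., -100) so they contribute nothing to the loss.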
This gives us the flexibility to train on arbitrary turn combinations: users who want to do something fancy (for example, using another model to judge whether turns are worth training on) can preprocess the dataset however they want, as sketched below.
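For that fancy case, a preprocessing pass along these lines could populate the `train` flags; `judge_turn` is a hypothetical callable standing in for whatever model-based check the user runs:

```python
def mark_trainable_turns(instance, judge_turn):
    """Set per-turn `train` flags using an external judge.

    `judge_turn` is a hypothetical callable returning True when an assistant
    turn is worth training on (e.g., an LLM-based quality check).
    """
    for msg in instance["messages"]:
        if msg["role"] == "assistant":
            msg["train"] = judge_turn(instance["messages"], msg)
        else:
            msg["train"] = False
    return instance
```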
Let me know if this makes sense!
Nothing here screams that we should go one way or the other:

- The QLoRA paper (Table 10) suggests training on the prompt hurts marginally.
- A more recent paper suggests training on the prompt (but, importantly, not on the special prompt tokens), though it depends on the length of the inputs. They specifically look at the Tulu 2 dataset and find that when training on the full dataset, also training on inputs isn't that helpful, but for subsets it is.