Currently, carbon-bot keeps all of its NLU data in one big file and has no dedicated evaluation or validation set. We have annotated 1200 additional NLU examples, collected from Facebook users and internal Rasa testers. With this additional data we can now create dedicated train/dev/eval sets. To keep the sets representative, it probably makes sense to first merge all existing data and then sample the individual sets from the merged pool. That avoids distribution shifts caused by the data having been collected at different points in time.
Definition of Done:
Get hold of the 1200 new annotated training examples (ask @tttthomasssss).
Merge the existing nlu.yml file with the new data and create train/dev/eval splits (70/10/20). The resulting training set should be roughly the same size as the current NLU file.
Create a PR with the change.
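The merge-then-sample step above could be sketched roughly as follows. This is only an illustration, not the agreed implementation: it assumes the examples have already been loaded from nlu.yml and the new annotations as `(text, intent)` pairs (loading is not shown), the `split_examples` helper and the toy data are hypothetical, and stratifying by intent is an extra assumption that the issue does not ask for.

```python
import random
from collections import defaultdict


def split_examples(examples, fractions=(0.7, 0.1, 0.2), seed=42):
    """Split (text, intent) pairs into train/dev/eval, stratified by intent.

    Hypothetical helper: the per-intent stratification is an assumption,
    chosen so that each intent keeps roughly the same share in every set.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group the examples by intent label.
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))

    train, dev, eval_set = [], [], []
    for intent in sorted(by_intent):  # sorted for a deterministic order
        items = by_intent[intent]
        rng.shuffle(items)
        n_train = round(len(items) * fractions[0])
        n_dev = round(len(items) * fractions[1])
        train.extend(items[:n_train])
        dev.extend(items[n_train:n_train + n_dev])
        eval_set.extend(items[n_train + n_dev:])  # remainder goes to eval
    return train, dev, eval_set


# Toy stand-ins for the existing and the newly annotated data (made up).
existing = [(f"existing utterance {i}", "greet" if i % 2 else "goodbye")
            for i in range(100)]
new = [(f"new utterance {i}", "greet" if i % 2 else "goodbye")
       for i in range(100)]

# Merge first, dropping exact duplicates while preserving order,
# then sample all three sets from the merged pool.
merged = list(dict.fromkeys(existing + new))
train, dev, eval_set = split_examples(merged)
```

Sampling all three sets from the merged pool, rather than using the new data as the eval set, is what keeps examples from different collection periods spread across train, dev and eval.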