data partition #3

shengoy · 2022-04-27T13:16:29Z

Hi!
How should I split the data into train/dev/test sets?

Hellisotherpeople · 2022-10-23T14:50:33Z

Wow, I am sorry for not seeing this earlier!

On the off chance that you are still interested in this problem, I'd say that the most "fair" way to do it might be to randomly sample 10-20% from each of the year splits as held-out data for dev/test, and train on the rest.
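For anyone who wants to try the per-year sampling, here is a minimal sketch. It assumes the examples are already loaded as a list of dicts with a `year` field; that field name, the function name `split_by_group`, and the `holdout_frac` parameter are hypothetical, not something the dataset ships with.

```python
import random
from collections import defaultdict


def split_by_group(records, key, holdout_frac=0.15, seed=0):
    """Hold out roughly `holdout_frac` of the records from every group.

    `records` is a list of dicts; `key` names the (hypothetical) field
    to group on, e.g. "year".
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Bucket the records by the grouping field.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    train, heldout = [], []
    for group in sorted(groups):  # sorted for a deterministic order
        members = list(groups[group])
        rng.shuffle(members)
        n_held = round(len(members) * holdout_frac)
        heldout.extend(members[:n_held])
        train.extend(members[n_held:])
    return train, heldout
```

Calling `split_by_group(records, "year", holdout_frac=0.2)` returns the training records and the held-out records, with every year contributing about the same share to the held-out set. The held-out set can then be cut in half for dev and test.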

Possibly better, if you are someone from the competitive debate community, would be to randomly sample that same 10-20% but at the level of each individual file. This would prevent the random sampling from favoring certain files/annotators over others and would hopefully maximize the diversity of the samples.
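A rough sketch of the file-level variant, going straight to three splits. It assumes the data is held as a dict mapping each source file name to its list of examples; that layout, the function names, and the `dev_frac`/`test_frac` parameters are all hypothetical choices for illustration.

```python
import random


def split_file(examples, dev_frac, test_frac, rng):
    """Shuffle one file's examples and cut them into train/dev/test."""
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_dev = round(len(shuffled) * dev_frac)
    n_test = round(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


def split_corpus(files, dev_frac=0.1, test_frac=0.1, seed=0):
    """Sample dev/test from every file so no single file dominates.

    `files` maps a file name to the list of examples taken from it.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "dev": [], "test": []}
    for name in sorted(files):  # sorted for a deterministic order
        train, dev, test = split_file(files[name], dev_frac, test_frac, rng)
        splits["train"].extend(train)
        splits["dev"].extend(dev)
        splits["test"].extend(test)
    return splits
```

One caveat with this sketch: because the counts are rounded per file, very small files (fewer than about five examples at 10%) contribute nothing to dev or test, so the overall held-out share can come out slightly below the nominal fraction.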
