Data split question #47

Open
cquzys opened this issue Oct 9, 2023 · 1 comment

Comments


cquzys commented Oct 9, 2023

How were Zheng68K and Baron split into train and test sets in cellLM? How can I reproduce the Accuracy results reported in the paper?

Contributor

toycat-I commented Oct 9, 2023

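A random split is fine; you can set aside a validation set to help tune hyperparameters during training. Note that because Zheng68K and Baron are severely class-imbalanced, the fine-tuning settings (pooling method, the random data split, hyperparameters, and which model parameters are frozen) all jointly affect the results.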
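For anyone trying this, here is a minimal sketch of a seeded random train/validation/test split. It is not the code used for the paper. It assumes a per-class (stratified) variant of random splitting, chosen so that rare cell types appear in every subset. The function name `stratified_split`, the 20%/20% fractions, the seed, and the toy labels are all hypothetical placeholders; the thread does not state the actual ratios.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.1, test_frac=0.1, seed=0):
    """Randomly split sample indices into train/val/test, class by class.

    Hypothetical helper: the fractions and seed are assumptions,
    not values taken from the paper or the repository.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_val = round(len(idxs) * val_frac)
        n_test = round(len(idxs) * test_frac)
        val.extend(idxs[:n_val])
        test.extend(idxs[n_val:n_val + n_test])
        train.extend(idxs[n_val + n_test:])
    return train, val, test


# Toy, deliberately imbalanced labels (made up for illustration).
labels = ["alpha"] * 80 + ["beta"] * 15 + ["delta"] * 5
train_idx, val_idx, test_idx = stratified_split(
    labels, val_frac=0.2, test_frac=0.2, seed=42
)
```

Because the split itself affects the results, it may help to repeat the experiment over several seeds and report the mean and spread of Accuracy rather than a single run.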
