[Corpus] Text extraction from SIP3 comprehensive #79

Open
ZhishenYang opened this issue Nov 12, 2024 · 0 comments
Labels
pretrain Experiment of model pretrain

Comments


ZhishenYang commented Nov 12, 2024

Overview
Extract text from medical literature.

Details
SIP3 comprehensive, 2.8M PDFs (increasing?): llm-jp:/model/sip3/comprehensive/pdf/*zip
Extract text from these PDFs.

Please describe the experiment in detail within a few paragraphs.
Add any related links as appropriate.
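Since the input is a set of zip archives containing PDFs, the extraction loop might be sketched as below. This is only an illustration under assumptions, not the pipeline used for this experiment: `iter_pdf_members`, `extract_archive`, and the `extract_text` callable are hypothetical names, and the actual PDF-to-text backend is left to the caller because the issue does not specify one. Failures are recorded per file rather than raised, on the assumption that some of the 2.8M PDFs will be malformed.

```python
import zipfile
from typing import Callable, Iterator, Tuple


def iter_pdf_members(zip_path) -> Iterator[Tuple[str, bytes]]:
    """Yield (member_name, raw_bytes) for every .pdf entry in one zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and anything that is not a PDF by extension.
            if info.is_dir() or not info.filename.lower().endswith(".pdf"):
                continue
            yield info.filename, archive.read(info)


def extract_archive(zip_path, extract_text: Callable[[bytes], str]) -> Iterator[dict]:
    """Apply a caller-supplied extractor to each PDF in the archive.

    `extract_text` is a placeholder for whatever PDF-to-text backend is
    chosen; it takes the raw PDF bytes and returns the extracted text.
    Errors are stored in the record so one bad PDF does not stop the run.
    """
    for name, data in iter_pdf_members(zip_path):
        record = {"archive": str(zip_path), "member": name, "text": None, "error": None}
        try:
            record["text"] = extract_text(data)
        except Exception as exc:  # broad on purpose: backends raise varied errors
            record["error"] = repr(exc)
        yield record
```

Each archive is processed independently, so the archives matched by `*zip` could be sharded across the 8 nodes, with each worker writing its records to its own output file (for example as JSON Lines).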

Resources
Compute
Cluster: llm-jp-nvlink
Node type: gpu
Number of nodes: 8
Code
Repository: FIXME https://github.com/{org}/{repo}
Commit: FIXME xxxxxx
Evaluation data:
llm-jp-nvlink:/model/sip3/medical
Output data:
Save location:
W&B logs:
*
Start date: 2024/11/28
Planned end date: 2024/12/04

@ZhishenYang ZhishenYang added the pretrain Experiment of model pretrain label Nov 12, 2024