Merge pull request #42 from llm-jp/f/llm-jp-13b
add llm-jp-13b
kaisugi authored Oct 23, 2023
2 parents 1ead105 + 743d718 commit d541599
Showing 2 changed files with 2 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -36,6 +36,7 @@

| | Model | Training Text | Developer | License | Ready to use on HuggingFace? [^1] |
|:---|:---:|:---:|:---:|:---:|:---:|
| [LLM-jp-13B](https://www.nii.ac.jp/news/release/2023/1020.html) | GPT (1.3b-v1.0, **13b**-v1.0, **13b**-instruct-full-jaster-v1.0, **13b**-instruct-full-jaster-dolly-oasst-v1.0, **13b**-instruct-full-dolly-oasst-v1.0, **13b**-instruct-lora-jaster-v1.0, **13b**-instruct-lora-jaster-dolly-oasst-v1.0, **13b**-instruct-lora-dolly-oasst-v1.0) | Pre-training: [llm-jp-corpus](https://github.com/llm-jp/llm-jp-corpus) (Wikipedia, Japanese mC4, The Pile, Stack) (300B tokens total)<br>Instruction Tuning (SFT or LoRA): jaster, Dolly Dataset, OASST1 | LLM-jp | Apache 2.0 | ◯ ([1.3b-v1.0](https://huggingface.co/llm-jp/llm-jp-1.3b-v1.0), [13b-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-v1.0), [13b-instruct-full-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-v1.0), [13b-instruct-full-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0), [13b-instruct-full-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly-oasst-v1.0), [13b-instruct-lora-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-v1.0), [13b-instruct-lora-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0), [13b-instruct-lora-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-dolly-oasst-v1.0)) |
| [PLaMo-13B](https://www.preferred.jp/ja/news/pr20230928/) | Llama[^2] (**13b**) | C4, Project Gutenberg, RedPajama, Japanese Wikipedia, Japanese mC4<br>(1.5T tokens total) | Preferred Networks | Apache 2.0 | [◯](https://huggingface.co/pfnet/plamo-13b) |
| [Weblab-10B](https://www.t.u-tokyo.ac.jp/press/pr2023-08-18-001) | GPT (**10b**, **10b**-instruction-sft) | Japanese mC4 + The Pile (600B tokens total)<br>\*instruction-sft model fine-tuned on Alpaca Dataset, FLAN | University of Tokyo Matsuo Lab | CC BY-NC 4.0 | ◯ ([10b](https://huggingface.co/matsuo-lab/weblab-10b), [10b-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) |
| [Japanese StableLM Alpha](https://ja.stability.ai/blog/japanese-stablelm-alpha) | GPT (base-alpha-**7b**, instruct-alpha-**7b**, instruct-alpha-**7b**-v2) | Wikipedia, Japanese CC-100, Japanese mC4, Japanese OSCAR, RedPajama<br>(+ proprietary datasets)[^3]<br>(750B tokens total)<br>\*instruct models fine-tuned on Alpaca Dataset, Dolly Dataset, HH RLHF, and the wikinews subset of llm-japanese-dataset<br>(v2 excludes the non-commercially licensed Alpaca Dataset) | Stability AI | base model: Apache 2.0<br>instruct model (v1): [custom license](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main)<br>instruct model (v2): Apache 2.0 | ◯ ([base-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-7b-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) |
1 change: 1 addition & 0 deletions README_en.md
@@ -36,6 +36,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso

| | Architecture | Training Data | Developer | License | HuggingFace?[^1] |
|:---|:---:|:---:|:---:|:---:|:---:|
| [LLM-jp-13B](https://www.nii.ac.jp/en/news/release/2023/1020.html) | GPT (1.3b-v1.0, **13b**-v1.0, **13b**-instruct-full-jaster-v1.0, **13b**-instruct-full-jaster-dolly-oasst-v1.0, **13b**-instruct-full-dolly-oasst-v1.0, **13b**-instruct-lora-jaster-v1.0, **13b**-instruct-lora-jaster-dolly-oasst-v1.0, **13b**-instruct-lora-dolly-oasst-v1.0) | Pre-training: [llm-jp-corpus](https://github.com/llm-jp/llm-jp-corpus) (Wikipedia, Japanese mC4, The Pile, Stack) (**300B** tokens) | Instruction Tuning (SFT or LoRA): jaster, Dolly Dataset, OASST1 | LLM-jp | Apache 2.0 | ◯ ([1.3b-v1.0](https://huggingface.co/llm-jp/llm-jp-1.3b-v1.0), [13b-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-v1.0), [13b-instruct-full-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-v1.0), [13b-instruct-full-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0), [13b-instruct-full-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly-oasst-v1.0), [13b-instruct-lora-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-v1.0), [13b-instruct-lora-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0), [13b-instruct-lora-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-dolly-oasst-v1.0)) |
| [PLaMo-13B](https://www.preferred.jp/ja/news/pr20230928/) | Llama-1[^2] <br>(**13b**) | C4, Project Gutenberg, RedPajama, Japanese Wikipedia, Japanese mC4<br>(**1.5T** tokens) | Preferred Networks | Apache 2.0 | [◯](https://huggingface.co/pfnet/plamo-13b) |
| [Weblab-10B](https://www.t.u-tokyo.ac.jp/press/pr2023-08-18-001) | GPT-NeoX <br> (**10b**, **10b**&#x2011;instruction&#x2011;sft) | Japanese mC4, The Pile <br> (**600B** tokens) <br>SFT: Alpaca, FLAN | University of Tokyo Matsuo Lab | CC BY&#x2011;NC 4.0 | ◯ <br>([10b](https://huggingface.co/matsuo-lab/weblab-10b), [10b&#x2011;instruction&#x2011;sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) |
| [Japanese StableLM Alpha](https://ja.stability.ai/blog/japanese-stablelm-alpha) | GPT-NeoX <br> (base-alpha-**7b**, instruct-alpha-**7b**, instruct-alpha-**7b**-v2) | Wikipedia, Japanese CC&#x2011;100, Japanese mC4, Japanese OSCAR, RedPajama, private datasets[^3]<br>(**750B** tokens)<br>SFT: Dolly, HH&#x2011;RLHF, wikinews, Alpaca (discarded in v2) | Stability AI | base: Apache 2.0<br>instruct (v1): [Research license](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main)<br>instruct (v2): Apache 2.0 | ◯<br>([base-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-7b-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) |
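All of the HuggingFace links in the LLM-jp-13B row above follow one naming pattern: the variant name shown in the table is appended to the `llm-jp/llm-jp-` organization prefix. A minimal sketch of that mapping (the `repo_id` helper is hypothetical, purely illustrative; only the prefix and variant names come from the table):

```python
# Build full HuggingFace repo IDs for the LLM-jp-13B variants listed in the
# table. The "llm-jp/llm-jp-" prefix is taken from the table's links; the
# repo_id helper itself is a hypothetical convenience, not part of any release.
LLM_JP_PREFIX = "llm-jp/llm-jp-"

VARIANTS = [
    "1.3b-v1.0",
    "13b-v1.0",
    "13b-instruct-full-jaster-v1.0",
    "13b-instruct-lora-dolly-oasst-v1.0",
]

def repo_id(variant: str) -> str:
    """Return the HuggingFace repo ID for a variant name from the table."""
    return LLM_JP_PREFIX + variant

for v in VARIANTS:
    print(repo_id(v))

# Loading then follows the standard transformers pattern, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(repo_id("13b-v1.0"))
# (The 13b checkpoints are large; expect a multi-GB download.)
```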
