diff --git a/README.md b/README.md
index 3febecd..4e5f37d 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,7 @@
 | [PLaMo-13B](https://www.preferred.jp/ja/news/pr20230928/) | Llama[^1]<br>([**13b**](https://huggingface.co/pfnet/plamo-13b)) | C4, Project Gutenberg, RedPajama, 日本語 Wikipedia, Japanese mC4<br>(計 1.5T トークン) | Preferred Networks | Apache 2.0 |
 | [Weblab-10B](https://www.t.u-tokyo.ac.jp/press/pr2023-08-18-001) | GPT<br>([**10b**](https://huggingface.co/matsuo-lab/weblab-10b), [**10b**-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) | Japanese mC4 + The Pile(計 600B トークン)<br>\*instruction-sft モデルは Alpaca Dataset, FLAN でファインチューニング | 東大 松尾研 | CC BY-NC 4.0 |
 | [Japanese StableLM Alpha](https://ja.stability.ai/blog/japanese-stablelm-alpha) | GPT<br>([base-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-**7b**-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) | Wikipedia, Japanese CC-100, Japanese mC4, Japanese OSCAR, RedPajama<br>(+ 独自のデータセット)[^2]<br>(計 750B トークン)<br>\*instruct モデルでは Alpaca Dataset, Dolly Dataset, HH RLHF, llm-japanese-datasetのwikinews subsetでファインチューニング<br>(v2では商用利用不可の Alpaca Dataset を除外) | Stability AI | baseモデル: Apache 2.0<br>instruct モデル (v1): [独自のライセンス](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main)<br>instruct モデル (v2): Apache 2.0 |
+| [CALM2](https://www.cyberagent.co.jp/news/detail/id=29479) | Llama<br>([**7b**](https://huggingface.co/cyberagent/calm2-7b), [**7b**-chat](https://huggingface.co/cyberagent/calm2-7b-chat)) | 一般公開されている日本語・英語のデータセット(詳細不明) (計 1.3T トークン) | サイバーエージェント | Apache 2.0 |
 | [OpenCALM](https://www.cyberagent.co.jp/news/detail/id=28817) | GPT<br>([small](https://huggingface.co/cyberagent/open-calm-small), [medium](https://huggingface.co/cyberagent/open-calm-medium), [large](https://huggingface.co/cyberagent/open-calm-large), [**1b(1.4b)**](https://huggingface.co/cyberagent/open-calm-1b), [**3b(2.7b)**](https://huggingface.co/cyberagent/open-calm-3b), [**7b(6.8b)**](https://huggingface.co/cyberagent/open-calm-7b)) | 日本語 Wikipedia<br>+ Japanese mC4<br>+ Japanese CC-100 | サイバーエージェント | CC BY-SA 4.0 |
 | [Stormy](https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/422/1350) | GPT<br>([**7b(6.8b)**](https://huggingface.co/izumi-lab/stormy-7b-10ep)) | OpenCALM (6.8b) に対して<br>llm-japanese-dataset v0 のうち翻訳タスクを除いたデータで LoRAチューニング | 東大 和泉・坂地研 | CC BY-SA 4.0 |
 | [rinna GPT<br>(英語やコードも含めて学習されたモデル)](https://rinna.co.jp/news/2023/07/20230731.html) | GPT<br>([**4b(3.8b)**](https://huggingface.co/rinna/bilingual-gpt-neox-4b), [**4b(3.8b)**-8k](https://huggingface.co/rinna/bilingual-gpt-neox-4b-8k), [**4b(3.8b)**-instruction-sft](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-sft), [**4b(3.8b)**-instruction-ppo](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-ppo)) | Wikipedia, Japanese CC-100, Japanese C4, RedPajama, The Pile<br>(計 524B トークン)<br>\*8k モデルでは 4,000トークンを超える長いトークン列でファインチューニング<br>\*instruction-sft モデルでは HH RLHF、FLAN でファインチューニング<br>\*instruction-ppo モデルでは HH RLHF で PPO ベースの強化学習 | rinna | MIT |
diff --git a/README_en.md b/README_en.md
index 1415b55..5cbba74 100644
--- a/README_en.md
+++ b/README_en.md
@@ -52,6 +52,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
 | [PLaMo-13B](https://www.preferred.jp/en/news/pr20230928/) | Llama[^1]<br>([**13b**](https://huggingface.co/pfnet/plamo-13b)) | C4, Project Gutenberg, RedPajama, Japanese Wikipedia, Japanese mC4<br>(**1.5T** tokens) | Preferred Networks | Apache 2.0 |
 | [Weblab-10B](https://www.t.u-tokyo.ac.jp/press/pr2023-08-18-001) | GPT-NeoX<br>([**10b**](https://huggingface.co/matsuo-lab/weblab-10b), [**10b**-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) | Japanese mC4, The Pile<br>(**600B** tokens)<br>SFT: Alpaca, FLAN | University of Tokyo Matsuo Lab | CC BY-NC 4.0 |
 | [Japanese StableLM Alpha](https://stability.ai/blog/stability-ai-new-jplm-japanese-language-model-stablelm) | GPT-NeoX<br>([base-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-**7b**-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) | Wikipedia, Japanese CC-100, Japanese mC4, Japanese OSCAR, RedPajama, private datasets[^2]<br>(**750B** tokens)<br>SFT: Dolly, HH-RLHF, llm-japanese-dataset (wikinews subset), Alpaca (excluded in v2) | Stability AI | base: Apache 2.0<br>instruct (v1): [Research license](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main)<br>instruct (v2): Apache 2.0 |
+| [CALM2](https://www.cyberagent.co.jp/news/detail/id=29479) | Llama<br>([**7b**](https://huggingface.co/cyberagent/calm2-7b), [**7b**-chat](https://huggingface.co/cyberagent/calm2-7b-chat)) | publicly available Japanese and English datasets (details unknown)<br>(**1.3T** tokens) | CyberAgent | Apache 2.0 |
 | [OpenCALM](https://www.cyberagent.co.jp/news/detail/id=28817) | GPT-NeoX<br>([small](https://huggingface.co/cyberagent/open-calm-small), [medium](https://huggingface.co/cyberagent/open-calm-medium), [large](https://huggingface.co/cyberagent/open-calm-large), [**1b(1.4b)**](https://huggingface.co/cyberagent/open-calm-1b), [**3b(2.7b)**](https://huggingface.co/cyberagent/open-calm-3b), [**7b(6.8b)**](https://huggingface.co/cyberagent/open-calm-7b)) | Japanese Wikipedia, Japanese mC4, Japanese CC-100 | CyberAgent | CC BY-SA 4.0 |
 | [Stormy](https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/422/1350) | GPT-NeoX<br>([**7b(6.8b)**](https://huggingface.co/izumi-lab/stormy-7b-10ep)) | OpenCALM (6.8b) fine-tuned with LoRA on<br>llm-japanese-dataset v0, excluding translation tasks | University of Tokyo Izumi-Sakaji Lab | CC BY-SA 4.0 |
 | [rinna GPT<br>(En-Ja Bilingual)](https://rinna.co.jp/news/2023/07/20230731.html) | GPT-NeoX<br>([**4b(3.8b)**](https://huggingface.co/rinna/bilingual-gpt-neox-4b), [**4b(3.8b)**-8k](https://huggingface.co/rinna/bilingual-gpt-neox-4b-8k), [**4b(3.8b)**-instruction-sft](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-sft), [**4b(3.8b)**-instruction-ppo](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-ppo)) | Wikipedia, Japanese CC-100, Japanese C4, RedPajama, The Pile<br>(**524B** tokens)<br>SFT: HH-RLHF, FLAN<br>PPO: PPO-based reinforcement learning on HH-RLHF<br>8k: fine-tuned on sequences longer than 4,000 tokens | rinna | MIT |
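
All of the checkpoints linked above are hosted on the Hugging Face Hub, so they can be loaded with the standard `transformers` API. Below is a minimal sketch using the newly added `cyberagent/calm2-7b-chat` checkpoint; the `USER:`/`ASSISTANT:` prompt layout and the sampling settings are illustrative assumptions rather than values taken from the model card.

```python
# Minimal sketch (not part of the original README): load one of the checkpoints
# listed in the table with Hugging Face transformers. Requires torch,
# transformers, and accelerate (for device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cyberagent/calm2-7b-chat"  # Apache 2.0 chat model from the CALM2 row

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 so a ~7B model fits on a single modern GPU
    device_map="auto",
)

# Assumed USER/ASSISTANT-style prompt ("please introduce yourself in Japanese");
# adjust to whatever format the model card actually specifies.
prompt = "USER: 日本語で自己紹介してください。\nASSISTANT: "

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
)
# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The other instruction-tuned checkpoints in the table (for example `rinna/bilingual-gpt-neox-4b-instruction-sft`) load the same way, but each expects its own prompt format, so consult the respective model card before use.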