From dd93296a53350b34ac938dab68fcb17b79964570 Mon Sep 17 00:00:00 2001
From: Kaito Sugimoto
Date: Fri, 27 Oct 2023 22:51:00 +0900
Subject: [PATCH] Replace links with their English versions, and other fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md    | 2 +-
 README_en.md | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index d0a4a5e..b23c0fc 100644
--- a/README.md
+++ b/README.md
@@ -108,7 +108,7 @@
 | [東大BERT](https://sites.google.com/socsim.org/izumi-lab/tools/language-model) | BERT (small) | 日本語 Wikipedia (約2,000万文 (2.9GB)) | 東大 和泉・坂地研 | CC BY-SA 4.0 | [◯](https://huggingface.co/izumi-lab/bert-small-japanese) |
 | [chiTra (Sudachi Transformers)](https://www.worksap.co.jp/news/2022/0225/) | BERT (base) | 国語研日本語ウェブコーパス (NWJC) (148GB) | NINJAL & ワークス徳島人工知能NLP研 | Apache 2.0 | △ |
 | [ACCMS BERT](https://huggingface.co/ku-accms/bert-base-japanese-ssuw) | BERT (base) | 日本語 Wikipedia (3.3GB) | 京大 ACCMS | CC BY-SA 4.0 | [◯](https://huggingface.co/ku-accms/bert-base-japanese-ssuw) |
-| [日立BERT](https://arxiv.org/pdf/2306.09572.pdf) | BERT (base) | 日本語 Wikipedia<br>+ Japanese CC-100 | 日立製作所 | CC BY-NC-SA 4.0 | [◯](https://huggingface.co/hitachi-nlp/bert-base-japanese_jumanpp-bpe) [^6] |
+| [日立BERT](https://aclanthology.org/2023.acl-srw.5.pdf) | BERT (base) | 日本語 Wikipedia<br>+ Japanese CC-100 | 日立製作所 | CC BY-NC-SA 4.0 | [◯](https://huggingface.co/hitachi-nlp/bert-base-japanese_jumanpp-bpe) [^6] |
 | [Bandai Namco DistilBERT](https://github.com/BandaiNamcoResearchInc/DistilBERT-base-jp/blob/main/docs/GUIDE.md) | DistilBERT | - (東北大BERT(base) を親モデルとして知識蒸留) | Bandai Namco Research | MIT | [◯](https://huggingface.co/bandainamco-mirai/distilbert-base-japanese) |
 | [LINE DistilBERT](https://engineering.linecorp.com/ja/blog/line-distilbert-high-performance-fast-lightweight-japanese-language-model) | DistilBERT | - (LINE社内のBERTを親モデルとして知識蒸留)| LINE | Apache 2.0 | [◯](https://huggingface.co/line-corporation/line-distilbert-base-japanese) |
 | [rinna RoBERTa](https://rinna.co.jp/news/2021/08/20210825.html) | RoBERTa (base) | 日本語 Wikipedia<br>+ Japanese CC-100 | rinna | MIT | [◯](https://huggingface.co/rinna/japanese-roberta-base) |
diff --git a/README_en.md b/README_en.md
index 741101f..5d80f6d 100644
--- a/README_en.md
+++ b/README_en.md
@@ -44,7 +44,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
 | | Architecture | Training Data | Developer | License | HuggingFace?[^1] |
 |:---|:---:|:---:|:---:|:---:|:---:|
 | [LLM-jp-13B](https://www.nii.ac.jp/en/news/release/2023/1020.html) | GPT (1.3b-v1.0, **13b**-v1.0, **13b**-instruct-full-jaster-v1.0, **13b**-instruct-full-jaster-dolly-oasst-v1.0, **13b**-instruct-full-dolly-oasst-v1.0, **13b**-instruct-lora-jaster-v1.0, **13b**-instruct-lora-jaster-dolly-oasst-v1.0, **13b**-instruct-lora-dolly-oasst-v1.0) | Pre-training: [llm-jp-corpus](https://github.com/llm-jp/llm-jp-corpus) (Wikipedia, Japanese mC4, The Pile, Stack) (**300B** tokens)<br>Instruction Tuning (SFT or LoRA): jaster, Dolly Dataset, OASST1 | LLM-jp | Apache 2.0 | ([1.3b-v1.0](https://huggingface.co/llm-jp/llm-jp-1.3b-v1.0), [13b-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-v1.0), [13b-instruct-full-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-v1.0), [13b-instruct-full-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0), [13b-instruct-full-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly-oasst-v1.0), [13b-instruct-lora-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-v1.0), [13b-instruct-lora-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0), [13b-instruct-lora-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-dolly-oasst-v1.0)) |
-| [PLaMo-13B](https://www.preferred.jp/ja/news/pr20230928/) | Llama-1[^2]<br>(**13b**) | C4, Project Gutenberg, RedPajama, Japanese Wikipedia, Japanese mC4<br>(**1.5T** tokens) | Preferred Networks | Apache 2.0 | [◯](https://huggingface.co/pfnet/plamo-13b) |
+| [PLaMo-13B](https://www.preferred.jp/en/news/pr20230928/) | Llama-1[^2]<br>(**13b**) | C4, Project Gutenberg, RedPajama, Japanese Wikipedia, Japanese mC4<br>(**1.5T** tokens) | Preferred Networks | Apache 2.0 | [◯](https://huggingface.co/pfnet/plamo-13b) |
 | [Weblab-10B](https://www.t.u-tokyo.ac.jp/press/pr2023-08-18-001) | GPT-NeoX<br>(**10b**, **10b**-instruction-sft) | Japanese mC4, The Pile<br>(**600B** tokens)<br>SFT: Alpaca, FLAN | University of Tokyo Matsuo Lab | CC BY-NC 4.0 | ◯<br>([10b](https://huggingface.co/matsuo-lab/weblab-10b), [10b-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) |
 | [Japanese StableLM Alpha](https://stability.ai/blog/stability-ai-new-jplm-japanese-language-model-stablelm) | GPT-NeoX<br>(base-alpha-**7b**, instruct-alpha-**7b**, instruct-alpha-**7b**-v2) | Wikipedia, Japanese CC-100, Japanese mC4, Japanese OSCAR, RedPajama, private datasets[^3]<br>(**750B** tokens)<br>SFT: Dolly, HH-RLHF, wikinews, Alpaca (discarded in v2) | Stability AI | base: Apache 2.0<br>instruct (v1): [Research license](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main)<br>instruct (v2): Apache 2.0 | ◯<br>([base-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-7b](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-7b-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) |
 | [OpenCALM](https://www.cyberagent.co.jp/news/detail/id=28817) | GPT-NeoX<br>(small(160M), medium(400M), large(800M), **1b**, **3b**, **7b**) | Japanese Wikipedia, Japanese mC4, Japanese CC-100 | CyberAgent | CC BY-SA 4.0 | ◯<br>([small](https://huggingface.co/cyberagent/open-calm-small), [medium](https://huggingface.co/cyberagent/open-calm-medium), [large](https://huggingface.co/cyberagent/open-calm-large), [1b](https://huggingface.co/cyberagent/open-calm-1b), [3b](https://huggingface.co/cyberagent/open-calm-3b), [7b](https://huggingface.co/cyberagent/open-calm-7b)) |
@@ -107,12 +107,12 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
 | [UniversityOfTokyoBERT](https://sites.google.com/socsim.org/izumi-lab/tools/language-model) | BERT (small) | Japanese Wikipedia (2.9GB) | University of Tokyo Izumi-Sakaji Lab | CC BY-SA 4.0 | [◯](https://huggingface.co/izumi-lab/bert-small-japanese) |
 | [chiTra (Sudachi Transformers)](https://www.worksap.co.jp/news/2022/0225/) | BERT (base) | NINJAL Web Japanese Corpus (148GB) | NINJAL & WAP Tokushima Laboratory of AI and NLP | Apache 2.0 | △ |
 | [ACCMS BERT](https://huggingface.co/ku-accms/bert-base-japanese-ssuw) | BERT (base) | Japanese Wikipedia (3.3GB) | Kyoto University ACCMS | CC BY-SA 4.0 | [◯](https://huggingface.co/ku-accms/bert-base-japanese-ssuw) |
-| [HitachiBERT](https://arxiv.org/pdf/2306.09572.pdf) | BERT (base) | Japanese Wikipedia, Japanese CC-100 | Hitachi | CC BY-NC-SA 4.0 | [◯](https://huggingface.co/hitachi-nlp/bert-base-japanese_jumanpp-bpe)[^6] |
-| [Bandai Namco DistilBERT](https://github.com/BandaiNamcoResearchInc/DistilBERT-base-jp/blob/main/docs/GUIDE.md) | DistilBERT | (Distillation of TohokuUniversityBERT(base)) | Bandai Namco Research | MIT | [◯](https://huggingface.co/bandainamco-mirai/distilbert-base-japanese) |
+| [HitachiBERT](https://aclanthology.org/2023.acl-srw.5.pdf) | BERT (base) | Japanese Wikipedia, Japanese CC-100 | Hitachi | CC BY-NC-SA 4.0 | [◯](https://huggingface.co/hitachi-nlp/bert-base-japanese_jumanpp-bpe)[^6] |
+| [Bandai Namco DistilBERT](https://github.com/BandaiNamcoResearchInc/DistilBERT-base-jp) | DistilBERT | (Distillation of TohokuUniversityBERT(base)) | Bandai Namco Research | MIT | [◯](https://huggingface.co/bandainamco-mirai/distilbert-base-japanese) |
 | [LINE DistilBERT](https://engineering.linecorp.com/ja/blog/line-distilbert-high-performance-fast-lightweight-japanese-language-model) | DistilBERT | (Distillation of LINE internal BERT model)| LINE | Apache 2.0 | [◯](https://huggingface.co/line-corporation/line-distilbert-base-japanese) |
 | [rinna RoBERTa](https://rinna.co.jp/news/2021/08/20210825.html) | RoBERTa (base) | Japanese Wikipedia, Japanese CC-100 | rinna | MIT | [◯](https://huggingface.co/rinna/japanese-roberta-base) |
 | [WasedaRoBERTa](https://huggingface.co/nlp-waseda/roberta-base-japanese-with-auto-jumanpp) | RoBERTa (base, large) | Japanese Wikipedia, Japanese CC-100 | Waseda Kawahara Lab | CC BY-SA 4.0 | ◯<br>([base](https://huggingface.co/nlp-waseda/roberta-base-japanese-with-auto-jumanpp), [large](https://huggingface.co/nlp-waseda/roberta-large-japanese-with-auto-jumanpp), [large (seq512)](https://huggingface.co/nlp-waseda/roberta-large-japanese-seq512-with-auto-jumanpp))[^7] |
-| [InformatixRoBERTa](https://www.informatix.co.jp/pr-roberta/) | RoBERTa (base) | Japanese Wikipedia, Web Articles<br>(25GB) | Informatix | Apache 2.0 | △ |
+| [InformatixRoBERTa](https://www.informatix.co.jp/en/pr-roberta/) | RoBERTa (base) | Japanese Wikipedia, Web Articles<br>(25GB) | Informatix | Apache 2.0 | △ |
 | [KyotoUniversityRoBERTa](https://huggingface.co/ku-nlp/roberta-base-japanese-char-wwm) | RoBERTa (base, large) | Japanese Wikipedia, Japanese CC-100 | Kyoto University Language Media Processing Lab | CC BY-SA 4.0 | ◯<br>([base (char-level)](https://huggingface.co/ku-nlp/roberta-base-japanese-char-wwm), [large (char-level)](https://huggingface.co/ku-nlp/roberta-large-japanese-char-wwm)) |
 | [YokohamaNationalRoBERTa](https://huggingface.co/ganchengguang/RoBERTa-base-janpanese) | RoBERTa (base) | Japanese Wikipedia (3.45GB) | Yokohama National University Mori Lab | Apache 2.0 | [◯](https://huggingface.co/ganchengguang/RoBERTa-base-janpanese) |
 | [Megagon Labs RoBERTa](https://huggingface.co/megagonlabs/roberta-long-japanese) | RoBERTa (base)[^8] | Japanese mC4 (200M sentences) | Megagon Labs<br>(Recruit) | MIT | [◯](https://huggingface.co/megagonlabs/roberta-long-japanese) |
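
The ◯ entries in the tables above point to checkpoints published on Hugging Face. As a quick illustration, the sketch below loads one of them with the `transformers` Auto classes; the model ID (`rinna/japanese-roberta-base`) is taken from the rinna RoBERTa row, the `use_fast=False` option follows the usage notes on that model's card, and other listed checkpoints may require different tokenizer settings or extra dependencies.

```python
# Minimal sketch: load a Hugging Face checkpoint listed in the tables above.
# Assumes `transformers`, `torch`, and `sentencepiece` are installed; the
# model ID comes from the rinna RoBERTa row and is only an example.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "rinna/japanese-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Run a forward pass on a short Japanese sentence and inspect the logits.
inputs = tokenizer("日本語の言語モデルです。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```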