add Gemma 2 Baku 2B, Gemma-2-JPN
kaisugi committed Oct 4, 2024
1 parent 5d1d3e4 commit 0ffb8d5
Showing 9 changed files with 45 additions and 12 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -81,7 +81,7 @@
| [AcademicBART](https://github.com/EhimeNLP/AcademicBART) | 学術 | BART ([base](https://huggingface.co/EhimeNLP/AcademicBART)) | CiNii の日本語論文 | 愛媛大 人工知能研究室 | Apache 2.0 |

<a id="english-based-models"></a>
### 海外モデルに日本語で追加事前学習を行ったモデル(継続事前学習モデル)
### 海外モデルに日本語で継続事前学習を行ったモデル

<a id="generative-continual-general"></a>
#### 汎用
@@ -128,6 +128,7 @@
| [lightblue/japanese-mpt-7b](https://huggingface.co/lightblue/japanese-mpt-7b) | MPT (**7b**) | Japanese mC4 | Lightblue | Apache 2.0 |
| [Japanese Stable LM 3B-4E1T](https://ja.stability.ai/blog/japanese-stable-lm-3b-4e1tjapanese-stable-lm-gamma-7b)<br>([3b-4e1t-base](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-base), [3b-4e1t-instruct](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-instruct)) | StableLM-3B-4E1T (**3b**) | 事前学習: Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, SlimPajama(Books3を除外)<br>(計 **100B** トークン)<br>Instruction Tuning: Dolly Dataset, HH RLHF, llm-japanese-datasetのwikinews subset | Stability AI | Apache 2.0 |
| [kotomamba-2.8B-CL](https://huggingface.co/kotoba-tech/kotomamba-2.8B-CL-v1.0) | mamba-2.8b-slimpj<br>(**2.8b**) | 日本語 Wikipedia, Swallow Corpus, SlimPajama | Kotoba Technologies | Apache 2.0 |
| [Gemma 2 Baku 2B](https://rinna.co.jp/news/2024/10/20241003.html)<br>([2b](https://huggingface.co/rinna/gemma-2-baku-2b), [2b-it](https://huggingface.co/rinna/gemma-2-baku-2b-it)) | Gemma 2 (**2b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット<br>(計 **80B** トークン)<br>ORPO: 独自のデータセット [^20] | rinna | Gemma Terms of Use |
| [Japanese Stable LM 2 1.6B](https://ja.stability.ai/blog/japanese-stable-lm-2-16b)<br>([base](https://huggingface.co/stabilityai/japanese-stablelm-2-base-1_6b), [instruct](https://huggingface.co/stabilityai/japanese-stablelm-2-instruct-1_6b)) | Stable LM 2 1.6B (**1.6b**) | 事前学習: Wikipedia, CulturaX<br>Instruction Tuning: jaster, [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), alpaca-gpt4-japanese, ultra-orca-boros-en-ja-v1 | Stability AI | STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE |
| [karasu-1.1B](https://huggingface.co/lightblue/karasu-1.1B) | TinyLlama (**1.1b**) | 事前学習: Japanese OSCAR, Japanese mC4<br>(計 **3B** トークン) | Lightblue | Apache 2.0 |

@@ -145,7 +146,7 @@
| [NovelAI/genji-jp](https://huggingface.co/NovelAI/genji-jp) | 物語生成 | GPT-J (**6b**) | NovelAI ||

<a id="instruction-only-models"></a>
### 海外モデルに日本語で指示チューニング (Instruction Tuning) のみ行ったモデル
### 海外モデルに日本語で事後学習のみ行ったモデル

<a id="generative-instruction-only-general"></a>
#### 汎用
@@ -172,6 +173,7 @@
| [lightblue/jod](https://huggingface.co/lightblue/jod) | Mistral-7B-SlimOrca (**7b**) || Lightblue | Apache 2.0 |
| [NTQAI/chatntq-7b-jpntuned](https://huggingface.co/NTQAI/chatntq-7b-jpntuned) | RWKV-4 World (**7b**) || NTQ Solution ||
| [Borea](https://prtimes.jp/main/html/rd/p/000000008.000129878.html)<br>([Jp](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Jp), [Common](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Common), [Coding](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Coding)) | Phi-3.5 (**3.8b**) | | Axcxept | MIT |
| [日本語版 Gemma 2 2B](https://developers-jp.googleblog.com/2024/10/gemma-2-for-japan.html)<br>([2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it)) | Gemma 2 (**2b**) || Google | Gemma Terms of Use |
| [AXCXEPT/EZO-Common-T2-2B-gemma-2-it](https://huggingface.co/AXCXEPT/EZO-Common-T2-2B-gemma-2-it) | Gemma 2 (**2b**) || Axcxept | Gemma Terms of Use |

<a id="generative-instruction-only-domain-specific"></a>
@@ -525,4 +527,6 @@

[^18]: それぞれのモデルの詳細は作者らの[論文](https://www.jstage.jst.go.jp/article/jnlp/31/2/31_707/_pdf/-char/ja)の第4章を参照。なお、SC-2M-wiki モデルは Wikipedia でのみ事前学習されているため、厳密にはドメイン特化型モデルではない。

[^19]: 詳細は以下の記事を参照: [大規模言語モデルTanuki-8B, 8x8Bの位置づけや開発指針など](https://zenn.dev/matsuolab/articles/377f7ae8b1169e), [大規模言語モデルを開発するにあたっての事前・事後学習の戦略メモー特に合成データについてー](https://zenn.dev/matsuolab/articles/34036f017fae9e)

[^20]: ORPO を行う前に、Gemma 2 Instruct と Gemma 2 Base の差分の Chat Vector を加えている。
8 changes: 6 additions & 2 deletions en/README.md
@@ -127,6 +127,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
| [lightblue/japanese-mpt-7b](https://huggingface.co/lightblue/japanese-mpt-7b) | MPT (**7b**) | Japanese mC4 | Lightblue | Apache 2.0 |
| [Japanese Stable LM 3B-4E1T](https://ja.stability.ai/blog/japanese-stable-lm-3b-4e1tjapanese-stable-lm-gamma-7b)<br>([3b-4e1t-base](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-base), [3b-4e1t-instruct](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-instruct)) | StableLM-3B-4E1T (**3b**) | Pre-training: Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, SlimPajama(excluding Books3)<br>(**100B** tokens)<br>Instruction Tuning: Dolly Dataset, HH RLHF, wikinews subset of llm-japanese-dataset | Stability AI | Apache 2.0 |
| [kotomamba-2.8B-CL](https://huggingface.co/kotoba-tech/kotomamba-2.8B-CL-v1.0) | mamba-2.8b-slimpj<br>(**2.8b**) | Japanese Wikipedia, Swallow Corpus, SlimPajama | Kotoba Technologies | Apache 2.0 |
| [Gemma 2 Baku 2B](https://rinna.co.jp/news/2024/10/20241003.html)<br>([2b](https://huggingface.co/rinna/gemma-2-baku-2b), [2b-it](https://huggingface.co/rinna/gemma-2-baku-2b-it)) | Gemma 2 (**2b**) | Pre-training: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, undisclosed dataset<br>(**80B** tokens)<br>ORPO: undisclosed dataset [^20] | rinna | Gemma Terms of Use |
| [Japanese Stable LM 2 1.6B](https://ja.stability.ai/blog/japanese-stable-lm-2-16b)<br>([base](https://huggingface.co/stabilityai/japanese-stablelm-2-base-1_6b), [instruct](https://huggingface.co/stabilityai/japanese-stablelm-2-instruct-1_6b)) | Stable LM 2 1.6B (**1.6b**) | Pre-training: Wikipedia, CulturaX<br>Instruction Tuning: jaster, [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), alpaca-gpt4-japanese, ultra-orca-boros-en-ja-v1 | Stability AI | STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE |
| [karasu-1.1B](https://huggingface.co/lightblue/karasu-1.1B) | TinyLlama (**1.1b**) | Pre-training: Japanese OSCAR, Japanese mC4<br>(**3B** tokens) | Lightblue | Apache 2.0 |

@@ -144,7 +145,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
| [NovelAI/genji-jp](https://huggingface.co/NovelAI/genji-jp) | Storytelling | GPT-J (**6b**) | NovelAI ||

<a id="instruction-only-models"></a>
### Models built off non-Japanese LLMs (w/ instruction tuning on Japanese)
### Models built off non-Japanese LLMs (w/ post-training on Japanese)

<a id="generative-instruction-only-general"></a>
#### General purpose
@@ -171,6 +172,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso
| [lightblue/jod](https://huggingface.co/lightblue/jod) | Mistral-7B-SlimOrca (**7b**) || Lightblue | Apache 2.0 |
| [NTQAI/chatntq-7b-jpntuned](https://huggingface.co/NTQAI/chatntq-7b-jpntuned) | RWKV-4 World (**7b**)|| NTQ Solution ||
| [Borea](https://prtimes.jp/main/html/rd/p/000000008.000129878.html)<br>([Jp](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Jp), [Common](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Common), [Coding](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Coding)) | Phi-3.5 (**3.8b**) | | Axcxept | MIT |
| [Gemma-2-JPN](https://developers-jp.googleblog.com/2024/10/gemma-2-for-japan.html)<br>([2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it)) | Gemma 2 (**2b**) || Google | Gemma Terms of Use |
| [AXCXEPT/EZO-Common-T2-2B-gemma-2-it](https://huggingface.co/AXCXEPT/EZO-Common-T2-2B-gemma-2-it) | Gemma 2 (**2b**) || Axcxept | Gemma Terms of Use |

<a id="generative-instruction-only-domain-specific"></a>
@@ -523,4 +525,6 @@ When referencing this repository, please cite as follows:

[^18]: For details of each model, please refer to Chapter 4 of the authors' [paper](https://www.jstage.jst.go.jp/article/jnlp/31/2/31_707/_pdf/-char/ja). Note that the SC-2M-wiki model is strictly not a domain-specific model as it is pre-trained only on Wikipedia.

[^19]: Refer to the following articles: [大規模言語モデルTanuki-8B, 8x8Bの位置づけや開発指針など](https://zenn.dev/matsuolab/articles/377f7ae8b1169e), [大規模言語モデルを開発するにあたっての事前・事後学習の戦略メモー特に合成データについてー](https://zenn.dev/matsuolab/articles/34036f017fae9e)

[^20]: Before running ORPO, the Chat Vector obtained as the difference between Gemma 2 Instruct and Gemma 2 Base is added.
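
The chat-vector step described in footnote [^20] can be illustrated with a short sketch. This is a hypothetical example only, assuming Hugging Face Transformers and the public `google/gemma-2-2b` / `google/gemma-2-2b-it` checkpoints; rinna's actual merge recipe and hyperparameters are not part of this commit.

```python
import torch
from transformers import AutoModelForCausalLM

# Checkpoint names are illustrative; the merge is done in bfloat16 on CPU.
base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b", torch_dtype=torch.bfloat16)
instruct = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", torch_dtype=torch.bfloat16)
cp = AutoModelForCausalLM.from_pretrained("rinna/gemma-2-baku-2b", torch_dtype=torch.bfloat16)

base_sd, inst_sd = base.state_dict(), instruct.state_dict()

with torch.no_grad():
    for name, param in cp.named_parameters():
        # Only merge parameters that exist with identical shapes in all three models.
        if name in base_sd and name in inst_sd and base_sd[name].shape == param.shape:
            # Chat vector = instruct - base; add it to the continually pre-trained weights.
            param.add_(inst_sd[name] - base_sd[name])

cp.save_pretrained("gemma-2-baku-2b-plus-chat-vector")
```

Per the footnote, ORPO is then applied starting from this merged checkpoint.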
Binary file modified figures/parameter_size_overview_en.png
Binary file modified figures/parameter_size_overview_ja.png
21 changes: 19 additions & 2 deletions figures/scripts/parameter_size_overview.csv
@@ -1,7 +1,22 @@
Model,Lab,Parameters(B),Announced,Type
LFM-40B,Liquid AI,40.0,2024/09/01,EN-available
NLVM 1.0,NVIDIA,72.0,2024/09/01,EN-available
TeleChat2-115B,China Telecom Artificial Intelligence Research Institute,115.0,2024/09/01,EN-available
AMD-Llama-135m,AMD,0.135,2024/09/01,EN-available
Llama 3.2 90B,Meta AI,90.0,2024/09/01,EN-available
Llama 3.2 3B,Meta AI,3.21,2024/09/01,EN-available
Molmo,Allen AI,72.0,2024/09/01,EN-available
Gemini-1.5-Pro-002 ,Google DeepMind,1500.0,2024/09/01,EN-available
Qwen2.5,Alibaba,72.0,2024/09/01,EN-available
GRIN MoE,Microsoft,60.0,2024/09/01,EN-available
Data-Gemma,Google DeepMind,27.0,2024/09/01,EN-available
o1,OpenAI,200.0,2024/09/01,EN-available
Reader-LM,Jina AI,1.54,2024/09/01,EN-available
Pixtral-12b-240910,Mistral,12.0,2024/09/01,EN-available
DeepSeek-V2.5,DeepSeek-AI,236.0,2024/09/01,EN-available
Yi-Coder,01-ai,9.0,2024/09/01,EN-available
OLMoE-1B-7B,Allen AI,6.9,2024/09/01,EN-available
PLLuM,Consortium,20.0,2024/08/01,EN-available
xLAM,Salesforce,141.0,2024/08/01,EN-available
Rene,Cartesia,1.3,2024/08/01,EN-available
Gemini 1.5 Flash-8B,Google DeepMind,8.0,2024/08/01,EN-available
@@ -13,6 +28,7 @@ phi-3.5-mini,Microsoft,3.8,2024/08/01,EN-available
Minitron-4B,NVIDIA,4.0,2024/08/01,EN-available
sarvam-2b,Sarvam AI,2.0,2024/08/01,EN-available
Grok-2,xAI,600.0,2024/08/01,EN-available
EXAONE 3.0,LG,7.8,2024/08/01,EN-available
Falcon Mamba 7B,TII,7.0,2024/08/01,EN-available
Palmyra-Med-70B,Writer,70.0,2024/07/01,EN-available
Palmyra-Fin-70B,Writer,70.0,2024/07/01,EN-available
@@ -54,7 +70,7 @@ Yi-Large,01-ai,1000.0,2024/05/01,EN-available
Chameleon,Meta AI,34.0,2024/05/01,EN-available
Sparse Llama 7B,Cerebras,7.0,2024/05/01,EN-available
Gemini 1.5 Flash,Google DeepMind,,2024/05/01,EN-available
GPT-4o,OpenAI,70.0,2024/05/01,EN-available
GPT-4o,OpenAI,200.0,2024/05/01,EN-available
Falcon 2 11B,TII,11.0,2024/05/01,EN-available
Fugaku-LLM,Fujitsu,13.0,2024/05/01,EN-available
Yi 1.5 34B,01-ai,34.4,2024/05/01,EN-available
@@ -238,7 +254,7 @@ StableLM,Stability AI,65.0,2023/04/01,EN-available
Dolly 2.0,Databricks,12.0,2023/04/01,EN-available
Pythia,EleutherAI,12.0,2023/04/01,EN-available
Koala-13B,Berkeley,13.0,2023/04/01,EN-available
C1.2,Character.ai,33.0,2023/03/01,EN-available
C1.2,Character.ai,20.0,2023/03/01,EN-available
OpenFlamingo-9B,LAION,8.3,2023/03/01,EN-available
GPT4All-LoRa,Nomic,7.0,2023/03/01,EN-available
Cerebras-GPT,Cerebras,13.0,2023/03/01,EN-available
@@ -302,6 +318,7 @@ GPT-J,EleutherAI,6.0,2021/06/01,EN-available
ruGPT-3,Huawei/Sberbank,1.3,2021/02/01,EN-available
Switch,Google,1600.0,2021/01/01,EN-available
GPT-3,OpenAI,175.0,2020/05/01,EN-available
SFR-LLaMA-3.1-70B-Judge,Salesforce,70.0,2024/09/01,EN-unavailable
LTM-2-mini,Magic,20.0,2024/08/01,EN-unavailable
SpreadsheetLLM,Microsoft,1760.0,2024/07/01,EN-unavailable
FLAMe,Google DeepMind,24.0,2024/07/01,EN-unavailable
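
The CSV above feeds the `parameter_size_overview` figures whose PNGs are also modified in this commit. The generating script is not shown in this diff, so the following is only a minimal, assumed sketch of how such a file could be loaded and visualized with pandas and matplotlib, using the column names from the header row (`Model,Lab,Parameters(B),Announced,Type`).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical loader/plotter; the repository's actual figure script is not part of this diff.
df = pd.read_csv("figures/scripts/parameter_size_overview.csv")
df["Announced"] = pd.to_datetime(df["Announced"])
df = df.dropna(subset=["Parameters(B)"])  # a few rows (e.g. Gemini 1.5 Flash) omit the size

fig, ax = plt.subplots(figsize=(12, 6))
for model_type, group in df.groupby("Type"):
    ax.scatter(group["Announced"], group["Parameters(B)"], label=model_type, alpha=0.7)

ax.set_yscale("log")  # parameter counts span ~0.1B to well over 1T
ax.set_xlabel("Announced")
ax.set_ylabel("Parameters (B)")
ax.legend()
fig.savefig("parameter_size_overview_sketch.png", dpi=200)
```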
3 changes: 3 additions & 0 deletions figures/scripts/parameter_size_overview_ja.csv
@@ -1,11 +1,14 @@
Model,Lab,Parameters(B),Announced,Type,Source(JP)
日本語版 Gemma 2 2B,Google,2,2024/10/3,JP-available-CP,https://developers-jp.googleblog.com/2024/10/gemma-2-for-japan.html
Sarashina2-70b,SB Intuitions,70,2024/8/7,JP-available,https://huggingface.co/sbintuitions/sarashina2-70b
Sarashina,SB Intuitions,65,2024/6/14,JP-available,https://www.sbintuitions.co.jp/news/press/20240614_01/
Takane,Fujitsu,104,2024/9/30,JP-available-CP,https://pr.fujitsu.com/jp/news/2024/09/30.html
Fugaku-LLM,"Titech, Tohoku Univ., Fujitsu, RIKEN, Nagoya Univ., CyberAgent, Kotoba Technologies",13,2024/5/10,JP-available,https://www.fujitsu.com/global/about/resources/news/press-releases/2024/0510-01.html
,Rakuten,7,2024/3/21,JP-available-CP,https://corp.rakuten.co.jp/news/press/2024/0321_01.html
KARAKURI 8x7B Instruct,KARAKURI,46.7,2024/6/20,JP-available-CP,https://karakuri.ai/seminar/news/karakuri-lm-8x7b-instruct-v0-1/
KARAKURI 8x7B Chat,KARAKURI,46.7,2024/5/20,JP-available-CP,https://karakuri.ai/seminar/news/aws_trainium_moe/
KARAKURI,KARAKURI,70,2024/1/31,JP-available-CP,https://karakuri.ai/seminar/news/karakuri-lm/
Gemma 2 Baku 2B,rinna,2,2024/10/3,JP-available-CP,https://rinna.co.jp/news/2024/10/20241003.html
Llama 3 Youko (Instruct),rinna,70,2024/07/25,JP-available-CP,https://rinna.co.jp/news/2024/07/20240725.html
Llama 3 Youko,rinna,8,2024/05/07,JP-available-CP,https://rinna.co.jp/news/2024/05/20240507.html
Nekomata,rinna,14,2023/12/21,JP-available-CP,https://rinna.co.jp/news/2023/12/20231221.html
8 changes: 6 additions & 2 deletions fr/README.md
@@ -127,6 +127,7 @@ N'hésitez pas à signaler les erreurs sur la page [issues](https://github.com/l
| [lightblue/japanese-mpt-7b](https://huggingface.co/lightblue/japanese-mpt-7b) | MPT (**7b**) | Japanese mC4 | Lightblue | Apache 2.0 (?)[^12] |
| [Japanese Stable LM 3B-4E1T](https://ja.stability.ai/blog/japanese-stable-lm-3b-4e1tjapanese-stable-lm-gamma-7b)<br>([3b-4e1t-base](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-base), [3b-4e1t-instruct](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-instruct)) | StableLM-3B-4E1T (**3b**) |Pre-training: Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, SlimPajama(excluding Books3)<br>(**100B** tokens)<br>Instruction Tuning: Dolly Dataset, HH RLHF, wikinews subset of llm-japanese-dataset | Stability AI | Apache 2.0 |
| [kotomamba-2.8B-CL](https://huggingface.co/kotoba-tech/kotomamba-2.8B-CL-v1.0) | mamba-2.8b-slimpj<br>(**2.8b**) | Japanese Wikipedia, Swallow Corpus, SlimPajama | Kotoba Technologies | Apache 2.0 |
| [Gemma 2 Baku 2B](https://rinna.co.jp/news/2024/10/20241003.html)<br>([2b](https://huggingface.co/rinna/gemma-2-baku-2b), [2b-it](https://huggingface.co/rinna/gemma-2-baku-2b-it)) | Gemma 2 (**2b**) | Pre-training: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, undisclosed dataset<br>(**80B** tokens)<br>ORPO: undisclosed dataset [^20] | rinna | Gemma Terms of Use |
| [Japanese Stable LM 2 1.6B](https://ja.stability.ai/blog/japanese-stable-lm-2-16b)<br>([base](https://huggingface.co/stabilityai/japanese-stablelm-2-base-1_6b), [instruct](https://huggingface.co/stabilityai/japanese-stablelm-2-instruct-1_6b)) | Stable LM 2 1.6B (**1.6b**) | Pre-training: Wikipedia, CulturaX<br>Instruction Tuning: jaster, [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), alpaca-gpt4-japanese, ultra-orca-boros-en-ja-v1 | Stability AI | STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE |
| [karasu-1.1B](https://huggingface.co/lightblue/karasu-1.1B) | TinyLlama (**1.1b**) | Pre-training: Japanese OSCAR, Japanese mC4<br>(**3B** tokens) | Lightblue | Apache 2.0 |

@@ -144,7 +145,7 @@ N'hésitez pas à signaler les erreurs sur la page [issues](https://github.com/l
| [NovelAI/genji-jp](https://huggingface.co/NovelAI/genji-jp) | Génération de récits | GPT-J (**6b**) | NovelAI ||

<a id="instruction-only-models"></a>
### Modèles développés à partir d'LLM non-japonais (avec un affinement par instructions en japonais)
### Modèles développés à partir d'LLM non-japonais (avec un post-entraînement en japonais)

<a id="generative-instruction-only-general"></a>
#### D'usage général
@@ -171,6 +172,7 @@ N'hésitez pas à signaler les erreurs sur la page [issues](https://github.com/l
| [lightblue/jod](https://huggingface.co/lightblue/jod) | Mistral-7B-SlimOrca (**7b**) || Lightblue | Apache 2.0 |
| [NTQAI/chatntq-7b-jpntuned](https://huggingface.co/NTQAI/chatntq-7b-jpntuned) | RWKV-4 World (**7b**)|| NTQ Solution ||
| [Borea](https://prtimes.jp/main/html/rd/p/000000008.000129878.html)<br>([Jp](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Jp), [Common](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Common), [Coding](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Coding)) | Phi-3.5 (**3.8b**) | | Axcxept | MIT |
| [Gemma-2-JPN](https://developers-jp.googleblog.com/2024/10/gemma-2-for-japan.html)<br>([2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it)) | Gemma 2 (**2b**) || Google | Gemma Terms of Use |
| [AXCXEPT/EZO-Common-T2-2B-gemma-2-it](https://huggingface.co/AXCXEPT/EZO-Common-T2-2B-gemma-2-it) | Gemma 2 (**2b**) || Axcxept | Gemma Terms of Use |

<a id="generative-instruction-only-domain-specific"></a>
@@ -524,4 +526,6 @@ Lorsque vous référencez ce répertoire, veuillez le citer comme suit:

[^18]: Pour les détails de chaque modèle, veuillez vous référer au Chapitre 4 de l'[article](https://www.jstage.jst.go.jp/article/jnlp/31/2/31_707/_pdf/-char/ja) des auteurs. Notez que le modèle SC-2M-wiki n'est strictement pas un modèle spécifique à un domaine car il est pré-entraîné uniquement sur Wikipédia.

[^19]: Référez-vous aux articles suivants: [大規模言語モデルTanuki-8B, 8x8Bの位置づけや開発指針など](https://zenn.dev/matsuolab/articles/377f7ae8b1169e), [大規模言語モデルを開発するにあたっての事前・事後学習の戦略メモー特に合成データについてー](https://zenn.dev/matsuolab/articles/34036f017fae9e)

[^20]: Avant de procéder à l'ORPO, un Chat Vector correspondant à la différence entre Gemma 2 Instruct et Gemma 2 Base est ajouté.