Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add LLM-jp-3 172B beta1 #362

Merged
merged 6 commits into from
Sep 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .404-links.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
ignore:
urls:
- https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v2
- https://gitlab.llm-jp.nii.ac.jp/datasets/*
- https://llm-jp.nii.ac.jp/blog/2024/02/09/v1.1-tuning.html
- https://alaginrc.nict.go.jp/nict-bert/index.html
- https://www.anlp.jp/*
Expand Down
12 changes: 12 additions & 0 deletions .vitepress/config.mts
Original file line number Diff line number Diff line change
Expand Up @@ -74,5 +74,17 @@ export default defineConfig({
},
link: '/fr/'
},
},
vite: {
optimizeDeps: {
exclude: [
"@nolebase/vitepress-plugin-enhanced-readabilities/client",
],
},
ssr: {
noExternal: [
"@nolebase/vitepress-plugin-enhanced-readabilities",
],
},
}
})
11 changes: 11 additions & 0 deletions .vitepress/theme/index.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
.medium-zoom-overlay {
z-index: 20;
}

.medium-zoom-image {
z-index: 21;
}

.VPSocialLinks.VPNavBarSocialLinks.social-links {
margin-right: 0;
}
34 changes: 34 additions & 0 deletions .vitepress/theme/index.mts
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
import DefaultTheme from 'vitepress/theme';
import { h } from 'vue';

import './index.css';
import {
NolebaseEnhancedReadabilitiesMenu,
NolebaseEnhancedReadabilitiesScreenMenu,
} from "@nolebase/vitepress-plugin-enhanced-readabilities/client";
import "@nolebase/vitepress-plugin-enhanced-readabilities/client/style.css";
import type { Options as NolebaseReadOptions } from '@nolebase/vitepress-plugin-enhanced-readabilities/client'
import { InjectionKey as NolebaseReadInjectionKey } from '@nolebase/vitepress-plugin-enhanced-readabilities/client'

export default {
extends: DefaultTheme,
Layout: () => {
return h(DefaultTheme.Layout, null, {
"nav-screen-content-after": () => h(NolebaseEnhancedReadabilitiesScreenMenu),
"nav-bar-content-after": () => h(NolebaseEnhancedReadabilitiesMenu),
});
},
enhanceApp({ app }) {
app.provide(NolebaseReadInjectionKey, {
layoutSwitch: {
defaultMode: 5,
contentLayoutMaxWidth: {
defaultMaxWidth: 85
},
pageLayoutMaxWidth: {
defaultMaxWidth: 85
}
}
} as NolebaseReadOptions);
}
};
7 changes: 3 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
[ [**English**](./en/) | [**Français**](./fr/) | 日本語 ]

<p align="center">
<img src="./figures/parameter_size_overview.png" alt="日本語LLM・海外LLMのパラメータサイズの推移">
<img src="./figures/parameter_size_overview_ja.png" alt="日本語LLM・海外LLMのパラメータサイズの推移">
</p>
<figcaption style="font-style: italic; font-size: 0.9em; color: #6b7280; text-align: center;">日本語LLM・海外LLMのパラメータ数の推移。日本語モデルの情報は本記事、海外モデルの情報は LifeArchitect.ai の <a href="https://lifearchitect.ai/models-table/" target="_blank" rel="noreferrer">Models table</a> を参照しています(ただし、図のスペース上一部のモデルは省略。また、海外モデルのパラメータ数は推測値を含む)。修正・追加等ありましたらお知らせ下さい。</figcaption>

Expand Down Expand Up @@ -36,6 +36,7 @@

| | アーキテクチャ | 入出力で扱える<br>トークン数 | 学習テキスト | 開発元 | ライセンス |
|:---|:---:|:---:|:---:|:---:|:---:|
| [LLM-jp-3 172B beta1](https://www.nii.ac.jp/news/release/2024/0917.html) | Llama<br>([**172b**-beta1](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1), [**172b**-beta1-instruct](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1-instruct)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3)<br>Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), Dolly Dataset, OASST1, OASST2, Aya Dataset, ichikara-instruction-format, Daring-Anteater, FLAN | LLM研究開発センター (LLMC) | LLM-jp-3 172B beta1 Terms of Use |
| [Stockmark-100b](https://stockmark.co.jp/news/20240516) | Llama<br>([**100b**](https://huggingface.co/stockmark/stockmark-100b), [**100b**-instruct-v0.1](https://huggingface.co/stockmark/stockmark-100b-instruct-v0.1)) | 4,096 | 事前学習: RedPajama, 日本語 Wikipedia, Japanese mC4, Japanese CommonCrawl, 日本語特許, Stockmark Web Corpus<br>(計 **910B** トークン)<br>Instruction Tuning (LoRA): [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) | ストックマーク | MIT |
| [Sarashina2](https://www.sbintuitions.co.jp/news/press/20240614_01/) | Llama<br>([**7b**](https://huggingface.co/sbintuitions/sarashina2-7b), [**13b**](https://huggingface.co/sbintuitions/sarashina2-13b), [**70b**](https://huggingface.co/sbintuitions/sarashina2-70b)) | 7b, 13b: 4,096<br>70b: 8,192 | 事前学習: Japanese Common Crawl, SlimPajama, StarCoder<br>(計 **2.1T** トークン) | SB Intuitions | MIT |
| [Sarashina1](https://www.sbintuitions.co.jp/news/press/20240614_01/) | GPT-NeoX<br>([**7b**](https://huggingface.co/sbintuitions/sarashina1-7b), [**13b**](https://huggingface.co/sbintuitions/sarashina1-13b), [**65b**](https://huggingface.co/sbintuitions/sarashina1-65b)) | 2,048 | 事前学習: Japanese Common Crawl<br>(計 **1T** トークン) | SB Intuitions | MIT |
Expand Down Expand Up @@ -465,9 +466,7 @@

このプロジェクトに貢献してくれているコントリビューターのみなさんです!

<a href="https://github.com/llm-jp/awesome-japanese-llm/graphs/contributors" target="_blank" rel="noreferrer">
<img src="./figures/contributors.svg" alt="コントリビューター" />
</a>
<img loading="lazy" src="./figures/contributors.svg" alt="コントリビューター" />

<a id="citation"></a>
## 引用
Expand Down
7 changes: 3 additions & 4 deletions en/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
[ English | [**Français**](../fr/) | [**日本語**](../) ]

<p align="center">
<img src="../figures/parameter_size_overview.png" alt="Parameter sizes of Japanese and non-Japanese LLMs over time">
<img src="../figures/parameter_size_overview_en.png" alt="Parameter sizes of Japanese and non-Japanese LLMs over time">
</p>
<figcaption style="font-style: italic; font-size: 0.9em; color: #6b7280; text-align: center;">Evolution of parameter sizes for Japanese LLMs and non-Japanese LLMs. The information on the Japanese models is derived from this article, while the information on the non-Japanese models can be referred from the <a href="https://lifearchitect.ai/models-table/" target="_blank" rel="noreferrer">Models table</a> on LifeArchitect.ai. However, due to space constraints in the figure, some models have been omitted. Additionally, estimates are included in the parameter count for non-Japanese models. Please notify us of any corrections, additions, or updates.</figcaption>

Expand Down Expand Up @@ -35,6 +35,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso

| | Architecture | Max Context Length | Training Data | Developer | License |
|:---|:---:|:---:|:---:|:---:|:---:|
| [LLM-jp-3 172B beta1](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1) | Llama<br>([**172b**-beta1](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1), [**172b**-beta1-instruct](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1-instruct)) | 4,096 | Pre-training: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3)<br>Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), Dolly Dataset, OASST1, OASST2, Aya Dataset, ichikara-instruction-format, Daring-Anteater, FLAN | Research and Development Center for Large Language Models (LLMC) | LLM-jp-3 172B beta1 Terms of Use |
| [Stockmark-100b](https://huggingface.co/stockmark/stockmark-100b) | Llama<br>([**100b**](https://huggingface.co/stockmark/stockmark-100b), [**100b**-instruct-v0.1](https://huggingface.co/stockmark/stockmark-100b-instruct-v0.1)) | 4,096 | Pre-training: RedPajama, Japanese Wikipedia, Japanese mC4, Japanese CommonCrawl, Japanese Patent, Stockmark Web Corpus<br>(**910B** tokens)<br>Instruction Tuning (LoRA): [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) | Stockmark | MIT |
| [Sarashina2](https://www.sbintuitions.co.jp/news/press/20240614_01/) | Llama<br>([**7b**](https://huggingface.co/sbintuitions/sarashina2-7b), [**13b**](https://huggingface.co/sbintuitions/sarashina2-13b), [**70b**](https://huggingface.co/sbintuitions/sarashina2-70b)) | 7b, 13b: 4,096<br>70b: 8,192 | Pre-training: Japanese Common Crawl, SlimPajama, StarCoder<br>(**2.1T** tokens) | SB Intuitions | MIT |
| [Sarashina1](https://www.sbintuitions.co.jp/news/press/20240614_01/) | GPT-NeoX<br>([**7b**](https://huggingface.co/sbintuitions/sarashina1-7b), [**13b**](https://huggingface.co/sbintuitions/sarashina1-13b), [**65b**](https://huggingface.co/sbintuitions/sarashina1-65b)) | 2,048 | Pre-training: Japanese Common Crawl<br>(**1T** tokens) | SB Intuitions | MIT |
Expand Down Expand Up @@ -463,9 +464,7 @@ Please point out any errors on the [issues page](https://github.com/llm-jp/aweso

We love contributors! Feel free to contribute to this project.

<a href="https://github.com/llm-jp/awesome-japanese-llm/graphs/contributors" target="_blank" rel="noreferrer">
<img src="../figures/contributors.svg" alt="contributors" />
</a>
<img loading="lazy" src="../figures/contributors.svg" alt="contributors" />

<a id="citation"></a>
## Citation
Expand Down
Binary file removed figures/parameter_size_overview.png
Binary file not shown.
Binary file added figures/parameter_size_overview_en.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added figures/parameter_size_overview_ja.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Loading