docs: add introduction about text-classification using Qwen2. (#56)
* docs: add document about `BertSelfAttention`

* docs: add doc about text-classification

* docs: add code example for text-classification using Qwen2

* docs: add logo for qwen2 in chapter 6
moyanxinxu authored Nov 27, 2024
1 parent 761c3fa commit 4bbe033
Showing 13 changed files with 660 additions and 24 deletions.
10 changes: 5 additions & 5 deletions .obsidian/workspace.json
@@ -13,13 +13,13 @@
"state": {
"type": "markdown",
"state": {
"file": "docs/chapter8/bert/modeling/modeling.md",
"file": "docs/chapter6/financial_report/financial_report.md",
"mode": "source",
"backlinks": false,
"source": false
},
"icon": "lucide-file",
"title": "modeling"
"title": "financial_report"
}
}
]
@@ -68,10 +68,10 @@
"state": {
"type": "outline",
"state": {
"file": "docs/chapter8/bert/modeling/modeling.md"
"file": "docs/chapter6/financial_report/financial_report.md"
},
"icon": "lucide-list",
"title": "Outline of modeling"
"title": "Outline of financial_report"
}
}
]
@@ -87,10 +87,10 @@
},
"active": "a00f9c294cc735a6",
"lastOpenFiles": [
"docs/chapter8/bert/modeling/modeling.md",
"docs/chapter8/repositories/repositories.md",
"docs/chapter8/repositories_index.md",
"docs/chapter8/bert/tokenization/tokenization.md",
"docs/chapter8/bert/modeling/modeling.md",
"docs/chapter8/bert/configuration/configuration.md",
"docs/chapter1/dataset_tour/datasets.md",
"docs/chapter8/bert/tokenization/tokenizer.md",
1 change: 1 addition & 0 deletions .vscode/settings.json
@@ -1,5 +1,6 @@
{
"markdownlint.config": {
"MD010": false,
"MD033": false
}
}
Binary file added assets/thumbnail.png
4 changes: 4 additions & 0 deletions docker-compose/Dockerfile
@@ -10,12 +10,16 @@ RUN apt-get update && apt-get install -y \
wget \
curl \
git \
unzip \
inetutils-ping \
tmux \
&& apt-get clean

COPY --from=miniconda-stage /opt/conda /opt/conda
ENV PATH="/opt/conda/bin:${PATH}"
WORKDIR /root

RUN echo "export HF_ENDPOINT=https://hf-mirror.com" >> /root/.bashrc
COPY condarc /root/.condarc
COPY Dockerfile /root/dockerfile/Dockerfile
COPY pip.conf /root/.pip/pip.conf
18 changes: 9 additions & 9 deletions docs/appendix/env_config/env.md
@@ -15,13 +15,13 @@ title: Environment Setup

Note: the Miniconda3 build you download must match your machine's processor architecture. For convenience, the recommended download links for each major operating system are listed below.

| System | Download link |
| :---: | --- |
| Windows | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe> |
| macOS (Intel) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh> |
| macOS (M/ARM) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh> |
| Linux (x64) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh> |
| Linux (ARM) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh> |
| System        | Download link                                                              |
| :-----------: | -------------------------------------------------------------------------- |
| Windows       | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe> |
| macOS (Intel) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh>   |
| macOS (M/ARM) | <https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh>    |
| Linux (x64)   | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh>    |
| Linux (ARM)   | <https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh>   |

### Install Miniconda

@@ -222,7 +222,7 @@ auto_activate_base: false
![conda_activate_env](./imgs/conda_activate_env.png){ width="600" }

- Install a package: `conda install package_name` or `pip install package_name`
- Use a mirror for a single `pip` install: `pip install package_name -i https://pypi.tuna.tsinghua.edu.cn/simple`
- Use a mirror for a single `pip` install: `pip install package_name -i https://pypi.tuna.tsinghua.edu.cn/simple`
- Uninstall a package: `conda remove package_name` or `pip uninstall package_name`
- List all installed packages: `conda list`
- Remove a given virtual environment: `conda remove -n env_name --all`
@@ -235,7 +235,7 @@ auto_activate_base: false
## Install Libraries

???+ warning
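💯 Before installing packages into a virtual environment, make sure the correct environment is active!
:100: Before installing packages into a virtual environment, make sure the correct environment is active!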

```bash title='pip/conda'
pip install numpy pandas matplotlib transformers datasets peft evaluate diffusers gradio torch jupyterlab
4 changes: 2 additions & 2 deletions docs/chapter1/dataset_tour/datasets.md
@@ -93,7 +93,7 @@ data = load_dataset("hfl/cmrc2018")

The returned result shows that `data` is a `DatasetDict`, one of the core data types in the `Datasets` library.

!!! Note "train_test_split"
!!! note "train_test_split"

Not every dataset includes training, validation, and test splits; some have only one or two.
`hfl/cmrc2018` has all three, whereas `LooksJuicy/ruozhiba` has only a training split.
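
When a dataset ships with only a training split, a held-out subset has to be carved out by hand. The snippet below is a minimal plain-Python sketch of that idea, not the `Datasets` implementation: the helper `split_train_only`, its parameters, and the sample rows are hypothetical names introduced for illustration. In practice, the `Datasets` library provides `Dataset.train_test_split` for the same purpose.

```python
import random


def split_train_only(rows, test_size=0.2, seed=42):
    """Split a list of examples into train and test subsets (hypothetical helper)."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    # Shuffle indices with a fixed seed so the split is reproducible.
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    # Hold out at least one example for the test subset.
    n_test = max(1, round(len(rows) * test_size))
    test_idx = set(indices[:n_test])
    return {
        "train": [row for i, row in enumerate(rows) if i not in test_idx],
        "test": [row for i, row in enumerate(rows) if i in test_idx],
    }


# Made-up rows standing in for a train-only dataset.
rows = [{"id": i, "text": f"sample {i}"} for i in range(10)]
splits = split_train_only(rows)
print({name: len(subset) for name, subset in splits.items()})
```

Fixing the seed keeps the split stable across runs, which matters when evaluation results are compared between experiments.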
@@ -180,7 +180,7 @@ Dataset({

```

!!! Note "Configurations"
!!! note "Configurations"

### Configurations

15 changes: 9 additions & 6 deletions docs/chapter6/code_index.md
@@ -5,9 +5,12 @@ title: Index

Home

- Multi-label classification: [Time for AI to Hit Back Hard at Toxic Comments](./mlcoftc/multi-label-classification-of-toxic-comments.md)
- Extractive reading comprehension: [CMRC2018](./cmrc/cmrc.md)
- Text summarization: [LCSTS Short-Text News Summarization](./text-summary/text-summary.md)
- Container number localization: [DETR Object Detection](./container-detr/container-detr.md)
- Text translation: [Chinese-English Text Translation](./translation/translation.md)
- Simple denoising: [ddpm-unet](./ddpm-unet-mnist/ddpm-unet-mnist.md)
| Module | Link |
| --- | --- |
| Multi-label classification | [Time for AI to Hit Back Hard at Toxic Comments](./mlcoftc/multi-label-classification-of-toxic-comments.md) |
| Extractive reading comprehension | [CMRC2018](./cmrc/cmrc.md) |
| Text summarization | [LCSTS Short-Text News Summarization](./text-summary/text-summary.md) |
| Container number localization | [Object Detection](./container-detr/container-detr.md) |
| Text translation | [Chinese-English Text Translation](./translation/translation.md) |
| Diffusion denoising | [Simple Denoising with ddpm-unet](./ddpm-unet-mnist/ddpm-unet-mnist.md) |
| Text classification | [Intent Recognition for Fund Annual Report Q&A](./financial_report/financial_report.md) |
2 changes: 1 addition & 1 deletion docs/chapter6/ddpm-unet-mnist/ddpm-unet-mnist.md
@@ -25,7 +25,7 @@ import pandas as pd
from matplotlib import pyplot as plt
```

### Loading of the Dataset
### Load the Dataset

```python
class MnistDataset: