This example demonstrates distilling a Chinese-ELECTRA-base model on the MSRA NER task with distributed data-parallel training (single node, multi-GPU).
- ner_ElectraTrain_dist.sh : trains a teacher model (Chinese-ELECTRA-base) on MSRA NER.
- ner_ElectraDistill_dist.sh : distills the teacher into an ELECTRA-small model.
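A typical run executes the two scripts in order. The sketch below is illustrative only; set the variables inside each script first, as described next.

```bash
# Step 1: fine-tune the ELECTRA-base teacher on MSRA NER
bash ner_ElectraTrain_dist.sh
# Step 2: distill the fine-tuned teacher into ELECTRA-small
bash ner_ElectraDistill_dist.sh
```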
Set the following variables in the shell scripts before running (an example appears after the list):
- ELECTRA_DIR_BASE : where Chinese-ELECTRA-base is located; it should include vocab.txt, pytorch_model.bin and config.json.
- OUTPUT_DIR : this directory stores the logs and the trained model weights.
- DATA_DIR : this directory contains the MSRA NER dataset:
  - msra_train_bio.txt
  - msra_test_bio.txt
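For reference, a minimal sketch of these settings (all paths below are placeholders, not values shipped with this example):

```bash
# Placeholder paths -- adjust to your environment
ELECTRA_DIR_BASE=/path/to/chinese-electra-base  # contains vocab.txt, pytorch_model.bin, config.json
OUTPUT_DIR=/path/to/output_dir                  # logs and trained weights are written here
DATA_DIR=/path/to/msra_ner_data                 # contains msra_train_bio.txt and msra_test_bio.txt
```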
For distillation, also set (see the example after this list):
- ELECTRA_DIR_SMALL : where the pretrained Chinese-ELECTRA-small weights are located; it should include pytorch_model.bin. This is optional: if you don't provide the ELECTRA-small weights, the student model is initialized randomly.
- student_config_file : the model config file (i.e., config.json) for the student. Usually it should be in ${ELECTRA_DIR_SMALL}.
- trained_teacher_model_file : the fine-tuned ELECTRA-base teacher model produced by ner_ElectraTrain_dist.sh.
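A matching sketch for the distillation variables (the teacher checkpoint filename below is hypothetical; point it at whatever ner_ElectraTrain_dist.sh saved under your OUTPUT_DIR):

```bash
# Placeholder paths -- adjust to your environment
ELECTRA_DIR_SMALL=/path/to/chinese-electra-small                  # optional; omit for random student init
student_config_file=${ELECTRA_DIR_SMALL}/config.json              # student model config
trained_teacher_model_file=${OUTPUT_DIR}/teacher_checkpoint.bin   # hypothetical filename
```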
The scripts have been tested with PyTorch==1.2 and Transformers==2.8.
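To reproduce that environment, an install along these lines should work (the exact pinned versions are an assumption based on the note above):

```bash
pip install torch==1.2.0 transformers==2.8.0
```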