Chinese word segmentation, part-of-speech tagging, and medical named entity recognition from scratch.

pku-nlp-forfun/CWS_POS_NER

CWS/POS/NER

Our Final Paper 👉

Getting Started

Dependencies:

  • tensorflow

Usage:

    # training, testing and evaluation
    python3 run.py

Generated files:

  • Evaluation.md - Markdown table of evaluation results
  • Result/ - prediction results
  • FinalResult/ - final prediction results
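
As a rough illustration of what a Markdown table of evaluation results can look like, here is a minimal sketch that scores a toy segmentation by span-level precision, recall, and F1, then renders one table row. The function names, column layout, and toy sentence are assumptions for illustration only; the actual scoring is done by the TA's scripts under Evaluation/, and the real layout of Evaluation.md may differ.

```python
def words_to_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def precision_recall_f1(gold_words, pred_words):
    """Span-level scores: a predicted word counts only if both boundaries match."""
    gold, pred = words_to_spans(gold_words), words_to_spans(pred_words)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def markdown_table(rows):
    """Render (task, (precision, recall, f1)) rows as a Markdown table string."""
    lines = ["| Task | Precision | Recall | F1 |", "| --- | --- | --- | --- |"]
    for task, (p, r, f) in rows:
        lines.append(f"| {task} | {p:.4f} | {r:.4f} | {f:.4f} |")
    return "\n".join(lines)


# Toy input, not taken from the data set.
gold = ["我", "喜欢", "自然", "语言"]
pred = ["我", "喜欢", "自然语言"]
scores = precision_recall_f1(gold, pred)
table = markdown_table([("CWS (toy)", scores)])
print(table)
```

Here two of the three predicted words match gold spans, so precision is 2/3 and recall is 2/4.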

Structure

├── Data         => data set given by TA
│   ├── devset
│   ├── testset1
│   └── trainset
├── Evaluation   => eval scripts given by TA
│
├── CWS          => CWS model
├── POS          => POS tagging model
├── NER          => NER model
│
├── constant.py  => some global constants and variables
│
├── dataset.py   => data preprocessing
├── model.py     => high-level model API for all our models
├── evaluate.py  => high-level evaluation API
└── run.py       => the entire process
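
For readers new to the task, word segmentation is commonly cast as per-character sequence tagging. The sketch below shows one widely used scheme, BMES (Begin, Middle, End, Single), and how tags map back to words. It is an assumption that the CWS model here uses this exact tag set; the helper names and the sample sentence are hypothetical and are not taken from dataset.py or the data set.

```python
def words_to_bmes(words):
    """Turn a segmented sentence into one BMES tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first char B, last char E, everything in between M
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Rebuild words from characters and their BMES tags."""
    words, buffer = [], ""
    for ch, tag in zip(chars, tags):
        buffer += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(buffer)
            buffer = ""
    if buffer:  # tolerate a tag sequence that ends mid-word
        words.append(buffer)
    return words


# Hypothetical example sentence.
words = ["头痛", "是", "常见", "症状"]
tags = words_to_bmes(words)
restored = bmes_to_words("".join(words), tags)
print(tags)
print(restored)
```

The round trip from words to tags and back recovers the original segmentation, which is what lets a tagger's per-character predictions be decoded into words.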

Task Description

Data and scripts given by TA

Directory Structure

  • Data: (each set has its own _cws, _pos, and _ner file)
    • devset
    • testset1
    • trainset
    • final
      • test2.txt - raw article
  • Evaluation
    • pos_evaluate.py
    • ner_evaluate.py

Resources

Article

Paper

Sequence Tagging

Chinese Word Segmentation

Tool References

Related Tools and Libraries

CRF

Model Structure

[Figure: model structure diagram]
