End-to-End Text Classification via Image-based Embedding using Character-level Networks

Authors: Shunsuke Kitada, Ryunosuke Kotani, Hitoshi Iyatomi

Figures: Proposed CE-CLCNN model [1]; example of data augmentation in the image domain with random erasing [2]

Abstract: For languages that have no explicit word boundaries, such as Japanese, Chinese, and Thai, analysis based on morphological analysis should ideally perform appropriate word segmentation before computing word embeddings, but segmentation is inherently difficult in these languages. In recent years, language models based on deep learning have made remarkable progress, and some methods that use character-level features have successfully avoided this problem. However, when a model is fed character-level features of these languages, it often overfits because of the large number of character types. In this paper, we propose CE-CLCNN, a character-level convolutional neural network with a character encoder, to tackle these problems. The proposed CE-CLCNN is an end-to-end learning model with an image-based character encoder, i.e., it handles each character in the target document as an image. Through various experiments, we confirmed that CE-CLCNN embeds visually and semantically similar characters close to each other and achieves state-of-the-art results on several open document classification tasks. We report the performance of CE-CLCNN on the Wikipedia title estimation task and analyse its internal behaviour.

Preprint: https://arxiv.org/abs/1810.03595
IEEE Xplore: https://ieeexplore.ieee.org/document/8707407
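
Because CE-CLCNN treats each character as an image, image-domain augmentation such as random erasing [2] can be applied to the character images. The following is a minimal, standard-library-only sketch of the idea, not the implementation used in this repository. The function name `random_erase`, its parameters and default values, and the 32x32 image size are assumptions chosen for illustration.

```python
import math
import random


def random_erase(image, p=0.5, area_range=(0.02, 0.4), min_aspect=0.3,
                 rng=None, max_attempts=100):
    """Return a copy of `image` with one random rectangle replaced by noise.

    `image` is a list of rows of floats in [0, 1]. All parameter names and
    defaults here are illustrative assumptions, not the repository's settings.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    erased = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= p:
        return erased  # skip erasing with probability 1 - p
    for _ in range(max_attempts):
        # Sample the area and aspect ratio (height / width) of the rectangle.
        area = rng.uniform(*area_range) * height * width
        aspect = rng.uniform(min_aspect, 1.0 / min_aspect)
        h = int(round(math.sqrt(area * aspect)))
        w = int(round(math.sqrt(area / aspect)))
        if 0 < h <= height and 0 < w <= width:
            # Choose a position where the rectangle fits inside the image.
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            for y in range(top, top + h):
                for x in range(left, left + w):
                    erased[y][x] = rng.random()  # fill with uniform noise
            return erased
    return erased  # no rectangle fit within max_attempts; copy is unchanged


# Usage: a blank 32x32 "glyph" stands in for a rendered character image.
glyph = [[0.0] * 32 for _ in range(32)]
augmented = random_erase(glyph, p=1.0, rng=random.Random(0))
changed = sum(a != b for row_a, row_b in zip(glyph, augmented)
              for a, b in zip(row_a, row_b))
print(f"{changed} of {32 * 32} pixels were overwritten")
```

Passing a seeded `random.Random` makes the augmentation reproducible, which is convenient when comparing runs with and without erasing.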

We recommend that you also check out the following studies related to ours.

Daif et al. "AraDIC: Arabic Document Classification Using Image-Based Character Embeddings and Class-Balanced Loss." Proceedings of ACL SRW. 2020. Code: https://github.com/mahmouddaif/AraDIC
Aoki et al. "Text Classification through Glyph-aware Disentangled Character Embedding and Semantic Sub-character Augmentation." Proceedings of AACL-IJCNLP SRW. 2020. Code: https://github.com/IyatomiLab/GDCE-SSA

Install and Run the code

Install the requirements

pip install -U pip poetry setuptools
poetry install

# If you want to use CUDA 11+, try the following command:
poetry run poe force-cuda11

Train the model

Our CE-CLCNN models

bash scripts/run_exps/run_ceclcnn.sh

# or dry-run the scripts
# DRY_RUN=1 bash scripts/run_exps/run_ceclcnn.sh

Liu et al. [ACL'17] models

bash scripts/run_exps/run_liu_acl17.sh

# or dry-run the scripts
# DRY_RUN=1 bash scripts/run_exps/run_liu_acl17.sh

Inference with test data using the pre-trained model

Example of inference using the best model for the Japanese Wikipedia title dataset:

CUDA_VISIBLE_DEVICES=0 allennlp predict \
  output/CE-CLCNN/wiki_title/ja/with_RE_and_WT/model.tar.gz \
  https://github.com/frederick0329/Learning-Character-Level/raw/master/data/ja_test.txt \
  --cuda-device 0 \
  --use-dataset-reader \
  --dataset-reader-choice validation \
  --predictor wiki_title \
  --output-file output/CE-CLCNN/wiki_title/ja/with_RE_and_WT/prediction_result.jsonl \
  --silent

To use the pre-trained model, download it first with the following commands and then execute the above command.

mkdir -p output/CE-CLCNN/wiki_title/ja/with_RE_and_WT
wget https://github.com/IyatomiLab/CE-CLCNN/raw/master/pretrained_models/CE-CLCNN/wiki_title/ja/with_RE_and_WT/model.tar.gz -P output/CE-CLCNN/wiki_title/ja/with_RE_and_WT
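
After prediction, it can help to check what the output file contains. The sketch below assumes the file holds one JSON object per line, as the `.jsonl` extension suggests. The helper name `summarize_predictions` is hypothetical, and the sample records use placeholder keys, because the real keys depend on the model and predictor.

```python
import json
from collections import Counter


def summarize_predictions(lines):
    """Count records and top-level keys in an iterable of JSON Lines strings."""
    key_counts = Counter()
    n_records = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        n_records += 1
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return n_records, key_counts


# Placeholder records (made up for this example); real output keys will differ.
sample = [
    '{"example_key": 1, "other_key": [0.25, 0.75]}',
    '',
    '{"example_key": 2}',
]
n_records, key_counts = summarize_predictions(sample)
print(n_records, dict(key_counts))  # prints: 2 {'example_key': 2, 'other_key': 1}
```

To inspect a real result, open the path passed to `--output-file` and hand the file object to the function, for example `with open(path) as f: summarize_predictions(f)`.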

Pre-trained models

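Dataset	Language	Model	Pre-trained Model
Wikipedia Title Dataset [3]	Chinese	CE-CLCNN (proposed)	[Download]
Wikipedia Title Dataset [3]	Chinese	CE-CLCNN w/ RE and WT (proposed)	[Download]
Wikipedia Title Dataset [3]	Chinese	Visual model [3]	[Download]
Wikipedia Title Dataset [3]	Japanese	CE-CLCNN (proposed)	[Download]
Wikipedia Title Dataset [3]	Japanese	CE-CLCNN w/ RE and WT (proposed)	[Download]
Wikipedia Title Dataset [3]	Japanese	Visual model [3]	[Download]
Wikipedia Title Dataset [3]	Korean	CE-CLCNN (proposed)	[Download]
Wikipedia Title Dataset [3]	Korean	CE-CLCNN w/ RE and WT (proposed)	[Download]
Wikipedia Title Dataset [3]	Korean	Visual model [3]	[Download]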

Citation

@inproceedings{kitada2018end,
  title={End-to-end text classification via image-based embedding using character-level networks},
  author={Kitada, Shunsuke and Kotani, Ryunosuke and Iyatomi, Hitoshi},
  booktitle={2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)},
  pages={1--4},
  year={2018},
  organization={IEEE},
  doi={10.1109/AIPR.2018.8707407},
}

References

[1] S. Kitada, R. Kotani, and H. Iyatomi. "End-to-end Text Classification via Image-based Embedding using Character-level Networks." In Proceedings of IEEE Applied Imagery Pattern Recognition (AIPR) Workshop. IEEE, 2018. doi: https://doi.org/10.1109/AIPR.2018.8707407
[2] Z. Zhong et al. "Random erasing data augmentation." In Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 07. 2020. doi: https://doi.org/10.1609/aaai.v34i07.7000
[3] F. Liu et al. "Learning Character-level Compositionality with Visual Features." In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017. doi: http://dx.doi.org/10.18653/v1/P17-1188
