The official repository for the tutorial "Linguistically Motivated Neural Machine Translation", presented at the 25th Annual Conference of the European Association for Machine Translation (EAMT 2024).
Part | Slides
---|---
Part 1: Introduction | slide
Part 2: Linguistic Features and Encoder | slide
Part 3: Subword, Decoder, Evaluation | slide
Haiyue Song is a technical researcher at the Advanced Translation Technology Laboratory, National Institute of Information and Communications Technology (NICT), Japan. He obtained his Ph.D. from Kyoto University. His research interests include machine translation, large language models, subword segmentation, and decoding algorithms. He has MT- and LLM-related publications in TALLIP, AACL, LREC, ACL, and EMNLP.

Hour Kaing is a researcher at the Advanced Translation Technology Laboratory, National Institute of Information and Communications Technology (NICT), Japan. He received his B.S. from the Institute of Technology of Cambodia, Cambodia, his M.Sc. from the University of Grenoble 1, France, and his Ph.D. from the Nara Institute of Science and Technology, Japan. He is interested in linguistic analysis, low-resource machine translation, language modeling, and speech processing. He has publications in TALLIP, EACL, PACLIC, LREC, and IWSLT.

Raj Dabre is a senior researcher at the Advanced Translation Technology Laboratory, National Institute of Information and Communications Technology (NICT), Japan, and an adjunct faculty member at IIT Madras, India. He received his Ph.D. from Kyoto University and his Master's from IIT Bombay. His primary interests are low-resource NLP, language modeling, and efficiency. He has published in ACL, EMNLP, NAACL, TMLR, AAAI, AACL, IJCNLP, and CSUR.
Time | Session
---|---
9:00-9:20 | Introduction |
9:20-10:20 | Augmenting NMT Architectures with Linguistic Features |
10:20-10:50 | Coffee break |
10:50-11:20 | Linguistically Motivated Tokenization and Transfer Learning |
11:20-11:40 | Linguistically Aware Decoding |
11:40-12:00 | Linguistically Motivated Evaluation |
12:00-12:15 | Conclusions |
12:15-12:30 | Q/A |
The tutorial focuses on incorporating linguistic knowledge into different stages of the neural machine translation (NMT) pipeline, from pre-processing to model training to evaluation.
Purely data-driven approaches have dominated machine translation (MT) in recent years, while approaches that draw on linguistic knowledge are often neglected. This tutorial highlights the importance of such knowledge, especially for low-resource languages where training data is limited.
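To give a concrete flavour of how linguistic knowledge can enter the model itself, the minimal PyTorch sketch below (our own illustration, not code from the tutorial slides) shows the linguistic input features idea of Sennrich and Haddow (2016): each source token is paired with a linguistic factor such as a POS tag, and the factor embedding is concatenated to the word embedding before entering the encoder. All dimensions, vocabulary sizes, and example indices are illustrative assumptions.

```python
# Minimal sketch of factored (word + POS) input embeddings,
# in the spirit of Sennrich and Haddow (2016). Illustrative only.
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    def __init__(self, vocab_size, tag_size, word_dim=480, tag_dim=32):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)  # token embeddings
        self.tag_emb = nn.Embedding(tag_size, tag_dim)      # POS-tag embeddings

    def forward(self, word_ids, tag_ids):
        # word_ids, tag_ids: (batch, seq_len) index tensors, aligned per token.
        # Concatenate the two embeddings along the feature dimension.
        return torch.cat([self.word_emb(word_ids), self.tag_emb(tag_ids)], dim=-1)

emb = FactoredEmbedding(vocab_size=32000, tag_size=17)
words = torch.tensor([[5, 42, 7]])   # e.g. "the cat sleeps" (toy indices)
tags = torch.tensor([[3, 1, 2]])     # e.g. DET NOUN VERB (toy indices)
print(emb(words, tags).shape)        # torch.Size([1, 3, 512])
```

Note that the encoder and decoder are untouched; only the embedding layer grows, which makes this style of feature injection cheap to add to an existing NMT architecture.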
- Introduction to Neural Machine Translation
- Linguistically Motivated Tokenization and Transfer Learning (a toy segmentation sketch follows this outline)
- Augmenting NMT Architectures with Linguistic Features
- Linguistically Aware Decoding
- Linguistically Motivated Evaluation
- Limitations and Future Directions
- Summary and Conclusion
- Discussion and Q/A
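To make the tokenization item concrete, here is a toy, self-contained sketch (ours; the `MORPHEMES` lexicon and function name are invented for illustration) of greedy longest-match segmentation over a morpheme lexicon, the kind of linguistically motivated alternative to frequency-based BPE surveyed in the reading list below (e.g., Ataman et al., 2017; Dhar et al., 2020):

```python
# Toy illustration of morpheme-aware segmentation: a greedy longest-match
# over a small morpheme lexicon keeps morpheme boundaries intact, where
# frequency-based BPE merges may cross them. Lexicon is a made-up example.
MORPHEMES = {"un", "translat", "e", "able", "ly", "read"}

def morph_segment(word, lexicon=MORPHEMES):
    """Greedy longest-match segmentation against a morpheme lexicon."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in lexicon:
                pieces.append(word[i:j])
                i = j
                break
        else:                               # no match: back off to one character
            pieces.append(word[i])
            i += 1
    return pieces

print(morph_segment("untranslatable"))      # ['un', 'translat', 'able']
```

A real system would use a morphological analyzer (e.g., Juman++, cited below) or a learned segmenter rather than a hand-written lexicon; the point is only that segment boundaries follow morphology instead of corpus statistics.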
- Neural Machine Translation: Basics, Practical Aspects and Recent Trends - Dabre et al., 2017
- Attention is All you Need - Vaswani et al., 2017
- Neural Machine Translation by Jointly Learning to Align and Translate - Bahdanau et al., 2016
- Juman++: A Morphological Analysis Toolkit for Scriptio Continua - Tolmachev et al., 2018
- Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation - He et al., 2020
- BERTSeg: BERT Based Unsupervised Subword Segmentation for Neural Machine Translation - Song et al., 2022
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology - Batsuren et al., 2021
- Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English - Ataman et al., 2017
- Linguistically Motivated Subwords for English-Tamil Translation: University of Groningen’s Submission to WMT-2020 - Dhar et al., 2020
- Neural Machine Translation of Logographic Languages Using Sub-character Level Information - Zhang and Komachi, 2018
- On Romanization for Model Transfer Between Scripts in Neural Machine Translation - Amrhein and Sennrich, 2020
- RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization - Husain et al., 2024
- CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages - Maurya et al., 2024
- SelectNoise: Unsupervised Noise Injection to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages - Brahma et al., 2023
- Pre-training via Leveraging Assisting Languages for Neural Machine Translation - Song et al., 2020
- IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages - Gala et al., 2023
- IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages - Dabre et al., 2021
- Linguistic Input Features Improve Neural Machine Translation - Sennrich and Haddow, 2016
- FeatureBART: Feature Based Sequence-to-Sequence Pre-Training for Low-Resource NMT - Chakrabarty et al., 2022
- Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation - Chakrabarty et al., 2020
- Low-Resource Multilingual Neural Translation Using Linguistic Feature Based Relevance Mechanisms - Chakrabarty et al., 2023
- Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning - Niehues and Cho, 2017
- Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations - Zhang et al., 2019
- Dependency-to-Dependency Neural Machine Translation - Wu et al., 2018
- Multi-Source Syntactic Neural Machine Translation - Currey and Heafield, 2018
- Incorporating Source Syntax into Transformer-Based Neural Machine Translation - Currey and Heafield, 2019
- Enhancing Machine Translation with Dependency-Aware Self-Attention - Bugliarello and Okazaki, 2020
- Passing Parser Uncertainty to the Transformer: Labeled Dependency Distributions for Neural Machine Translation - Pu and Sima'an, 2022
- Modeling Source Syntax for Neural Machine Translation - Li et al., 2017
- Tree-to-Sequence Attentional Neural Machine Translation - Eriguchi et al., 2016
- Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder - Chen et al., 2017
- Learning to Parse and Translate Improves Neural Machine Translation - Eriguchi et al., 2017
- Sequence-to-Dependency Neural Machine Translation - Wu et al., 2017
- A Tree-based Decoder for Neural Machine Translation - Wang et al., 2018
- Towards String-To-Tree Neural Machine Translation - Aharoni and Goldberg, 2017
- Predicting Target Language CCG Supertags Improves Neural Machine Translation - Nădejde et al., 2017
- Improving Neural Machine Translation with Soft Template Prediction - Yang et al., 2020
- Explicit Syntactic Guidance for Neural Text Generation - Li et al., 2023
- Linguistic Evaluation for the 2021 State-of-the-art Machine Translation Systems for German to English and English to German - Macketanz et al., 2021
- Linguistically Motivated Evaluation of Machine Translation Metrics Based on a Challenge Set - Avramidis and Macketanz, 2022
- Linguistically Motivated Evaluation of the 2023 State-of-the-art Machine Translation: Can ChatGPT Outperform NMT? - Manakhimova et al., 2023
Haiyue Song, Hour Kaing, Raj Dabre
National Institute of Information and Communications Technology (NICT)
Hikaridai 3-5, Seika-cho, Soraku-gun, Kyoto, Japan
Emails:
- Haiyue Song: [email protected]
- Hour Kaing: [email protected]
- Raj Dabre: [email protected]
@inproceedings{linguistic-mt24,
  title={Linguistically Motivated Neural Machine Translation},
  author={Song, Haiyue and Kaing, Hour and Dabre, Raj},
  booktitle={The 25th Annual Conference of the European Association for Machine Translation (EAMT 2024)},
  year={2024}
}