- TensorFlow >= 1.3.0
- tqdm >= 4.14.0
- python-Levenshtein >= 0.12.0
- setproctitle >= 1.1.10
- seaborn >= 0.7.1
- Phone (39, 48, 61 phones)
- Character
- Phone (under implementation)
- Character
- Word
- Phone (under implementation)
- Japanese kana characters (about 150 classes)
- Japanese kanji characters (about 3000 classes)
These corpora will be added in the future.
- Switchboard
- WSJ
- AMI
This repository does not include pre-processing code. Pre-processing is based on this repo, so please refer to it if you want to run pre-processing yourself.
- BLSTM
- LSTM
- BGRU
- GRU
- VGG-BLSTM
- VGG-LSTM
- Multi-task BLSTM
- An additional CTC layer can be attached to an arbitrary layer.
- Multi-task LSTM
- VGG
Connectionist Temporal Classification (CTC) [Graves+ 2006]
- Greedy decoder
- Beam search decoder
- Beam search decoder w/ CharLM (under implementation)
- Frame-stacking [Sak+ 2015]
- Multi-GPU training (synchronous)
- Splicing
- Down sampling (under implementation)
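
The greedy CTC decoder listed above can be illustrated with a minimal pure-Python sketch of best-path decoding: take the most probable class in each frame, merge consecutive repeats, and drop blanks. This is not the repository's implementation. The function name `ctc_greedy_decode`, the toy posteriors, and the choice of the last index as the blank label are assumptions made for illustration.

```python
def ctc_greedy_decode(frame_posteriors, blank_index):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded = []
    previous = None
    for frame in frame_posteriors:
        # Pick the most probable class for this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit a label only if it differs from the previous frame's label
        # and is not the blank symbol.
        if best != previous and best != blank_index:
            decoded.append(best)
        previous = best
    return decoded


# Toy example with 3 classes; index 2 is assumed to be the blank label.
posteriors = [
    [0.7, 0.2, 0.1],  # class 0
    [0.6, 0.3, 0.1],  # class 0 again (merged with the previous frame)
    [0.1, 0.1, 0.8],  # blank
    [0.8, 0.1, 0.1],  # class 0 (a new emission, because a blank intervened)
    [0.2, 0.7, 0.1],  # class 1
]
print(ctc_greedy_decode(posteriors, blank_index=2))  # [0, 0, 1]
```

The blank between the second and fourth frames is what allows the same label to be emitted twice. Without it, the repeated `0` would be merged into a single output.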
- Greedy decoder
- Beam search decoder (under implementation)
- Bahdanau's content-based attention
- Bahdanau's normed content-based attention (under implementation)
- Location-based attention
- Hybrid attention
- Luong's dot attention
- Luong's scaled dot attention (under implementation)
- Luong's general attention
- Luong's concat attention
- Baidu's attention (under implementation)
- Sharpening
- Temperature regularization in the softmax layer (Output posteriors)
- Joint CTC-Attention [Kim 2016]
- Coverage (under implementation)
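
As a rough illustration of two items in the list above, the following pure-Python sketch computes Luong-style dot attention and applies sharpening by scaling the scores before the softmax. It is a sketch under stated assumptions, not the repository's code: the names `dot_attention` and `sharpening_factor` are hypothetical, and sharpening is modeled here as a simple multiplicative factor (an inverse temperature) on the attention scores.

```python
import math


def softmax(scores, scale=1.0):
    """Numerically stable softmax over scores multiplied by `scale`."""
    scaled = [s * scale for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(decoder_state, encoder_states, sharpening_factor=1.0):
    """Dot attention: the score for step t is decoder_state . encoder_states[t].

    `sharpening_factor` is a hypothetical parameter; values above 1 make the
    attention distribution more peaked.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc)) for enc in encoder_states
    ]
    weights = softmax(scores, scale=sharpening_factor)
    # The context vector is the attention-weighted sum of the encoder states.
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy inputs: three encoder time steps with 2-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.5]

plain_weights, plain_context = dot_attention(decoder_state, encoder_states)
sharp_weights, sharp_context = dot_attention(
    decoder_state, encoder_states, sharpening_factor=3.0
)
print(plain_weights)
print(sharp_weights)
```

With a factor above 1, more of the probability mass moves to the highest-scoring encoder step. A factor below 1 flattens the distribution instead, which is the same mechanism a softmax temperature uses.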
Please refer to the docs for each corpus.
- TIMIT
- LibriSpeech
- CSJ
MIT