| Metric | Score |
|---|---|
| Corpus BLEU | 37.0347 |
| Dev ppl | 61.4084 |
- You can find the model weights here
- The translation demo is available here on Streamlit Sharing.
- Even if you don't know Spanish, you can still use the demo: it links to Google Translate, which will convert your English sentences to Spanish for you to feed into the model.
- It is a Seq2Seq Model that translates Spanish sentences into English based on Luong et al. 2015.
- It consists of a bidirectional LSTM encoder and unidirectional LSTM decoder.
- It also uses an attention mechanism to boost its performance on the translation task.
- The pipeline and the implementation are inspired by the OpenNMT package.
- The model becomes more powerful as we combine character-level with word-level language modelling.
- The idea is that whenever the NMT model generates an `<unk>` token, we run a character-level language model and generate a word in the output character by character.
- This hybrid word-character approach was proposed by Luong and Manning 2016 and turned out to be effective in increasing the performance of the NMT model (+1.2 BLEU).
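
To make the attention step concrete, here is a minimal sketch in plain Python. It assumes the "general" (multiplicative) scoring function from Luong et al. 2015; which scoring variant this repository actually uses is not stated above. The function names, the projection matrix `W`, and the toy vectors are illustrative assumptions, not code from this project.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def luong_general_attention(dec_hidden, enc_hiddens, W):
    """Assumed 'general' score: score(h_t, h_s) = h_t . (W h_s).

    dec_hidden:  decoder state h_t, length d
    enc_hiddens: encoder states h_s, each of length 2d (bidirectional)
    W:           d x 2d projection (hypothetical learned parameter)
    Returns the attention weights and the context vector.
    """
    projected = [matvec(W, h_s) for h_s in enc_hiddens]
    scores = [dot(dec_hidden, p) for p in projected]
    weights = softmax(scores)
    dim = len(enc_hiddens[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, enc_hiddens))
               for i in range(dim)]
    return weights, context

# Toy example: three source positions, decoder size 2, encoder size 4.
enc_hiddens = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
dec_hidden = [0.0, 2.0]

weights, context = luong_general_attention(dec_hidden, enc_hiddens, W)
```

In this toy setup the second source position gets the largest weight, because its projected state lines up with the decoder state. In a real model the context vector would then be combined with the decoder state before predicting the next word.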
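
The hybrid fallback described above can be sketched as follows. This is only an illustration of the control flow, assuming greedy character decoding; `fill_unknowns`, `greedy_char_decode`, `toy_step`, and the `{` / `}` start and end markers are hypothetical names chosen for this example, and the toy step function stands in for a trained character-level decoder.

```python
UNK = "<unk>"
START, END = "{", "}"  # assumed start/end-of-word markers

def greedy_char_decode(state, step, max_len=21):
    """Generate one word character by character.

    `step(state, prev_char)` returns (next_char, new_state); in a real
    model it would be one step of a character-level decoder.
    """
    chars = []
    prev = START
    for _ in range(max_len):
        prev, state = step(state, prev)
        if prev == END:
            break
        chars.append(prev)
    return "".join(chars)

def fill_unknowns(words, states, char_decode):
    """Replace each <unk> with a word produced by the character decoder.

    `states[i]` is whatever the word-level decoder hands over for
    position i (here: a toy value).
    """
    return [char_decode(state) if word == UNK else word
            for word, state in zip(words, states)]

def toy_step(state, prev_char):
    """Stand-in for a trained model: spells out the string in `state`."""
    if not state:
        return END, state
    return state[0], state[1:]

words = ["I", "live", "in", UNK]
states = [None, None, None, "Zaragoza"]
result = fill_unknowns(words, states,
                       lambda s: greedy_char_decode(s, toy_step))
```

Only positions where the word-level decoder emitted `<unk>` trigger the character decoder, so the extra cost is paid only for rare or unseen words.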
Install from source:
git clone https://github.com/sahilkhose/Neural_Machine_Translation
cd Neural_Machine_Translation
pip3 install -r requirements.txt
To run the translation demo:
streamlit run stream_translate.py
Or just go here on Streamlit Sharing.
If you find a bug, create a GitHub issue, or even better, submit a pull request. Similarly, if you have questions, simply post them as GitHub issues.