This project has moved to the OdiaNLP GitHub organization. For more details, please visit: https://odianlp.github.io/
Approximate number of reviewed En-Or (English-Odia) parallel pairs: 42,000
This is a healthy start toward building automated machine translation from English to the Odia language.
It was built mainly to help increase the number of quality Odia Wikipedia articles by translating from English Wikipedia.
The approach is to first build a parallel corpus between English and Odia, which interested people can later use for SMT (Statistical Machine Translation) or NMT (Neural Machine Translation).
A dump of around 9,000 un-reviewed raw English-Odia parallel pairs is available in this file as pipe-separated phrases or sentences.
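A pipe-separated dump like this can be read with a few lines of Python. The following is a minimal sketch, not part of the project's tooling: the function name `parse_pairs` is hypothetical, and it assumes each line holds the English text, a single `|`, and then the Odia text.

```python
from typing import Iterable, Iterator, Tuple


def parse_pairs(lines: Iterable[str], sep: str = "|") -> Iterator[Tuple[str, str]]:
    """Yield (english, odia) tuples from pipe-separated lines.

    Assumption: English comes first, Odia second, split on the first separator.
    """
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank lines and lines without a separator
        english, odia = line.split(sep, 1)
        english, odia = english.strip(), odia.strip()
        if english and odia:
            yield english, odia


# Illustrative in-memory sample; a real run would iterate over an opened file,
# e.g. open(path, encoding="utf-8"), where `path` points at the dump.
sample = ["Hello | ନମସ୍କାର", "", "Thank you | ଧନ୍ୟବାଦ"]
pairs = list(parse_pairs(sample))
for english, odia in pairs:
    print(f"{english} -> {odia}")
```

Splitting on the first separator only keeps any later `|` characters inside the Odia side, which is one possible choice; a stricter reader could reject such lines instead.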
For more details, visit the website of this repository: MTEnglish2Odia
- Click here to read a beginner's guide on how to contribute to a GitHub open source project.
- You can send English-Odia word, phrase, or sentence pairs in the format below, in a new file named after you and the type of data.
- Please put the file under Individual_files.
- For example, if your name is Satyabrata and you want to upload generic phrases:
Key | Example |
---|---|
Filename | satyabrata.txt |
File upload path | data/Individual_files/satyabrata.txt |
File text format | `Why are you so lazy? |
Please make sure you have the correct permissions to upload this data under the GPL license.
- Tutorials on how to fork a repository and send a PR can be found in this video or this video, or in this GitHub doc tutorial for forks and this one for pull requests.
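Before opening a pull request, it can help to check a contribution file for malformed lines. The sketch below is only an illustration and not an official check used by the reviewers: the helper names `is_odia` and `find_problems` are hypothetical, and it assumes one `English | Odia` pair per line, with the Odia side containing characters from the Odia Unicode block (U+0B00 to U+0B7F).

```python
from typing import Iterable, List, Tuple


def is_odia(text: str) -> bool:
    """Return True if text has at least one character in the Odia Unicode block."""
    return any("\u0b00" <= ch <= "\u0b7f" for ch in text)


def find_problems(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, message) for every line that looks malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored
        parts = [part.strip() for part in stripped.split("|")]
        if len(parts) != 2:
            problems.append((number, "expected exactly one '|' separator"))
        elif not parts[0] or not parts[1]:
            problems.append((number, "empty English or Odia side"))
        elif not is_odia(parts[1]):
            problems.append((number, "right-hand side has no Odia characters"))
    return problems


# Illustrative sample; a real run would read the contributed file, for example
# data/Individual_files/satyabrata.txt, opened with encoding="utf-8".
sample = ["Hello | ନମସ୍କାର", "Hello", "Hello | world", " | ନମସ୍କାର"]
report = find_problems(sample)
for number, message in report:
    print(f"line {number}: {message}")
```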
- Your Pull Request will be reviewed first.
- Please follow up if any comments or modifications are needed on your Pull Request.
- In case of any confusion, please contact [email protected]. You will get a response within a day or two.
GPL v3.0
This project was created to translate from English to the Odia language using machine translation. It was set up mainly to increase the number of quality pages on Odia Wikipedia. Under the current plan, parallel English-Odia translation data will be collected first. Once enough data has been collected, it will be used first with statistical machine translation and later with neural machine translation to measure translation accuracy. After acceptable accuracy is reached, it will be dedicated to the general public.
Thanks go to these wonderful people (emoji key):
subhadarship 💻 🎨 🤔 |
kamakshyaP 🖋 |
Soumendra kumar sahoo 🤔 🎨 📖 💻 🖋 |
This project follows the all-contributors specification. Contributions of any kind welcome!