IndoNLP

We are researchers who push up the lower bound of the Indonesian NLP standard. We are collaborating to release new data resources and benchmarks.

Pinned

  1. indonlu Public

    The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook 584 199

  2. nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook 267 62

  3. nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook 97 10

  4. indonlg Public

    The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code!…

    Python 71 12

Repositories

Showing 10 of 10 repositories
  • .github Public

    Landing page

    1 0 0 0 Updated Mar 10, 2025
  • indonlg Public

    The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code! (EMNLP 2021)

    Python 71 Apache-2.0 12 1 0 Updated Nov 16, 2024
  • indonlu Public

    The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook 584 Apache-2.0 199 5 1 Updated Nov 16, 2024
  • nusa-writes Public

    NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

    Jupyter Notebook 25 Apache-2.0 2 0 0 Updated Sep 27, 2024
  • cendol Public

    Indonesian T0 | Instruction-tuning for low-resource and extremely low-resource Austronesian languages

    Jupyter Notebook 14 Apache-2.0 1 0 1 Updated Jun 24, 2024
  • nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook 267 Apache-2.0 62 35 (5 issues need help) 2 Updated Jun 2, 2024
  • nusa-catalogue Public

    Dataset Catalogue Homepage for Indonesian Languages

    JavaScript 7 Apache-2.0 8 1 0 Updated Feb 19, 2024
  • nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook 97 Apache-2.0 10 0 0 Updated May 8, 2023
  • nusacrowd-asr Public

    NusaCrowd ASR Experiment

    Jupyter Notebook 2 Apache-2.0 0 0 0 Updated Jan 5, 2023
  • SCSS 1 Apache-2.0 1 0 0 Updated Jun 12, 2022