Detection, Evaluation and Mitigation of Language Artefacts in the Competition On Legal Information Extraction and Entailment Dataset

This repository is still a work in progress. The scripts are not yet finalized.

This repository contains the codebase for thesis work on detecting, evaluating and mitigating language (dataset) artefacts in the Competition on Legal Information Extraction and Entailment (COLIEE) dataset.

The codebase is organized into separate folders containing the Python notebooks used to run the experiments.

data -> Placeholder folder for the datasets needed for the analysis.

src/data scripts -> Scripts for data analysis and data preprocessing.

src/detection -> Scripts for detecting artefacts in the dataset.

src/evaluation -> Scripts for evaluating the robustness of the BERT-based models.

src/mitigation -> Data augmentation scripts for mitigating the contradiction-word and word-overlap artefacts.
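As a rough illustration of the two artefacts named above, the sketch below computes a word-overlap score between a premise and a hypothesis and flags contradiction (negation) words in the hypothesis. It is not taken from the repository: the function names, the tokenizer, and the `NEGATION_WORDS` list are all assumptions made for this example, and the thesis may define both artefacts differently.

```python
import re

# Hypothetical cue-word list (an assumption for this sketch, not the thesis's list).
NEGATION_WORDS = {"not", "no", "never", "without", "cannot", "unless"}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_overlap(premise, hypothesis):
    """Fraction of distinct hypothesis tokens that also occur in the premise."""
    hyp_tokens = set(tokenize(hypothesis))
    if not hyp_tokens:
        return 0.0
    return len(hyp_tokens & set(tokenize(premise))) / len(hyp_tokens)


def has_negation(hypothesis):
    """True if the hypothesis contains any of the assumed negation cue words."""
    return any(token in NEGATION_WORDS for token in tokenize(hypothesis))


premise = "The seller may rescind the contract."
hypothesis = "The seller may not rescind the contract."
print(word_overlap(premise, hypothesis), has_negation(hypothesis))
```

A pair with high overlap and a negation cue, like the one above, is the kind of example where a model could predict the label from surface features alone. The simple tokenizer ignores contractions such as "don't", so a real analysis would likely need a proper tokenizer.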

Repository: VenkateshDas/nli_artefacts_detection