A repository to reproduce the experiments from the paper "Towards Computationally Feasible Deep Active Learning".
Install the library:
pip install -e .
The configs folder contains config files with general settings. The experiments folder contains config files with experimental design. To run an experiment with a chosen configuration, specify the config file name in the HYDRA_CONFIG_NAME variable and run the train.sh script (see ./examples/al for details).
For example, to launch PLASM on AG-News with ELECTRA as a successor model:
cd PATH_TO_THIS_REPO
HYDRA_CONFIG_PATH=../experiments/ag_news HYDRA_EXP_CONFIG_NAME=ag_plasm python active_learning/run_tasks_on_multiple_gpus.py
- cuda_devices: list of CUDA devices to use, with one experiment per CUDA device. cuda_devices=[0,1] means using devices 0 and 1.
- config_name: name of the config from the configs folder with general settings: dataset, experiment setting (e.g. LC/ASM/PLASM), model checkpoints, hyperparameters, etc.
- config_path: path to the config with general settings.
- command: .py file to run. For AL experiments, use run_active_learning.py.
- args: arguments to modify from the general config in the current experiment. acquisition_model.name=xlnet-base-cased means that xlnet-base-cased will be used as the acquisition model.
- seeds: random seeds to use. seeds=[4837, 23419] means that two separate experiments with the same settings (except for the seed) will be run: one with seed == 4837 and one with seed == 23419.
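To make the interaction between seeds and cuda_devices concrete, here is a minimal Python sketch of how a list of seeds could be spread over the available devices, one experiment per device at a time. The helper name plan_runs and the round-robin assignment are assumptions made for illustration; the repository's launcher may schedule runs differently.

```python
from itertools import cycle


def plan_runs(seeds, cuda_devices):
    """Pair each seed with a CUDA device, cycling through the devices.

    Hypothetical helper: round-robin assignment is an assumption for
    illustration, not necessarily how the repository schedules runs.
    """
    return [
        {"seed": seed, "cuda_device": device}
        for seed, device in zip(seeds, cycle(cuda_devices))
    ]


# Two seeds on two devices give two separate experiments.
for run in plan_runs(seeds=[4837, 23419], cuda_devices=[0, 1]):
    print(run)
```

With more seeds than devices, this sketch reuses devices in order, so a third seed would be assigned to device 0 again.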
By default, the results are saved in the folder RUN_DIRECTORY/workdir_run_active_learning/DATE_OF_RUN/${TIME_OF_RUN}_${SEED}_${MODEL_CHECKPOINT}. For instance, when launching from the repository folder: al_nlp_feasible/workdir/run_active_learning/2022-06-11/15-59-31_23419_distilbert_base_uncased_bert_base_uncased.
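When collecting results from several runs, it can help to recover the date, time, seed, and model checkpoint from a run folder name. The sketch below is a hypothetical helper (parse_run_dir is not part of the repository) and assumes the folder name follows the TIME_SEED_MODEL pattern described above, with the date as the parent folder.

```python
from pathlib import PurePosixPath


def parse_run_dir(path):
    """Split a run folder path into date, time, seed, and checkpoint name.

    Assumes the last path component is TIME_SEED_MODEL and its parent
    folder is the run date (hypothetical helper for illustration).
    """
    run_dir = PurePosixPath(path)
    # Split on the first two underscores only: the model part may contain more.
    time_of_run, seed, model_checkpoint = run_dir.name.split("_", 2)
    return {
        "date": run_dir.parent.name,
        "time": time_of_run,
        "seed": int(seed),
        "model_checkpoint": model_checkpoint,
    }


info = parse_run_dir(
    "al_nlp_feasible/workdir/run_active_learning/2022-06-11/"
    "15-59-31_23419_distilbert_base_uncased_bert_base_uncased"
)
print(info["seed"], info["model_checkpoint"])
```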
- When running a classic AL experiment (acquisition and successor models coincide, regardless of using UPS), the file with the model metrics is acquisition_metrics.json.
- When running an acquisition-successor mismatch experiment, the file with the model metrics is successor_metrics.json.
- When running a PLASM experiment, the file with the model metrics is target_tracin_quantile_-1.0_metrics.json (-1.0 stands for the filtering value, meaning an adaptive filtering rate; when using a deterministic filtering rate, e.g. 0.1, the file will be named target_tracin_quantile_0.1_metrics.json). The file with the metrics of the model without filtering is target_metrics.json.
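The mapping from experiment type to metrics file can be summarized in a few lines of Python. This is only a sketch: the function name metrics_filename and the labels "classic", "mismatch", and "plasm" are hypothetical and do not correspond to identifiers in the repository.

```python
def metrics_filename(experiment, filtering_rate=-1.0):
    """Return the metrics file name for a given experiment type.

    The experiment labels are hypothetical; filtering_rate=-1.0 denotes
    the adaptive filtering rate, as in the file names listed above.
    """
    if experiment == "classic":
        return "acquisition_metrics.json"
    if experiment == "mismatch":
        return "successor_metrics.json"
    if experiment == "plasm":
        return f"target_tracin_quantile_{filtering_rate}_metrics.json"
    raise ValueError(f"Unknown experiment type: {experiment!r}")


print(metrics_filename("plasm"))
print(metrics_filename("plasm", filtering_rate=0.1))
```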
The research used two NER datasets (CoNLL-2003, OntoNotes-2012) and two text classification (CLS) datasets (AG-News, IMDB). To launch an experiment on a custom dataset, add it in one of the following ways:
- Upload to Hugging Face datasets and set:
config.data.path=datasets, config.data.dataset_name=DATASET_NAME, config.data.text_name=COLUMN_WITH_TEXT_OR_TOKENS_NAME, config.data.label_name=COLUMN_WITH_LABELS_OR_NER_TAGS_NAME
- Upload to the data/DATASET_NAME folder, create a train.csv / train.json file with the dataset, and set:
config.data.path=PATH_TO_THIS_REPO/data, config.data.dataset_name=DATASET_NAME, config.data.text_name=COLUMN_WITH_TEXT_OR_TOKENS_NAME, config.data.label_name=COLUMN_WITH_LABELS_OR_NER_TAGS_NAME
- * Upload train.txt, dev.txt, and test.txt files to data/DATASET_NAME and set the arguments as in the previous point.
- ** Upload to data/DATASET_NAME one folder per class, where each file in a folder contains a text whose label is the folder name. For details, please see the bbc_news dataset in ./data. The arguments must be set as in the previous two points.
* - only for NER datasets
** - only for CLS datasets
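To illustrate the folder-per-class layout for CLS datasets, the following Python sketch reads such a directory into (text, label) pairs. It is not the repository's loader: read_folder_per_class is a hypothetical helper, and it assumes every file inside a class folder is a UTF-8 text file.

```python
from pathlib import Path


def read_folder_per_class(dataset_dir):
    """Read a folder-per-class dataset into a list of (text, label) pairs.

    Hypothetical helper: each subfolder name is taken as the label of
    every file it contains; files are assumed to be UTF-8 text.
    """
    examples = []
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        for text_file in sorted(class_dir.iterdir()):
            if text_file.is_file():
                text = text_file.read_text(encoding="utf-8")
                examples.append((text, class_dir.name))
    return examples
```

For example, calling read_folder_per_class("data/DATASET_NAME") on a directory with subfolders business and sport would label every text under business as "business" and every text under sport as "sport".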
The current version of the repository supports all models from HuggingFace Transformers that can be used with the AutoModelForSequenceClassification / AutoModelForTokenClassification classes (for CLS / NER). For CNN-based / BiLSTM-CRF models, please see the al_cls_cnn.yaml / al_ner_bilstm_crf.yaml configs in the ./configs folder for details.
@inproceedings{tsvigun-etal-2022-plasm,
title = "Towards Computationally Feasible Deep Active Learning",
author = "Tsvigun, Akim and
Shelmanov, Artem and
Kuzmin, Gleb and
Sanochkin, Leonid and
Larionov, Daniil and
Gusev, Gleb and
Avetisian, Manvel and
Zhukov, Leonid",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
month = jul,
year = "2022",
address = "Seattle, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-naacl.90",
pages = "1198--1218",
}
© 2022 Autonomous Non-Profit Organization "Artificial Intelligence Research Institute" (AIRI). All rights reserved.
Licensed under the GNU GPLv3 License.