Validation and scoring scripts for Task 1 of the PEGS Challenge. For Task 2, see `writeup-workflow/`.
Metrics returned and used for ranking are:

- Primary: area under the receiver operating characteristic curve (AUROC)
- Secondary (used to break ties): area under the precision-recall curve (AUPRC)
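
For reference, both metrics can be computed with scikit-learn. A minimal sketch, assuming binary gold-standard labels and the submitted probabilities (the example values below are hypothetical, and `average_precision_score` is a common estimator of AUPRC; the actual `score.py` may differ):

```python
from sklearn.metrics import average_precision_score, roc_auc_score

# Hypothetical example values; real inputs come from the predictions
# file and the goldstandard folder.
y_true = [0, 1, 1, 0, 1]            # binary disease labels
y_prob = [0.2, 0.9, 0.6, 0.3, 0.8]  # predicted disease_probability values

auroc = roc_auc_score(y_true, y_prob)            # primary metric
auprc = average_precision_score(y_true, y_prob)  # secondary (tie-breaking) metric
print(f"AUROC={auroc:.4f}, AUPRC={auprc:.4f}")
```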
Validate a predictions file with:

```bash
python validate.py \
  -p PATH/TO/PREDICTIONS_FILE.CSV \
  -g PATH/TO/GOLDSTANDARD_FOLDER [-o RESULTS_FILE]
```
If `-o`/`--output` is not provided, full results will be written to `results.json`.
What it will check for:

- two columns named `id` and `disease_probability` (extraneous columns will be ignored)
- `id` values are strings
- `disease_probability` values are floats between 0.0 and 1.0, and cannot be null/None
- there is one prediction per patient (that is, no missing or duplicate patient IDs)
- there are no extra predictions (that is, no unknown patient IDs)
The script will print either `VALIDATED` or `INVALID` to STDOUT.
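
For illustration, the checks above might look roughly like this in pandas. This is a sketch only: `expected_ids` is a hypothetical set of patient IDs drawn from the goldstandard folder, and the real `validate.py` may be implemented differently.

```python
import pandas as pd

def check_predictions(pred_csv: str, expected_ids: set[str]) -> list[str]:
    """Return a list of validation errors (an empty list means VALIDATED)."""
    errors = []
    df = pd.read_csv(pred_csv, dtype={"id": str})

    # Only the two required columns are inspected; extras are ignored.
    if not {"id", "disease_probability"}.issubset(df.columns):
        return ["missing required column(s): id, disease_probability"]

    # Probabilities must be non-null floats in [0.0, 1.0].
    probs = pd.to_numeric(df["disease_probability"], errors="coerce")
    if probs.isna().any() or not probs.between(0.0, 1.0).all():
        errors.append("disease_probability must be a non-null float in [0.0, 1.0]")

    # Exactly one prediction per known patient: no missing, duplicate,
    # or unknown patient IDs.
    if df["id"].duplicated().any() or set(df["id"]) != expected_ids:
        errors.append("one prediction required per patient; no missing, "
                      "duplicate, or unknown patient IDs")
    return errors
```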
Score a predictions file with:

```bash
python score.py \
  -p PATH/TO/PREDICTIONS_FILE.CSV \
  -g PATH/TO/GOLDSTANDARD_FOLDER [-o RESULTS_FILE]
```
If `-o`/`--output` is not provided, results will be written to `results.json`.
The script will print either `SCORED` or `INVALID` to STDOUT.
When run with Docker, results will be written to `output/results.json` in your current working directory (assuming you mount `$PWD/output`).
To validate with Docker:

```bash
docker run --rm \
  -v /PATH/TO/PREDICTIONS_FILE.CSV:/predictions.csv:ro \
  -v /PATH/TO/GOLDSTANDARD_FOLDER:/goldstandard:ro \
  -v $PWD/output:/output:rw \
  ghcr.io/sage-bionetworks-challenges/pegs-evaluation:latest \
  python3 validate.py \
    -p /predictions.csv -g /goldstandard -o /output/results.json
```
To score with Docker:

```bash
docker run --rm \
  -v /PATH/TO/PREDICTIONS_FILE.CSV:/predictions.csv:ro \
  -v /PATH/TO/GOLDSTANDARD_FOLDER:/goldstandard:ro \
  -v $PWD/output:/output:rw \
  ghcr.io/sage-bionetworks-challenges/pegs-evaluation:latest \
  python3 score.py \
    -p /predictions.csv -g /goldstandard -o /output/results.json
```
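
After either container finishes, the mounted results file can be inspected directly. A minimal sketch (the exact keys inside `results.json` depend on the scripts):

```python
import json

# Read the results file written to the mounted output directory.
with open("output/results.json") as f:
    results = json.load(f)
print(json.dumps(results, indent=2))
```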