Commit 9a25294

Add example plugin & queries (#6)
1 parent 0571474 commit 9a25294

15 files changed, +296 -11 lines changed

.github/workflows/testing.yml

Lines changed: 14 additions & 0 deletions
@@ -73,6 +73,20 @@ jobs:
       - name: "run tests"
         run: tox -e py38-nimare

+  run_test_plugin:
+    name: "Test plugin"
+    runs-on: "ubuntu-latest"
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v2
+        with:
+          python-version: "3.10"
+        name: "setup python"
+      - name: "install tox"
+        run: pip install tox
+      - name: "run tests"
+        run: make test_plugin
+
   build_doc:
     name: "build documentation"
     runs-on: "ubuntu-latest"

Makefile

Lines changed: 8 additions & 4 deletions
@@ -1,18 +1,18 @@
-.PHONY: test_all test test_coverage test_coverage_strict test_mypy \
+.PHONY: test_all test test_plugin test_coverage test_coverage_strict test_mypy \
 	test_flake8 test_pylint run_full_pipeline run_full_pipeline_neurosynth \
 	doc black clean clean_all

-test_all: test_mypy test_flake8 test_coverage_strict test test_pylint
+test_all: test_mypy test_flake8 test_coverage_strict test test_plugin test_pylint

 test:
 	tox

 test_coverage_strict:
-	pytest --cov=nqdc --cov-report=xml --cov-report=term --cov-fail-under=100
+	pytest --cov=nqdc --cov-report=xml --cov-report=term --cov-fail-under=100 tests
 	coverage html

 test_coverage:
-	pytest --cov=nqdc --cov-report=xml --cov-report=term
+	pytest --cov=nqdc --cov-report=xml --cov-report=term tests
 	coverage html

 test_mypy:
@@ -25,6 +25,10 @@ test_flake8:
 test_pylint:
 	pylint ./src

+test_plugin:
+	tox -e run_plugin
+	tox -c docs/example_plugin/tox.ini
+
 run_full_pipeline:
 	python tests/run_full_pipeline.py -o /tmp/

README.md

Lines changed: 8 additions & 3 deletions
@@ -75,9 +75,11 @@ articles. It can be simple such as `fMRI`, or more specific such as
 `fMRI[Abstract] AND (2000[PubDate] : 2022[PubDate])`. You can build the
 query using the [PMC advanced search
 interface](https://www.ncbi.nlm.nih.gov/pmc/advanced). For more information see
-[the E-Utilities help](https://www.ncbi.nlm.nih.gov/books/NBK3837/). The query
-can be passed either as a string on the command-line or by passing the path of a
-text file containing the query.
+[the E-Utilities help](https://www.ncbi.nlm.nih.gov/books/NBK3837/).
+Some examples are provided in the `nqdc` git repository, in `docs/example_queries`.
+
+The query can be passed either as a string on the command-line or by passing the
+path of a text file containing the query.

 If we have an Entrez API key (see details in the [E-utilities
 documentation](https://www.ncbi.nlm.nih.gov/books/NBK25497/)), we can provide it
@@ -665,6 +667,9 @@ All steps in `pipeline_steps` will be run when `nqdc run` is used. All steps in
 `name` of a standalone step is `my_plugin`, the `nqdc my_plugin` command will
 become available.

+An example plugin that can be used as a template, and more details, are provided
+in the `nqdc` git repository, in `docs/example_plugin`.
+
 # Contributing

 Feedback and contributions are welcome. Development happens at the

docs/example_plugin/README.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Example nqdc plugin
+
+This is a plugin that does not do much (it plots the number of downloaded
+articles per publication year) to illustrate how to write a plugin and make it
+discoverable by nqdc, and to be used as a template.
+
+Plugins are discovered through the `nqdc.plugin_processing_steps` entry point
+(see the [setuptools documentation on entry
+points](https://setuptools.pypa.io/en/latest/userguide/entry_point.html#entry-points-for-plugins)).
+
+It is defined in the `get_nqdc_processing_steps` function (see
+`src/nqdc_example_plugin/__init__.py`), which is referenced in the
+`[options.entry_points]` section of `setup.cfg`. This allows our plugin to be
+invoked through the `nqdc` command and run as if it were a part of `nqdc`
+itself.
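
The discovery mechanism described in this README can be sketched with the standard library alone. This is an illustration of how an application can read the `nqdc.plugin_processing_steps` entry point group, not nqdc's actual implementation; the helper name `discover_plugin_steps` is hypothetical.

```python
import sys
from importlib import metadata


def discover_plugin_steps(group="nqdc.plugin_processing_steps"):
    """Collect the steps advertised by installed plugins (illustrative only)."""
    if sys.version_info >= (3, 10):
        # Python 3.10+ can select entry points by group directly.
        entry_points = metadata.entry_points(group=group)
    else:
        # Python 3.8 and 3.9 return a dict mapping group names to entry points.
        entry_points = metadata.entry_points().get(group, [])
    all_steps = {"pipeline_steps": [], "standalone_steps": []}
    for entry_point in entry_points:
        # Each entry point refers to a function such as
        # `nqdc_example_plugin:get_nqdc_processing_steps`.
        get_steps = entry_point.load()
        for kind, steps in get_steps().items():
            all_steps.setdefault(kind, []).extend(steps)
    return all_steps


# With no plugin installed, both lists are empty.
print(discover_plugin_steps())
```

Once the example plugin is installed (for instance with `pip install -e docs/example_plugin`), its two steps would appear in the returned mapping.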

docs/example_plugin/pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+[build-system]
+requires = ["setuptools", "wheel"]
+build-backend = "setuptools.build_meta"

docs/example_plugin/setup.cfg

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+[metadata]
+name = nqdc_example_plugin
+
+[options]
+packages = find:
+package_dir =
+    =src
+install_requires =
+    pandas
+    matplotlib
+python_requires = >=3.7
+
+[options.extras_require]
+dev =
+    pytest
+
+[options.packages.find]
+where = src
+
+[options.entry_points]
+nqdc.plugin_processing_steps =
+    get_nqdc_processing_steps = nqdc_example_plugin:get_nqdc_processing_steps

docs/example_plugin/src/nqdc_example_plugin/__init__.py

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
+"""Example nqdc plugin: plots the number of articles per publication year."""
+import argparse
+import logging
+from pathlib import Path
+from typing import Tuple, Mapping, Optional, Union, Dict, List
+
+import pandas as pd
+
+ArgparseActions = Union[argparse.ArgumentParser, argparse._ArgumentGroup]
+
+_LOG = logging.getLogger(__name__)
+_STEP_NAME = "plot_pub_dates"
+_STEP_DESCRIPTION = "Example plugin: plot histogram of publication years."
+
+
+def plot_publication_dates(extracted_data_dir: Path) -> Tuple[Path, int]:
+    """Make a bar plot of the number of articles per year.
+
+    Parameters
+    ----------
+    extracted_data_dir
+        The directory containing the articles' metadata. It is a directory
+        created by `nqdc.extract_data_to_csv`: it contains a file named
+        `metadata.csv`.
+
+    Returns
+    -------
+    output_dir
+        The directory where the plot is stored.
+    exit_code
+        Always 0, used by nqdc command-line interface.
+
+    """
+    output_dir = extracted_data_dir.with_name(
+        extracted_data_dir.name.replace(
+            "_extractedData", "_examplePluginPubDatesPlot"
+        )
+    )
+    output_dir.mkdir(exist_ok=True)
+    meta_data = pd.read_csv(str(extracted_data_dir.joinpath("metadata.csv")))
+    min_year, max_year = (
+        meta_data["publication_year"].min(),
+        meta_data["publication_year"].max(),
+    )
+    years = list(range(min_year, max_year + 2))
+    ax = meta_data["publication_year"].hist(
+        bins=years, grid=False, rwidth=0.5, align="left"
+    )
+    ax.set_xticks(years[:-1])
+    ax.set_xlabel("Publication year")
+    ax.set_ylabel("Number of articles")
+    output_file = output_dir.joinpath("plot.png")
+    ax.figure.savefig(str(output_file))
+    _LOG.info(f"Publication dates histogram saved in {output_file}.")
+    return output_dir, 0
+
+
+class PlotPubDatesStep:
+    """Plot publication dates as part of a pipeline (`nqdc run`)."""
+
+    # Used for the command-line help
+    name = _STEP_NAME
+    short_description = _STEP_DESCRIPTION
+
+    def edit_argument_parser(self, argument_parser: ArgparseActions) -> None:
+        """Add an argument to indicate if the plugin should run.
+
+        When `nqdc run` is invoked, this optional step is executed only if the
+        `--plot_pub_dates` flag is passed on the command line.
+
+        """
+        argument_parser.add_argument(
+            "--plot_pub_dates",
+            action="store_true",
+            help="Save a histogram plot of publication years of "
+            "downloaded articles.",
+        )
+
+    def run(
+        self,
+        args: argparse.Namespace,
+        previous_steps_output: Mapping[str, Path],
+    ) -> Tuple[Optional[Path], int]:
+        """Execute this step: plot the publication dates."""
+        # `args` are the command-line arguments, we check if running this
+        # plugin was required with the `--plot_pub_dates` argument.
+        if not args.plot_pub_dates:
+            return None, 0
+        # `previous_steps_output` maps step names to the directories where they
+        # stored their output; we need the metadata generated by the
+        # `extract_data` step.
+        return plot_publication_dates(previous_steps_output["extract_data"])
+
+
+class StandalonePlotPubDatesStep:
+    """Plot publication dates as a standalone step (`nqdc plot_pub_dates`)."""
+
+    name = _STEP_NAME
+    short_description = _STEP_DESCRIPTION
+
+    def edit_argument_parser(self, argument_parser: ArgparseActions) -> None:
+        """Add an argument to specify the extracted data dir.
+
+        This directory contains the metadata file that provides the publication
+        dates.
+
+        """
+        argument_parser.add_argument(
+            "extracted_data_dir",
+            help="Directory containing extracted data CSV files. "
+            "It is a directory created by nqdc whose name ends "
+            "with 'extractedData'.",
+        )
+
+    def run(
+        self,
+        args: argparse.Namespace,
+        previous_steps_output: Mapping[str, Path],
+    ) -> Tuple[Path, int]:
+        """Execute the `nqdc plot_pub_dates` command."""
+        # In this case the plugin is run on its own rather than as a step in
+        # the full pipeline, so the `extracted_data_dir` is not produced by a
+        # previous step but it is passed as a command-line argument.
+        return plot_publication_dates(Path(args.extracted_data_dir))
+
+
+def get_nqdc_processing_steps() -> Dict[str, List]:
+    """Entry point used by nqdc.
+
+    Needed to discover the plugin steps and add them to the command-line
+    interface. It returns a mapping with 2 (optional) keys: "pipeline_steps"
+    for steps that must be added to the full pipeline (executed when `nqdc run`
+    is invoked), and "standalone_steps" for steps that run on their own (are
+    added as separate subcommands, in this case `nqdc plot_pub_dates`).
+
+    The values are lists of objects that provide the same interface as
+    `nqdc.BaseProcessingStep`: they have `name` and `short_description`
+    attributes, and `edit_argument_parser` and `run` methods.
+
+    This entry point must be referenced in the `[options.entry_points]` section
+    in `setup.cfg`.
+
+    """
+    return {
+        "pipeline_steps": [PlotPubDatesStep()],
+        "standalone_steps": [StandalonePlotPubDatesStep()],
+    }
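
To make the step interface described in the docstring above concrete, here is a minimal sketch of a second step and a toy driver that calls `edit_argument_parser` and `run` in the same order a command-line tool would. Both `HelloStep` and `run_steps` are hypothetical names invented for this illustration; the driver is an assumption about the general pattern, not a copy of nqdc's own pipeline code.

```python
import argparse
from pathlib import Path
from typing import Dict, Mapping, Optional, Tuple


class HelloStep:
    """Hypothetical optional pipeline step, for illustration only."""

    name = "hello"
    short_description = "Hypothetical step that only computes an output path."

    def edit_argument_parser(self, argument_parser) -> None:
        # The step runs only when this flag is given.
        argument_parser.add_argument("--hello", action="store_true")

    def run(
        self, args: argparse.Namespace, previous_steps_output: Mapping[str, Path]
    ) -> Tuple[Optional[Path], int]:
        if not args.hello:
            return None, 0
        extracted = previous_steps_output["extract_data"]
        return extracted.with_name("hello_output"), 0


def run_steps(steps, argv, previous_steps_output) -> Dict[str, Path]:
    """Toy driver (not nqdc's): let steps add arguments, then run them."""
    parser = argparse.ArgumentParser()
    for step in steps:
        step.edit_argument_parser(parser)
    args = parser.parse_args(argv)
    outputs = dict(previous_steps_output)
    for step in steps:
        output_dir, _exit_code = step.run(args, outputs)
        if output_dir is not None:
            # Later steps can look up this step's output by its name.
            outputs[step.name] = output_dir
    return outputs


start = {"extract_data": Path("some_dir", "subset_allArticles_extractedData")}
print(run_steps([HelloStep()], ["--hello"], start))
```

Returning `(None, 0)` when the flag is absent mirrors what `PlotPubDatesStep.run` does: the step is skipped without signalling an error.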

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+#! /usr/bin/env python3
+
+from pathlib import Path
+import subprocess
+import tempfile
+
+with tempfile.TemporaryDirectory(suffix="_nqdc") as tmp_dir:
+    subprocess.run(
+        [
+            "nqdc",
+            "run",
+            "--plot_pub_dates",
+            "-q",
+            "fMRI[Abstract] AND aphasia[Title] "
+            "AND (2017[PubDate] : 2019[PubDate])",
+            tmp_dir,
+        ]
+    )
+    assert (
+        Path(tmp_dir)
+        .joinpath(
+            "query-49e0abb9869a532a31d37ed788c76780",
+            "subset_allArticles_examplePluginPubDatesPlot",
+            "plot.png",
+        )
+        .is_file()
+    ), "Plugin output not found!"
+
+print("nqdc and plugin ran successfully")
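
The directory name checked by this script follows from the renaming done in `plot_publication_dates`. The sketch below isolates that step using only `pathlib`; the helper name `plugin_output_dir` and the `query-abc` directory are placeholders made up for the example.

```python
from pathlib import Path


def plugin_output_dir(extracted_data_dir: Path) -> Path:
    """Sibling directory of the extracted data, renamed for the plugin output."""
    return extracted_data_dir.with_name(
        extracted_data_dir.name.replace(
            "_extractedData", "_examplePluginPubDatesPlot"
        )
    )


# "query-abc" is a placeholder for the real, hash-based query directory name.
extracted = Path("query-abc", "subset_allArticles_extractedData")
print(plugin_output_dir(extracted).name)
# subset_allArticles_examplePluginPubDatesPlot
```

Because `with_name` only replaces the last path component, the plot directory ends up next to the extracted data, inside the same query directory.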

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+import pandas as pd
+import nqdc_example_plugin
+
+
+def test_example_plugin(tmp_path):
+    meta_data = pd.DataFrame(
+        {"pmcid": [1, 2, 3], "publication_year": [2018, 2020, 2020]}
+    )
+    extracted_data = tmp_path.joinpath("nqdc_extractedData")
+    extracted_data.mkdir()
+    meta_data.to_csv(extracted_data.joinpath("metadata.csv"), index=False)
+    out_dir, code = nqdc_example_plugin.plot_publication_dates(extracted_data)
+    assert out_dir.joinpath("plot.png").is_file()
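
For the data used in this test, the bin edges that the plugin builds with `range(min_year, max_year + 2)` can be worked out by hand. The sketch below reproduces the arithmetic in plain Python, without calling pandas or matplotlib; the helper names are hypothetical and the counting loop is a simplified stand-in for what a histogram with one-year-wide bins computes.

```python
def year_bin_edges(years):
    """One bin per year: n years need n + 1 edges, hence `max + 2` in range()."""
    return list(range(min(years), max(years) + 2))


def count_per_bin(years, edges):
    """Count articles per one-year bin (simplified stand-in for a histogram)."""
    counts = [0] * (len(edges) - 1)
    for year in years:
        counts[year - edges[0]] += 1
    return counts


publication_years = [2018, 2020, 2020]  # same values as in the test above
edges = year_bin_edges(publication_years)
print(edges)  # [2018, 2019, 2020, 2021]
print(count_per_bin(publication_years, edges))  # [1, 0, 2]
```

The last edge (2021) is dropped when the plugin sets the x ticks with `years[:-1]`, so only actual publication years are labelled.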

docs/example_plugin/tox.ini

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+[tox]
+isolated_build = True
+envlist = py
+
+[testenv]
+deps =
+    pytest
+commands =
+    pytest tests
