This repository provides cookiecutter templates for reproducible research projects. The templates do not attempt to be generic, but have a clear and opinionated focus.
Projects build with these templates aim at full automation, and use Python 3.11
, mamba/conda
, Git
, Snakemake
, and pandoc
to create a HTML report out of raw data, code, and Markdown
text. Fork, clone, or download this repository on GitHub if you want to change any of these.
The template includes a few lines of code as a demo to allow you to create a HTML report out of made-up simulation results right away. Read the README.md
in the generated repository to see how.
default
This generates the basic structure of a reproducible workflow.
cluster
The cluster template extends the basic template by adding infrastructure to support running on a compute cluster.
Make sure you have cookiecutter installed, otherwise install it with conda:
conda install cookiecutter -c conda-forge
Then create a repository using:
cookiecutter gh:timtroendle/cookiecutter-reproducible-research --directory=[default/cluster]
You will be asked for the following parameters:
Parameter | Description |
---|---|
project_name |
The name of your project, used in the documentation and report. |
project_short_name |
An abbreviation, used for environments and such. Avoid special characters and whitespace. |
author |
Your name. |
institute |
The name of your institute, used for report metadata. |
short_description |
A short description of the project, used for documentation and report. |
path_to_conda_envs |
The path to the directory hosting your conda envs (leave untouched for Snakemake default). |
The cluster
template requires the following parameter values in addition:
Parameter | Description |
---|---|
cluster_url |
The address of the cluster to allow syncing to and from the cluster. |
cluster_base_dir |
The base path for the project on the cluster (default: ~/<project-short-name> ). |
cluster_type |
The type of job scheduler used on the cluster. Currently, only LSF is supported. |
The generated repository will have the following structure:
├── config <- Configuration files, e.g., for your model if needed.
│ └── default.yaml <- Default set of configuration parameter values.
├── data <- Raw input data.
├── envs <- Execution environments.
│ ├── default.yaml <- Default execution environment.
│ ├── report.yaml <- Environment for compilation of the report.
│ └── test.yaml <- Environment for executing tests.
├── profiles <- Snakemake profiles.
│ └── default <- Default Snakemake profile folder.
│ └── config.yaml <- Default Snakemake profile.
├── report <- All files creating the final report, usually text and figures.
│ ├── apa.csl <- Citation style definition to be used in the report.
│ ├── literature.yaml <- Bibliography file for the report.
│ ├── report.md <- The report in Markdown.
│ └── pandoc-metadata.yaml<- Metadata for the report.
├── rules <- The place for all your Snakemake rules.
├── scripts <- Scripts go in here.
│ ├── model.py <- Demo file.
│ └── vis.py <- Demo file.
├── tests <- Automatic tests of the source code go in here.
│ └── test_model.py <- Demo file.
├── .editorconfig <- Editor agnostic configuration settings.
├── .ruff <- Linter and formatter settings for ruff.
├── .gitignore
├── environment.yaml <- A file to create an environment to execute your project in.
├── LICENSE.md <- MIT license description
├── Snakefile <- Description of all computational steps to create results.
└── README.md
cluster
templates additionally contain the following files:
├── envs
│ └── shell.yaml <- An environment for shell rules.
├── profiles
│ └── cluster <- Cluster Snakemake profile folder.
│ └── config.yaml <- Cluster Snakemake profile.
├── rules
│ └── sync.yaml <- Snakemake rules to sync to and from the cluster.
├── .syncignore-receive <- Build files to ignore when receiving from the cluster.
└── .syncignore-send <- Local files to ignore when sending to the cluster.
Some ideas for this cookiecutter template are taken from cookiecutter-data-science and mkrapp/cookiecutter-reproducible-science. This template is MIT licensed itself.