Standalone workflow to create national-scale open-data packages from global open datasets.
Get the latest code by cloning this repository:
```bash
git clone git@github.com:nismod/irv-datapkg.git
```

or

```bash
git clone https://github.com/nismod/irv-datapkg.git
```

Install Python and packages - suggest using micromamba:

```bash
micromamba create -f environment.yml
```

Activate the environment:

```bash
micromamba activate datapkg
```

The data packages are produced using a snakemake workflow.
The workflow expects ZENODO_TOKEN, CDSAPI_KEY and CDSAPI_URL to be set as
environment variables - these must be set before running any workflow steps.
If not interacting with Zenodo or the Copernicus Climate Data Store, these can be dummy strings:
echo "placeholder" > ZENODO_TOKEN
echo "https://cds-beta.climate.copernicus.eu/api" > CDSAPI_URL
echo "test" > CDSAPI_KEYSee Climate Data Store API docs and Zenodo API docs for access details.
Export the values from these files to the environment:
```bash
export ZENODO_TOKEN=$(cat ZENODO_TOKEN)
export CDSAPI_KEY=$(cat CDSAPI_KEY)
export CDSAPI_URL=$(cat CDSAPI_URL)
```

Check what will be run, if we ask for everything produced by the rule `all`, before running the workflow for real:

```bash
snakemake --dry-run all
```

Run the workflow, asking for `all`, using 8 cores, with verbose log messages:
```bash
snakemake --cores 8 --verbose all
```

To publish, first create a Zenodo token, save it and export it as the ZENODO_TOKEN environment variable.
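For example, following the same file-and-export pattern used earlier (the token value here is just a placeholder for your own token):

```bash
echo "your-real-zenodo-token" > ZENODO_TOKEN
export ZENODO_TOKEN=$(cat ZENODO_TOKEN)
```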
Upload a single data package:
```bash
snakemake --cores 1 zenodo/GBR.deposited
```

Publish (cannot be undone) either programmatically:

```bash
snakemake --cores 1 zenodo/GBR.published
```

Or after review online, through the Zenodo website (sandbox, live).
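The target names above suggest packages are addressed by ISO country code. Assuming that pattern holds (KEN below is purely illustrative), another country's package would be deposited and published in the same way:

```bash
snakemake --cores 1 zenodo/KEN.deposited
snakemake --cores 1 zenodo/KEN.published  # publishing cannot be undone
```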
To get a quick list of DOIs from the Zenodo package JSON:
```bash
cat zenodo/*.deposition.json | jq '.metadata.prereserve_doi.doi'
```
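To list each package title next to its DOI as well (using jq's `@tsv` formatter; the title is part of the standard Zenodo deposition metadata):

```bash
cat zenodo/*.deposition.json | jq -r '[.metadata.title, .metadata.prereserve_doi.doi] | @tsv'
```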
To generate records.csv with details of published packages:

```bash
python scripts/published_metadata.py
```
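For reference, published records can also be listed directly from the Zenodo REST API. This is a minimal sketch, not necessarily how scripts/published_metadata.py works; it assumes depositions were made to the live zenodo.org host (use sandbox.zenodo.org for the sandbox):

```python
import os

import requests

# List this account's published depositions via the Zenodo REST API,
# authenticating with the same ZENODO_TOKEN exported earlier.
response = requests.get(
    "https://zenodo.org/api/deposit/depositions",
    params={"status": "published", "access_token": os.environ["ZENODO_TOKEN"]},
    timeout=30,
)
response.raise_for_status()

for deposition in response.json():
    # Print the package title and its registered DOI
    print(deposition["metadata"]["title"], deposition["doi"])
```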
In case of warnings about GDAL_DATA not being set, try running:

```bash
export GDAL_DATA=$(gdal-config --datadir)
```

To format the workflow definition Snakefile:

```bash
snakefmt Snakefile
```

To format the Python helper scripts:

```bash
black scripts
```

These Python libraries may be a useful place to start analysis of the data in the packages produced by this workflow:
- snkit helps clean network data (see the example below)
- nismod-snail is designed to help implement infrastructure exposure, damage and risk calculations
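As a minimal, illustrative sketch of that starting point - the file path and layer name here are assumptions for illustration, not guaranteed to match any particular package - a package's network edges could be cleaned into a connected topology with snkit:

```python
import geopandas as gpd
import snkit

# Hypothetical example path and layer - adjust to the actual package contents
edges = gpd.read_file("GBR/openstreetmap_roads.gpkg", layer="edges")

network = snkit.Network(edges=edges)
network = snkit.network.add_endpoints(network)          # add nodes at line endpoints
network = snkit.network.split_edges_at_nodes(network)   # split edges where nodes touch them
network = snkit.network.add_ids(network)                # give nodes and edges unique ids
network = snkit.network.add_topology(network)           # add from/to node references to edges

print(network.nodes.head())
print(network.edges.head())
```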
The open-gira repository contains a larger
workflow for global-scale open-data infrastructure risk and resilience analysis.
MIT License, Copyright (c) 2023 Tom Russell and irv-datapkg contributors
This research received funding from the FCDO Climate Compatible Growth Programme. The views expressed here do not necessarily reflect the UK government's official policies.