echoIA Datasets Registry

This repository is the index and governance hub for datasets curated by the echoIA Collaboration. It is where you create, update, and register echoIA datasets before publishing them on Zenodo.

How to use this repository

Use this repo to scaffold and register datasets following echoIA’s standardized schemas. Each schema (that is, a dataset type such as aia-lum, aia-z, or eta) has its own scaffolding script and validation tool to ensure consistency.

Workflow

Create a branch for your dataset addition or update. Name the branch descriptively.
Pick a schema — choose from available options in schemas/.
Run the matching scaffold script in scripts/ to generate the dataset files.
Fill in the generated CSV and metadata YAML with your measurements and details.
Upload a ZIP containing the data CSV and a short README (but not the metadata YAML file) to Zenodo, then record the resulting DOI in the metadata.
Validate with the corresponding validate_*.py tool.
Commit only the metadata and registry entry (not the data files).
Open a pull request to add or update the dataset in the registry.

See datasets/aia-lum/README.md for full details on aia-lum datasets.
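
Before recording a DOI in the metadata (step 5 of the workflow), it can help to check that the string really is a Zenodo DOI. The sketch below is not part of this repository's tooling; the function name `normalize_zenodo_doi` is hypothetical, and the record number in the usage note is a placeholder. It relies only on the fact that Zenodo-minted DOIs use the `10.5281/zenodo.<number>` form.

```python
import re

# Zenodo-minted DOIs use the 10.5281 prefix followed by "zenodo.<record number>".
ZENODO_DOI = re.compile(r"10\.5281/zenodo\.\d+")


def normalize_zenodo_doi(value: str) -> str:
    """Return the bare, lowercase DOI, or raise ValueError if it is not a Zenodo DOI.

    Hypothetical helper for illustration; not part of the echoIA scripts.
    """
    doi = value.strip()
    # Accept the resolver URL form and the "doi:" form as well as the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    doi = doi.lower()
    if not ZENODO_DOI.fullmatch(doi):
        raise ValueError(f"not a Zenodo DOI: {value!r}")
    return doi
```

For instance, `normalize_zenodo_doi("https://doi.org/10.5281/zenodo.1234567")` returns the bare DOI string, while a DOI with a different prefix raises `ValueError`.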

Directory structure

schemas/ — dataset schema definitions
scripts/ — scaffolding and validation tools
datasets/ — scaffolded records and metadata

Each dataset corresponds to a single Zenodo record, linked here by its canonical ID and DOI.

Dataset naming convention

Each dataset is assigned a canonical ID, used both on Zenodo and in this registry. This ensures consistent citation, easy discovery, and machine-readable linking across echoIA datasets.

General pattern:

<category>-<yyyy>-<sample>-<firstauthoryy>

where:

<category> — short dataset family tag (e.g., aia-lum, aia-z, eta, mock-ia)
<yyyy> — publication year of the original measurement or dataset source
<sample> — short descriptor of the galaxy sample, survey, or subsample (lowercase, hyphenated)
<firstauthoryy> — first author tag with publication year, e.g. Author et al. (2025) → a25

Example:

aia-lum-2017-sdss-redmapper-u17

Rules:

Lowercase, hyphen-separated
<sample> may include short survey tag or subsample identifier
<yyyy> = publication year of the original measurement
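
The naming pattern and rules above can be checked mechanically. The following is a minimal sketch, not the repository's validator: the regular expression is one assumed reading of the convention (category tokens contain only lowercase letters, sample tokens contain lowercase letters and digits, and the author tag is lowercase letters followed by a two-digit year), and `parse_canonical_id` is a hypothetical name.

```python
import re

# Assumed grammar for <category>-<yyyy>-<sample>-<firstauthoryy>.
# The four-digit year separates the category from the sample, and the
# author tag is always the last hyphen-separated token.
CANONICAL_ID = re.compile(
    r"(?P<category>[a-z]+(?:-[a-z]+)*)"
    r"-(?P<year>\d{4})"
    r"-(?P<sample>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"-(?P<author>[a-z]+\d{2})"
)


def parse_canonical_id(dataset_id: str) -> dict[str, str]:
    """Split a canonical ID into its four parts, or raise ValueError."""
    match = CANONICAL_ID.fullmatch(dataset_id)
    if match is None:
        raise ValueError(f"not a valid canonical ID: {dataset_id!r}")
    return match.groupdict()
```

Applied to the example ID `aia-lum-2017-sdss-redmapper-u17`, this yields the category `aia-lum`, the year `2017`, the sample `sdss-redmapper`, and the author tag `u17`. IDs containing uppercase letters or underscores are rejected.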

Currently supported dataset schemas:

aia-lum — intrinsic alignment amplitude vs luminosity

Example: intrinsic alignment vs luminosity (`aia-lum` schema)

Run from the repo root:

python scripts/make_ia_vs_lum_dataset.py \
    --schema aia-lum \
    --id aia-lum-YYYY-sample-firstauthor \
    --title "Descriptive dataset title" \
    --year YYYY \
    --first-author SurnameYY \
    --sample sample-tag \
    --creator-name "echoIA Collaboration" \
    --creator-affil "echoIA"

This creates three files in datasets/aia-lum/:

<id>-metadata.yaml — echoIA-side metadata (kept in Git)
<id>-data.csv — the dataset (uploaded to Zenodo, not stored in Git)
<id>-README.txt — helper notes (temporary)
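
To prepare the Zenodo upload described in the workflow, the data CSV and README can be bundled while leaving the metadata YAML out. This is a sketch under assumptions: the scaffolded `<id>-README.txt` is taken to be the short README that goes into the ZIP, the archive name `<id>.zip` is a choice made here, and `build_zenodo_zip` is a hypothetical helper rather than a script in this repository.

```python
from pathlib import Path
import zipfile


def build_zenodo_zip(dataset_dir: Path, dataset_id: str) -> Path:
    """Zip <id>-data.csv and <id>-README.txt; the metadata YAML is left out."""
    members = [
        dataset_dir / f"{dataset_id}-data.csv",
        dataset_dir / f"{dataset_id}-README.txt",
    ]
    # Fail early if the scaffold step has not produced both files.
    missing = [path.name for path in members if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"missing scaffolded files: {missing}")

    archive = dataset_dir / f"{dataset_id}.zip"  # archive name is an assumption
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in members:
            # Store files flat, without the datasets/aia-lum/ directory prefix.
            zf.write(path, arcname=path.name)
    return archive
```

Calling `build_zenodo_zip(Path("datasets/aia-lum"), "<id>")` with a real canonical ID in place of `<id>` writes the archive next to the scaffolded files. Like the data CSV, the archive should not be committed.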

License & Attribution

License: CC-BY-4.0
We do not claim ownership of original research results.
All numerical values are factual reproductions from cited works.
When using an echoIA dataset:
- Cite the original paper(s) listed in metadata
- Optionally cite the echoIA registry DOI
