Demo Notebooks #7

Open
2 of 13 tasks
jejjohnson opened this issue Dec 4, 2023 · 0 comments

jejjohnson commented Dec 4, 2023

Add demo notebooks that demonstrate how Helio-Tools can be used and how it could interface with machine learning.


Preprocessing

SDO Pipeline

  • Add demo preprocessing steps
  • Add demo config for preprocessing
  • Add demo config for preprocessing + dataset + dataloader
  • Add demo domain-expert validation for dataset
  • Add analysis functions for plots (see ITI paper)

ML-Ready Data

Demo showcasing how we can create ML-ready data using scripts and parallelization. For each tutorial we will create three dataset types: 1) single-channel datasets, 2) multi-channel datasets, and 3) multi-channel time-series datasets.

  • Numpy Dataset
  • Image Dataset
  • xarray.Dataset
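
As a rough illustration of the three dataset types, here is a minimal NumPy sketch. It is not the Helio-Tools API: `load_frame`, `build_time_series`, and the channel names are hypothetical, the frames are synthetic random data rather than real observations, and a thread pool stands in for whichever parallelization the demo scripts end up using.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

CHANNELS = ("ch_a", "ch_b", "ch_c")  # hypothetical channel names
HEIGHT, WIDTH = 16, 16               # small synthetic image size


def load_frame(time_index: int, channel_index: int) -> np.ndarray:
    """Stand-in for reading and preprocessing one image (synthetic data)."""
    rng = np.random.default_rng(seed=1000 * time_index + channel_index)
    return rng.random((HEIGHT, WIDTH), dtype=np.float32)


def build_time_series(n_times: int, max_workers: int = 4) -> np.ndarray:
    """Load every (time, channel) frame in parallel and stack to (T, C, H, W)."""
    jobs = [(t, c) for t in range(n_times) for c in range(len(CHANNELS))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the submitted jobs.
        frames = list(pool.map(lambda job: load_frame(*job), jobs))
    return np.stack(frames).reshape(n_times, len(CHANNELS), HEIGHT, WIDTH)


time_series = build_time_series(n_times=5)  # 3) multi-channel time series: (T, C, H, W)
multi_channel = time_series[0]              # 2) multi-channel: (C, H, W)
single_channel = time_series[0, 0]          # 1) single channel: (H, W)
```

For CPU-bound preprocessing, a process pool or a job scheduler may be a better fit than threads; the thread pool is used here only to keep the sketch short.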

Machine Learning DataModules

Demo Training Datasets + DataLoaders

These demos will showcase how to create datasets and dataloaders for different data structures. We will focus on PyTorch datasets for three data structures: 1) numpy.ndarray, 2) .png/.jpeg/… image files, and 3) xarray.Dataset objects. We will discuss topics such as global dataset normalization and (random) patching.

  • Demo DataLoader with numpy (see example notebook)
  • Demo DataLoader with xarray (see example)
  • Demo DataLoader with RasterVision (see example)
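
To make global normalization and random patching concrete, here is a minimal sketch for the numpy.ndarray case. The class name `RandomPatchDataset` and its parameters are hypothetical, not part of Helio-Tools, and the data is synthetic. The sketch uses NumPy only, but it follows the map-style protocol (`__len__` and `__getitem__`) that PyTorch datasets use.

```python
import numpy as np


class RandomPatchDataset:
    """Map-style dataset over an (N, C, H, W) array (hypothetical sketch)."""

    def __init__(self, images, patch_size, seed=0):
        self.images = np.asarray(images, dtype=np.float32)
        self.patch_size = patch_size
        self.rng = np.random.default_rng(seed)
        # Global normalization: one mean/std per channel over the whole dataset,
        # stored with shape (C, 1, 1) so they broadcast against a (C, H, W) image.
        self.mean = self.images.mean(axis=(0, 2, 3), keepdims=True)[0]
        self.std = self.images.std(axis=(0, 2, 3), keepdims=True)[0]

    def __len__(self):
        return self.images.shape[0]

    def __getitem__(self, index):
        image = (self.images[index] - self.mean) / (self.std + 1e-8)
        _, height, width = image.shape
        # Random patching: draw the top-left corner uniformly (upper bound is exclusive).
        top = self.rng.integers(0, height - self.patch_size + 1)
        left = self.rng.integers(0, width - self.patch_size + 1)
        return image[:, top:top + self.patch_size, left:left + self.patch_size]


images = np.random.default_rng(42).random((4, 3, 32, 32))  # synthetic data
dataset = RandomPatchDataset(images, patch_size=8)
patch = dataset[0]  # one normalized (C, 8, 8) patch
```

Because the class only relies on `__len__` and `__getitem__`, it could subclass `torch.utils.data.Dataset` and be batched with `torch.utils.data.DataLoader`; that step is left out here to keep the sketch free of a PyTorch dependency.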

These demos will showcase how to create datasets and dataloaders for inference (making predictions on new datasets). We will focus on PyTorch datasets for three data structures: 1) numpy.ndarray, 2) .png/.jpeg/… image files, and 3) xarray.Dataset objects. We will discuss the additional transformations these datasets need, such as training-dataset normalization and sliding-window patching.
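
A minimal sketch of the two inference-time transformations, again for the numpy.ndarray case. It assumes that "training-dataset normalization" means reusing the per-channel mean and standard deviation computed on the training set; the function names, the statistics, and the input image are all made up for illustration. The patching relies on NumPy's `sliding_window_view` (available from NumPy 1.20).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_window_patches(image, patch_size, stride):
    """Cut a (C, H, W) image into patches of shape (n_patches, C, p, p).

    Patches are ordered row by row. Pixels near the bottom/right edge are
    skipped when (H - p) or (W - p) is not a multiple of the stride.
    """
    windows = sliding_window_view(image, (patch_size, patch_size), axis=(1, 2))
    windows = windows[:, ::stride, ::stride]  # (C, n_rows, n_cols, p, p)
    n_channels = image.shape[0]
    return windows.transpose(1, 2, 0, 3, 4).reshape(
        -1, n_channels, patch_size, patch_size
    )


def prepare_for_inference(image, train_mean, train_std, patch_size, stride):
    """Normalize with training-set statistics, then patch deterministically."""
    normalized = (image - train_mean) / (train_std + 1e-8)
    return sliding_window_patches(normalized, patch_size, stride)


# Made-up statistics with shape (C, 1, 1), standing in for values saved at training time.
train_mean = np.array([100.0, 300.0], dtype=np.float32).reshape(2, 1, 1)
train_std = np.array([50.0, 75.0], dtype=np.float32).reshape(2, 1, 1)

new_image = np.arange(2 * 16 * 16, dtype=np.float32).reshape(2, 16, 16)
patches = prepare_for_inference(new_image, train_mean, train_std, patch_size=8, stride=4)
```

Unlike random patching, the sliding window visits fixed positions, so predictions on overlapping patches can later be stitched back onto the full image.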

@jejjohnson jejjohnson self-assigned this Dec 4, 2023