fix: use dask to speed up cxg conversion #7364

Open: wants to merge 11 commits into base: main

@Bento007 Bento007 commented Oct 16, 2024

Reason for Change

  • During the schema 5.2 migration, it was found that cxg conversion took exponentially longer for larger datasets. This work uses the Dask library and anndata 0.11.0 to speed up conversion.
  • This PR is not intended to provide memory optimizations, but some improvements have been observed in testing.

Changes

  • add dask dependency for processing
  • update anndata dependency for processing
  • modify cxg conversion code to use dask to write tiledb arrays
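
The chunk-wise write pattern in the last item can be illustrated with a minimal sketch. This is not the code from this PR: the helper `write_in_chunks` and its `chunk_rows` parameter are hypothetical, and a plain NumPy array stands in for the TileDB array so the example runs without a TileDB install. The only Dask calls used are `dask.array.from_array` and `dask.array.store`.

```python
import dask.array as da
import numpy as np


def write_in_chunks(matrix: np.ndarray, chunk_rows: int) -> np.ndarray:
    """Copy `matrix` into a target array one row-block at a time.

    Hypothetical helper for illustration only; the target here is a NumPy
    array standing in for an on-disk array opened for writing.
    """
    # Split the matrix into row blocks; each block becomes one Dask task.
    lazy = da.from_array(matrix, chunks=(chunk_rows, matrix.shape[1]))

    # Any object that supports NumPy-style slice assignment can be a target.
    target = np.zeros(matrix.shape, dtype=matrix.dtype)

    # da.store writes each block into the matching slice of the target.
    da.store(lazy, target)
    return target


X = np.arange(20, dtype=np.float32).reshape(5, 4)
stored = write_in_chunks(X, chunk_rows=2)
```

Because each block is written independently, the scheduler can process blocks in parallel instead of materializing and writing the whole matrix in one step.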

Testing steps

  • TODO

Checklist 🛎️

  • TODO

Notes for Reviewer

  • Do not merge until the official release of anndata 0.11.0

@Bento007 Bento007 changed the title Fix: use dask to speed up cxg conversion fix: use dask to speed up cxg conversion Oct 17, 2024

Bento007 commented Oct 17, 2024

Memory test result

[Memory profile plots comparing main and dask for each dataset: 10x (plot_10x), slide seq (plot_seq), visium (plot_vis)]

Overall, speed improved across the board. The remaining memory spike is caused by tiledb.consolidate.

Peak Memory

| dataset | main (MiB) | dask (MiB) |
|---------|------------|------------|
| 10x     | 9021.359   | 8481.504   |
| seq     | 8847.473   | 8811.594   |
| vis     | 8642.473   | 9412.105   |

Peak memory stayed within roughly 9% of main across the datasets: slightly lower for 10x and seq, and higher for vis.
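
To make the comparison concrete, the relative change in peak memory can be computed from the figures in the Peak Memory table. The snippet below is only a convenience calculation on those reported numbers; the names `peak_mib` and `percent_change` are made up for this example.

```python
# Peak memory in MiB, copied from the Peak Memory table: (main, dask).
peak_mib = {
    "10x": (9021.359, 8481.504),
    "seq": (8847.473, 8811.594),
    "vis": (8642.473, 9412.105),
}


def percent_change(main: float, dask: float) -> float:
    """Relative change of the dask branch versus main, in percent."""
    return (dask - main) / main * 100.0


changes = {name: percent_change(m, d) for name, (m, d) in peak_mib.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
# 10x: -5.98%
# seq: -0.41%
# vis: +8.91%
```

Negative values mean the dask branch peaked lower than main.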
