Updated atac tools
royfrancis committed Aug 27, 2024
1 parent d635b45 commit c8f95b8
Showing 3 changed files with 62 additions and 2 deletions.
Binary file added chapters/atacseq/assets/wang-2022-1.webp
Binary file not shown.
29 changes: 27 additions & 2 deletions chapters/atacseq/index.qmd
@@ -15,10 +15,35 @@ Our benchmarking results highlight SnapATAC, cisTopic, and Cusanovich2018 as the

@chen2019assessment

[{{< fa brands youtube >}} scATAC-Seq analysis in R](https://www.youtube.com/watch?v=e2396GKFMRY)
## Feature selection

[{{< fa toolbox >}} Signac](https://stuartlab.org/signac/index.html)
Method performance depends on the structure and size of the dataset. For simple datasets with distinct cell types, all methods are effective. For datasets with small cell classes, or with hierarchical clustering structure and similar subtypes, SnapATAC and SnapATAC2 are preferred. SnapATAC is not memory-efficient for large datasets (over 20,000 cells); in such cases, SnapATAC2 is the better choice. Signac outperforms ArchR, but ArchR is more memory-efficient. Adding aggregation steps to Signac does not significantly increase time or memory usage. Feature engineering choices, such as calling peaks versus using genomic bins, do not substantially affect performance, so users can choose based on preference. Recommended latent space dimensions vary by method: 10-30 for SnapATAC/SnapATAC2, 10-50 for Signac/ArchR, and larger still for aggregation methods.

@de2024systematic
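
The guidance above can be read as a small decision table. The sketch below is only an illustration of that reading, not code from the benchmark: the function names, the `complex_structure` flag, and treating 20,000 cells as a hard cutoff are all assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical lookup encoding the latent-dimension ranges quoted in the text.
# Aggregation methods are omitted because the text only says "even larger".
LATENT_DIMS = {
    "SnapATAC": (10, 30),
    "SnapATAC2": (10, 30),
    "Signac": (10, 50),
    "ArchR": (10, 50),
}

# Cell count above which the text calls SnapATAC memory-inefficient.
LARGE_DATASET_CELLS = 20_000


def suggest_methods(n_cells: int, complex_structure: bool) -> list[str]:
    """Return candidate methods following the summary above.

    complex_structure is True when the data has small cell classes or
    hierarchical clusters with similar subtypes (an illustrative flag).
    """
    if not complex_structure:
        # Simple datasets with distinct cell types: all methods are effective.
        return sorted(LATENT_DIMS)
    if n_cells > LARGE_DATASET_CELLS:
        # SnapATAC is not memory-efficient at this size.
        return ["SnapATAC2"]
    return ["SnapATAC", "SnapATAC2"]


def latent_dim_range(method: str) -> tuple[int, int]:
    """Return the (min, max) latent dimensions suggested for a method."""
    return LATENT_DIMS[method]
```

For instance, `suggest_methods(50_000, complex_structure=True)` returns only `SnapATAC2`, and `latent_dim_range("Signac")` returns `(10, 50)`. The choice between Signac and ArchR is left to the user here, since the text describes it as a trade-off between performance and memory.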

## Celltyping

![Performance of label transfer methods on single-cell data from selected mouse and human tissues. (A) Overall metrics considering performance on all scATAC-seq cells.](assets/wang-2022-1.webp)

Here, we evaluated the performance of five scATAC-seq annotation methods on both their classification accuracy and scalability using publicly available single-cell datasets from mouse and human tissues including brain, lung, kidney, PBMC, and BMMC. Using the BMMC data as basis, we further investigated the performance of these methods across different data sizes, mislabeling rates, sequencing depths and the number of cell types unique to scATAC-seq. Bridge integration, which is the only method that requires additional multimodal data and does not need gene activity calculation, was overall the best method and robust to changes in data size, mislabeling rate and sequencing depth. Conos was the most time and memory efficient method but performed the worst in terms of prediction accuracy. scJoint tended to assign cells to similar cell types and performed relatively poorly for complex datasets with deep annotations but performed better for datasets only with major label annotations. The performance of scGCN and Seurat v3 was moderate, but scGCN was the most time-consuming method and had the most similar performance to random classifiers for cell types unique to scATAC-seq.

@wang2022benchmarking
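
To make "classification accuracy" concrete, the sketch below scores a set of transferred labels against reference labels. Overall accuracy and macro-averaged F1 are common choices for this kind of comparison; using them here is an assumption, and they are not necessarily the exact metrics reported in the benchmark. The function names and the toy labels are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label matches the reference label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of per-class F1 over classes present in the reference."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    scores = []
    for label in sorted(set(true_labels)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: five cells with reference and transferred labels.
reference = ["B", "B", "T", "T", "NK"]
transferred = ["B", "T", "T", "T", "B"]
overall = accuracy(reference, transferred)
per_class_mean = macro_f1(reference, transferred)
```

Macro averaging weights every cell type equally, so a method that misses a rare population is penalised even when its overall accuracy looks acceptable.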

## Tools

![Comparison of toolkits](https://media.springernature.com/full/springer-static/esm/art%3A10.1038%2Fs41588-021-00790-6/MediaObjects/41588_2021_790_Fig5_ESM.jpg) @granja2021archr

- [{{< fa toolbox >}} Signac](https://stuartlab.org/signac/index.html) (R)
- [{{< fa toolbox >}} ArchR](https://github.com/GreenleafLab/ArchR) (R) @granja2021archr
- [{{< fa toolbox >}} SnapATAC](https://github.com/r3fang/SnapATAC) (R)
- [{{< fa toolbox >}} pycisTopic](https://github.com/aertslab/pycisTopic) (Python)
- [{{< fa toolbox >}} Scasat](https://github.com/ManchesterBioinference/Scasat) (Bash, Python)

@stuart2021single

## Tutorials

- [{{< fa brands youtube >}} scATAC-Seq analysis in R using Signac by Sanbomics](https://www.youtube.com/watch?v=e2396GKFMRY)
- [{{< fa brands youtube >}} scATAC-Seq analysis in R using Signac by Bioinformagician](https://www.youtube.com/watch?v=yEKZJVjc5DY)

## References {.unnumbered}
35 changes: 35 additions & 0 deletions references.bib
@@ -634,6 +634,41 @@ @article{chen2019assessment
url={https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1854-5}
}

@article{wang2022benchmarking,
title={Benchmarking automated cell type annotation tools for single-cell ATAC-seq data},
author={Wang, Yuge and Sun, Xingzhi and Zhao, Hongyu},
journal={Frontiers in Genetics},
volume={13},
pages={1063233},
year={2022},
publisher={Frontiers Media SA},
url={https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.1063233/full}
}

@article{de2024systematic,
title={Systematic benchmarking of single-cell ATAC-sequencing protocols},
author={De Rop, Florian V and Hulselmans, Gert and Flerin, Chris and Soler-Vila, Paula and Rafels, Albert and Christiaens, Valerie and Gonz{\'a}lez-Blas, Carmen Bravo and Marchese, Domenica and Caratu, Ginevra and Poovathingal, Suresh and others},
journal={Nature biotechnology},
volume={42},
number={6},
pages={916--926},
year={2024},
publisher={Nature Publishing Group US New York},
url={https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03356-x}
}

@article{granja2021archr,
title={ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis},
author={Granja, Jeffrey M and Corces, M Ryan and Pierce, Sarah E and Bagdatli, S Tansu and Choudhry, Hani and Chang, Howard Y and Greenleaf, William J},
journal={Nature genetics},
volume={53},
number={3},
pages={403--411},
year={2021},
publisher={Nature Publishing Group US New York},
url={https://www.nature.com/articles/s41588-021-00790-6}
}

## airr ----------------------------------------------------------------------------------------
@article{irac2024single,