Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quick upgrade #143

Merged
merged 7 commits into from
Jun 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 24 additions & 15 deletions rmd/30_BatchEffects.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ if (!file.exists(fdest)) {
# utils::download.file("https://zenodo.org/record/7891484/files/panc_sub_processed.RDS?download=1", destfile = fdest, method = "curl")
file.copy("/scratch/local/rseurat/datasets/preprocessed_rds/panc_sub_processed.RDS", fdest, overwrite = FALSE)
}
panc_sub <- UpdateSeuratObject(readRDS(fdest)) ## Seurat v5 would require this
panc_sub <- UpdateSeuratObject(readRDS(fdest)) ## Seurat v5 would require this
```

> 🧭✨ Poll:
Expand Down Expand Up @@ -132,19 +132,36 @@ p1 + p2
Let's now explore some options of mitigating batch effects:

- Seurat Integration
- Seurat SCTransform
- Conos
- Harmony
- SCTransform
- kBET
- Conos
- ComBat/SVA
- ...

# Harmony

This method is quite straightforward. It will account for batch effects by correcting our principal components, its functionality is conveniently wrapped under function `RunHarmony`. Just keep in mind that it will create a new 'harmony' reduction that's the one you want to use downstream...


```{r, eval=FALSE}
# ...
harmony::RunHarmony(object, h.group.by.vars = "orig.ident")
FindNeighbors(object, reduction = 'harmony')
FindClusters(object)
RunUMAP(object, reduction = 'harmony')
# ...
```

For further details, [check their vignette](http://htmlpreview.github.io/?https://github.com/immunogenomics/harmony/blob/master/doc/Seurat.html)

# Seurat Integration

> There's a few more slides, [here](https://docs.google.com/presentation/d/1cdeiIY2I7lXwYYJOyUYViOZkv3BRQ2PTfrmwKYMD6Jc/edit?usp=sharing).
> There's a few more slides, [here](https://docs.google.com/presentation/d/1cdeiIY2I7lXwYYJOyUYViOZkv3BRQ2PTfrmwKYMD6Jc/edit?usp=sharing). For further details, check the [original article](https://www.sciencedirect.com/science/article/pii/S0092867419305598).

## Seurat Integration: prep datasets

Prior to the integration, we want to normalize each dataset to be integrated separately.
Prior to the integration, we want to normalize each dataset separately to be integrated.

```{r seurat_integrate1}
# split the dataset into a list of seurat objects
Expand All @@ -159,7 +176,6 @@ panc.list <- lapply(X = panc.list, FUN = function(x) {

## Seurat Integration

In this paragraph, we're going to run the three functions key to Seurat Integration.

```{r seurat_integrate2}
# select features that are repeatedly variable across datasets for integration
Expand All @@ -180,16 +196,14 @@ gc()

Let's have a brief look at the panc.combined dataset - a new Assay has been created by the integration procedure.

> 🧭✨ Poll: What is the new assay called?](<https://PollEv.com/multiple_choice_polls/4ArWqQjwtyVafBYCiaqbr/respond>) Hint: you can access assays of a Seurat object with `Assays()`.
> 🧭✨ Poll: [What is the new assay called?](<https://PollEv.com/multiple_choice_polls/4ArWqQjwtyVafBYCiaqbr/respond>) Hint: you can access assays of a Seurat object with `Assays()`.

## Process the newly integrated dataset

After the integration, data scaling on the new assay is necessary, as well as calculation of PCA and UMAP embeddings.

```{r process_integrated}
# specify that we will perform downstream analysis on the corrected data note that the
# original unmodified data still resides in the 'RNA' assay
DefaultAssay(panc.combined) <- "integrated"
stopifnot(DefaultAssay(panc.combined) == "integrated")

# Run the standard workflow for visualization and clustering
panc.combined <- ScaleData(panc.combined, verbose = FALSE)
Expand Down Expand Up @@ -249,11 +263,6 @@ p2 <- DimPlot(panc.combined, reduction = "umap", group.by = "celltype")
p1 + p2
```

## Outlook

- Verify that expected marker genes are expressed per cell population
- Note: Seurat v5 has additional modalities of integrating cells: (Harmony and scVI, bridge integration across modalities)

## SessionInfo

```{r sessionInfo}
Expand Down
22 changes: 15 additions & 7 deletions rmd/31_SCTransform.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ stim <- ifnb.list[["STIM"]]

Let's remove some redundant objects from memory:

```{r cleanup}
```{r cleanup, eval=FALSE}
rm(list = c("ifnb", "ifnb_sub", "ifnb.list"))
gc()
```
Expand Down Expand Up @@ -101,19 +101,25 @@ We're going to use the SCTransform function with some more recent implementation
library(sctransform)
library(glmGamPoi)
ctrl <- SCTransform(ctrl, method = "glmGamPoi", vst.flavor = "v2", verbose = FALSE)
```

We could've also regressed out covariates during this process.

```{r, eval=FALSE}
# store mitochondrial percentage in object meta data
# ctrl <- PercentageFeatureSet(ctrl, pattern = "^MT-", col.name = "percent.mt")
# ctrl <- SCTransform(ctrl, method = "glmGamPoi", vars.to.regress = "percent.mt", verbose = FALSE)
ctrl <- PercentageFeatureSet(ctrl, pattern = "^MT-", col.name = "percent.mt")
ctrl <- SCTransform(ctrl, method = "glmGamPoi", vars.to.regress = "percent.mt", verbose = FALSE)
```


## Recalculate Dimensional Reductions

We have now gotten a new Assay added to the Seurat object.
All we need to do now is to run PCA and UMAP for visualization in lowD.
We're also going to save the elbow plot for comparison with the log-normed dataset.

```{r sct_umap_calc}
DefaultAssay(ctrl) <- "SCT" # explain
stopifnot(DefaultAssay(ctrl) == "SCT")
ctrl <- RunPCA(ctrl, verbose = FALSE)
ctrl_sct <- ElbowPlot(ctrl, ndims = "30")
```
Expand All @@ -127,7 +133,9 @@ ctrl_ln + ctrl_sct
What has changed and what hasn't?

> 🧭✨ Poll: [What amount of standard deviation does PC1 explain after sc-transform?](https://PollEv.com/multiple_choice_polls/AQwroH3oNpZ2uQRcly2eO/respond)
> Hint: if you'd like to determine it numerically, you can use the `Stdev` accessor function to access the standard deviations for PC components in a Seurat object.
> Hint: Function `Stdev(object)` returns the absolute values of standard deviations for principal components. It's the analogous to `object@reductions$pca@stdev`.



## Revisit UMAP

Expand Down Expand Up @@ -182,7 +190,7 @@ immune.combined.sct <- IntegrateData(anchorset = immune.anchors, normalization.m

Let's cleanup again:

```{r cleanup2}
```{r cleanup2, eval=FALSE}
rm(list = c("ctrl", "stim", "ifnb.list", "features", "all.genes", "immune.anchors"))
gc()
```
Expand Down Expand Up @@ -235,7 +243,7 @@ head(b.interferon.response, n = 10)
```{r DE_plot,fig.height=4, fig.width=8}
Idents(immune.combined.sct) <- "seurat_annotations"
inf_genes <- rownames(b.interferon.response)
DefaultAssay(immune.combined.sct) <- "SCT"
DefaultAssay(immune.combined.sct) == "SCT" # integrated
FeaturePlot(immune.combined.sct,
features = inf_genes[1], split.by = "stim",
cols = c("grey", "red")
Expand Down
1 change: 0 additions & 1 deletion rmd/41_BrainIntegration.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -151,7 +151,6 @@ By default, the column `seurat_clusters` in our `meta.data` slot will point to t
# Inspect Metadata

# Plot UMAP for different resolutions (hint: change group.by= ...)

```

1. Explore the different cluster solutions with respect to their sizes and pairwise relationships (hint: table).
Expand Down
Loading