Failed to remove batch effect #358

Closed
Somnvs opened this issue Dec 2, 2024 · 8 comments

Comments

@Somnvs

Somnvs commented Dec 2, 2024

Hello, I am having trouble analyzing my spatial transcriptome data. I followed the official tutorial on removing batch effects (https://stereopy.readthedocs.io/en/latest/Tutorials%28Multi-sample%29/Batch_Effect.html#Integrating); however, in my results the samples remain separated and do not overlap.

image

I’m wondering whether I overlooked a specific parameter, which could explain why the results look so odd.

The script for my analysis is as follows:

import stereo as st
import warnings
warnings.filterwarnings('ignore')

import csv
import os

import pandas as pd

# outPrefix="/home/weigs/Project/01_LilinSpatial/14_mergeSurgery/10_filter9SampleBin40/result"
# outDir="/home/weigs/Project/01_LilinSpatial/14_mergeSurgery/10_filter9SampleBin40"
outPrefix="/home/weigs/Project/04_Liuzk_spatial/06_stRnaMergeSample/01_mergeSampleBin100/result"
outDir="/home/weigs/Project/04_Liuzk_spatial/06_stRnaMergeSample/01_mergeSampleBin100"
# tableFile="/home/weigs/Project/01_LilinSpatial/14_mergeSurgery/000fileListFirstColumn.txt"
binSize=100
# binSize=50

neighborNum=6 # default 6

pcNum=30 # default 30
findNeighborNum=10 # default 10

gem1="/home/weigs/Project/04_Liuzk_spatial/01_dataInfo/02_stR_data/D01567F5.tissue.gem.gz"
gem2="/home/weigs/Project/04_Liuzk_spatial/01_dataInfo/02_stR_data/D01656A1.tissue.gem.gz"
gem3="/home/weigs/Project/04_Liuzk_spatial/01_dataInfo/02_stR_data/D01656B3.tissue.gem.gz"
gem4="/home/weigs/Project/04_Liuzk_spatial/01_dataInfo/02_stR_data/D01656C3.tissue.gem.gz"


data1 = st.io.read_gem(file_path=gem1, bin_size=binSize)
data2 = st.io.read_gem(file_path=gem2, bin_size=binSize)
data3 = st.io.read_gem(file_path=gem3, bin_size=binSize)
data4 = st.io.read_gem(file_path=gem4, bin_size=binSize)


data1.tl.cal_qc()
data2.tl.cal_qc()
data3.tl.cal_qc()
data4.tl.cal_qc()


data = st.utils.data_helper.merge(data1, data2, data3, data4)
# check the shape of merged data
data.shape

# Since normalization will change the expression matrix, save raw data beforehand.
data.tl.raw_checkpoint()

data.tl.normalize_total()
data.tl.log1p()

data.tl.pca(use_highly_genes=False, n_pcs=pcNum, res_key='pca')
# data.tl.pca(use_highly_genes=True, n_pcs=pcNum, res_key='pca')
data.tl.batches_integrate(pca_res_key='pca', res_key='pca_integrated')

data.tl.neighbors(pca_res_key='pca_integrated', n_pcs=pcNum, res_key='neighbors_integrated', n_neighbors=findNeighborNum)
# compute spatial neighbors
data.tl.spatial_neighbors(
    neighbors_res_key='neighbors_integrated',
    res_key='spatial_neighbors_integrated',
    n_neighbors=neighborNum
)

data.tl.umap(pca_res_key='pca_integrated', neighbors_res_key='spatial_neighbors_integrated', res_key='umap_integrated')
data.plt.batches_umap(res_key='umap_integrated')

@tanliwei-coder
Collaborator

Try running umap with neighbors instead of spatial_neighbors.
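A minimal sketch of that change, reusing the result keys from the script above. It assumes `data` is the merged object on which `batches_integrate` and `neighbors` have already been run; the helper name `umap_from_expression_neighbors` is hypothetical and only wraps the two calls already used in the script.

```python
def umap_from_expression_neighbors(data,
                                   pca_key='pca_integrated',
                                   neighbors_key='neighbors_integrated',
                                   umap_key='umap_integrated'):
    """Compute and plot UMAP from the expression-based neighbor graph.

    Hypothetical helper: `data` is assumed to be the merged object from the
    script above, with `batches_integrate` and `neighbors` already run.
    """
    # Point UMAP at the graph built by data.tl.neighbors rather than the
    # one produced by data.tl.spatial_neighbors.
    data.tl.umap(pca_res_key=pca_key,
                 neighbors_res_key=neighbors_key,
                 res_key=umap_key)
    # Same plotting call as in the original script.
    data.plt.batches_umap(res_key=umap_key)
    return umap_key
```

With this variant the `data.tl.spatial_neighbors(...)` step can be skipped entirely, since its result is no longer consumed by UMAP.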

@Somnvs
Author

Somnvs commented Dec 5, 2024

> Try to run umap with neighbors instead of spatial_neighbors.

Thank you for your response. I tried running umap with neighbors instead of spatial_neighbors, but the results didn't change significantly.

@tanliwei-coder
Collaborator

Let's look at the spatial distribution of the four samples. Run data.plt.spatial_scatter on the data object merged from the four samples:

data.plt.spatial_scatter(reorganize_coordinate=2)
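For readers unfamiliar with this option: assuming `reorganize_coordinate=2` lays the merged samples out in a grid two columns wide (an assumption about stereopy's behaviour, not something confirmed in this thread), the idea can be illustrated with plain NumPy. The function `tile_samples` and its `gap` parameter are hypothetical and are not part of stereopy.

```python
import numpy as np

def tile_samples(coords, batches, n_cols=2, gap=0.0):
    """Shift each sample's (x, y) coordinates onto a grid with n_cols columns."""
    coords = np.asarray(coords, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(coords)
    labels = np.unique(batches)
    # Move every sample to its own origin and record its extent.
    extents = {}
    for b in labels:
        mask = batches == b
        local = coords[mask] - coords[mask].min(axis=0)
        out[mask] = local
        extents[b] = local.max(axis=0)
    # Use one cell size for the whole grid: the largest sample plus the gap.
    cell = np.max(list(extents.values()), axis=0) + gap
    for i, b in enumerate(labels):
        row, col = divmod(i, n_cols)
        out[batches == b] += np.array([col, row]) * cell
    return out
```

Plotting the tiled coordinates coloured by batch shows each chip in its own grid cell, which makes it easy to spot a sample that is empty, truncated, or much larger than the others.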

@Somnvs
Author

Somnvs commented Dec 9, 2024

I plotted the spatial scatter; the result is as follows:
image

@tanliwei-coder
Collaborator

Run BatchQC to check whether batch effect removal is needed.
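As a quick complementary check (this is not BatchQC and does not reproduce its report), one can measure how often a spot's nearest neighbors in the embedding come from the same sample. The sketch below uses only NumPy; the function name `same_batch_neighbor_fraction` is hypothetical, and how the embedding and batch labels are extracted from the stereopy object is left as an assumption.

```python
import numpy as np

def same_batch_neighbor_fraction(embedding, batches, k=10):
    """Mean fraction of each point's k nearest neighbors that share its batch."""
    X = np.asarray(embedding, dtype=float)
    batches = np.asarray(batches)
    # Pairwise squared Euclidean distances (dense, so only suitable for a
    # few thousand points; subsample larger datasets first).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbors
    nn = np.argsort(d2, axis=1)[:, :k]
    same = batches[nn] == batches[:, None]
    return float(same.mean())
```

A value close to 1 means the samples form separate groups in the embedding. For equally sized, well-mixed samples the value should fall to roughly one divided by the number of samples. Comparing the value on the uncorrected and the integrated PCA gives a rough indication of whether integration changed anything.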

@Somnvs
Author

Somnvs commented Dec 11, 2024

Sorry for the delayed response over the past two days; I took some time to run the BatchQC program. The final results indicate that batch effect correction is necessary for this dataset. Attached is the BatchQC report.
BatchQC_report_raw.zip

@tanliwei-coder
Collaborator

Sorry for the late reply. Our batch effect removal is based on harmonypy, and the paper about this algorithm is here. I suspect this algorithm does not work on all data.

@Somnvs
Author

Somnvs commented Dec 30, 2024

That's OK. I hope you can develop and integrate more algorithms and technologies in the future to handle cases like this one, which currently cannot be resolved.
