Something problem with batch QC #360

jiangli2941 · 2024-12-06T03:31:40Z

I imported six files for a merged analysis. Their shapes, in order, are as follows:
ms_data: {'0': (31381, 23559), '1': (24041, 23059), '2': (26034, 23518), '3': (28167, 23087), '4': (29321, 23074), '5': (29299, 23810)}
num_slice: 6
names: ['0', '1', '2', '3', '4', '5']

I followed the tutorial step by step, but the analysis ran for a long time and appeared to stall at this step:
ms_data.tl.batch_qc(scope=slice_generator[:],mode='integrate', cluster_res_key='leiden', report_path='./batch_qc', res_key='batch_qc')
Output:
[2024-12-05 22:47:24][Stereo][3971300][MainThread][131909463319616][ms_pipeline][113][INFO]: register algorithm batch_qc to <class 'stereo.core.stereo_exp_data.StereoExpData'>-131907745263232
[2024-12-05 23:03:08][Stereo][3971300][MainThread][131909463319616][classifier][144][INFO]: Model Training Finished!
[2024-12-05 23:03:08][Stereo][3971300][MainThread][131909463319616][classifier][145][INFO]: Trained checkpoint file has been saved to ./batch_qc

Because the analysis ran on a cloud server, I waited overnight but still did not get the corresponding output. Two questions: 1. Is there an option to simplify the output? 2. Is there a way to speed up the run?

jiangli2941 · 2024-12-08T15:30:56Z

Another key question: how do you manually annotate cell subpopulations? The tutorial only covers singleR's automatic annotation, which mainly distinguishes immune cells in PBMC. How can I manually annotate other parenchymal cells, such as kidney CD-PC, PODO, and EC, through marker-based clustering?

tanliwei-coder · 2024-12-09T09:05:56Z

I think that if you have a reference dataset covering parenchymal cells, singleR can also be used to annotate them automatically.

tanliwei-coder · 2024-12-09T09:10:47Z

In the same directory as the notebook where you ran BatchQC, there is a subdirectory called batch_qc. It contains an HTML file called BatchQC_reprot_raw.html, which can be opened directly in the notebook.
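
One way to check whether the report has been written yet is to list the HTML files under the report directory. The following is a minimal sketch using only the Python standard library; the helper name find_html_reports is hypothetical, and the directory ./batch_qc is taken from the report_path argument in the original call. It matches any .html file rather than assuming an exact file name.

```python
from pathlib import Path


def find_html_reports(report_dir="./batch_qc"):
    """Return the HTML files found under report_dir, newest first.

    Hypothetical helper, not part of Stereopy. An empty list means that
    no HTML report has been written to that directory (yet).
    """
    root = Path(report_dir)
    if not root.is_dir():
        return []
    reports = [p for p in root.rglob("*.html") if p.is_file()]
    # Sort by modification time so the most recent report comes first.
    return sorted(reports, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    found = find_html_reports("./batch_qc")
    if found:
        for path in found:
            print(path)
    else:
        print("No HTML report found; batch_qc may still be running or may have stopped early.")
```

If a report is listed, it can be rendered inside a Jupyter notebook with IPython.display.IFrame, passing the path together with a width and height.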

jiangli2941 · 2024-12-09T14:49:49Z

But in the batch_qc directory I only found a BatchQC file with the suffix bgi. Does this mean the output failed?

jiangli2941 · 2024-12-10T14:09:37Z

Thank you for the guidance! Could I ask in more detail how to build a custom singleR reference? In R with the Seurat package, I usually take some cell markers and draw a dot plot. How can I do the same in Stereopy?

By the way, I was unable to download the SingleR reference data in h5ad format. What is the format of this data? I look forward to your reply.

jiangli2941 · 2024-12-11T15:24:18Z

I eventually tried singleR annotation with MouseRNAseqData. As I expected, it performed very badly on kidney cells. I then converted the data to rda format and annotated the cells in RStudio using the scCATCH package (an annotation toolkit based on single-cell clusters, covering cluster marker gene identification through cluster annotation by evidence-based scoring). This gave reasonably good annotation results.

So the question is: how can I build my own singleR reference gene set for matching?

I tried three methods:

1. Exporting the cell_table from the scCATCH package and saving it as an h5ad file. This failed because the other information singleR requires was missing.
2. Replacing the contents of MouseRNAseqData in singleR with the gene and cell information from cell_table. This failed because the number of rows did not match.
3. Exporting the standard Stereopy file to h5ad and annotating with scanpy. The exported file seems to lack some necessary information, so the annotation still failed.

In short, I really hope to get the authors' help with annotation. I think this is part of the difficulty that comes with choosing your company's service.

tanliwei-coder · 2024-12-27T07:17:03Z

I guess your data is so large that BatchQC takes more time. I don't have your data, so I cannot judge this accurately.

tanliwei-coder · 2024-12-27T07:34:11Z

h5ad is the file format used to save AnnData objects. The reference only needs to be an h5ad file whose obs has a column representing cell type. When you run singleR, set the parameter ref_use_col to the name of that obs column.

tanliwei-coder · 2024-12-27T07:44:09Z

There is also another method: use Stereopy's clustering methods to cluster the data you want to annotate, inspect the clustering result, and then use the method data.tl.annotation to annotate the clusters manually.
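
The manual route could be sketched roughly as follows: compare each cluster's marker genes against a table of known markers and build a dictionary that maps cluster IDs to cell type names. Everything here is illustrative. The function label_clusters, the marker table, and the example cluster genes are assumptions for the sketch, and simple overlap counting is a stand-in for proper inspection of marker expression.

```python
# Illustrative marker table (an assumption for this sketch). Extend it with
# the markers you trust for your own tissue.
KNOWN_MARKERS = {
    "PODO": {"Nphs1", "Nphs2", "Podxl"},
    "CD-PC": {"Aqp2", "Hsd11b2"},
    "EC": {"Pecam1", "Kdr", "Emcn"},
}


def label_clusters(cluster_markers, known_markers=KNOWN_MARKERS, unknown="unknown"):
    """Map each cluster ID to the cell type sharing the most marker genes.

    cluster_markers maps a cluster ID to that cluster's top marker genes.
    Clusters with no overlap are labelled with `unknown`. Ties go to the
    cell type listed first in known_markers.
    """
    labels = {}
    for cluster, genes in cluster_markers.items():
        genes = set(genes)
        scores = {ct: len(genes & markers) for ct, markers in known_markers.items()}
        best = max(scores, key=scores.get)
        labels[str(cluster)] = best if scores[best] > 0 else unknown
    return labels


# Hypothetical top marker genes per leiden cluster.
cluster_markers = {
    "1": ["Nphs1", "Nphs2", "Wt1"],
    "2": ["Aqp2", "Scnn1g"],
    "3": ["Pecam1", "Emcn"],
    "4": ["Gapdh"],
}
annotation_dict = label_clusters(cluster_markers)
print(annotation_dict)
# {'1': 'PODO', '2': 'CD-PC', '3': 'EC', '4': 'unknown'}

# Assumed follow-up call (parameter names as recalled from the Stereopy
# tutorial; check them against your installed version):
# data.tl.annotation(annotation_information=annotation_dict,
#                    cluster_res_key='leiden', res_key='anno_leiden')
```

The resulting dictionary is the kind of cluster-to-label mapping that data.tl.annotation is meant to receive. Labels marked unknown are best resolved by looking at the cluster's marker genes by hand.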
