Cell type order file #1

Open
RemyLau opened this issue Aug 15, 2023 · 2 comments
Comments

RemyLau commented Aug 15, 2023

Hi @michellemli! Love your work. Is there any way I can find out the corresponding cell type in the provided embeddings on figshare? Or is it safe to assume that it's just the alphabetically sorted list?

RemyLau commented Aug 16, 2023

After some investigation, my current belief is that the embeddings follow the filesystem's "default" file ordering (not sorted).

In the reader function, glob was used to iterate over the contexts; this order gets propagated into the ppi_layers dictionary. Meanwhile, glob returns file names in unsorted default order (here).

Finally, since the GAT modules are constructed in the order of the default ppi order (here), and the ppi data follows the same order as ppi_layers (here), we can conclude that the generated embeddings follow the "default" system ordering of the context ppi files.
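To make the chain of reasoning above concrete, here is a minimal sketch of how the glob order would end up as the embedding index. This is not the repository's code: `read_contexts`, `index_to_context`, and the `*.txt` pattern are hypothetical stand-ins, and the real reader loads PPI networks rather than storing paths.

```python
import glob
import os


def read_contexts(directory, pattern="*.txt"):
    """Hypothetical stand-in for the reader: one dict entry per context file."""
    ppi_layers = {}
    # glob.glob returns paths in whatever order the OS lists them (not sorted).
    for path in glob.glob(os.path.join(directory, pattern)):
        context = os.path.splitext(os.path.basename(path))[0]
        ppi_layers[context] = path  # the real reader would load the network here
    return ppi_layers


def index_to_context(ppi_layers):
    """Map position i to the context name inserted i-th.

    Dicts preserve insertion order (Python 3.7+), so anything built by
    iterating over ppi_layers inherits the glob order.
    """
    return dict(enumerate(ppi_layers))
```

Under these assumptions, `index_to_context(read_contexts(some_dir))` gives the index-to-cell-type mapping for the machine it runs on. It only matches the published embeddings if the directory listing order is the same as on the machine that generated them.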

I'm not fully confident that this ordering is consistent across systems, e.g., will it be the same if I download the files on my own machine? For now, I'm going to assume the answer is yes.

However, in the same StackOverflow thread, someone pointed out that the ordering is generally not guaranteed:

glob.glob() is a wrapper around os.listdir() so the underlaying OS is in charge for delivering the data. In general: you can not make an assumption on the ordering here. The basic assumption is: no ordering. If you need some sorting: sort on the application level.
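For reference, "sort on the application level" could look like the sketch below. The function names and the `*.txt` pattern are my own assumptions, not something from the repository.

```python
import glob
import os


def sorted_context_files(directory, pattern="*.txt"):
    """Return context file paths in a deterministic (lexicographic) order."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


def context_index(directory, pattern="*.txt"):
    """Map each context name (file name without extension) to its position."""
    files = sorted_context_files(directory, pattern)
    return {
        os.path.splitext(os.path.basename(path))[0]: i
        for i, path in enumerate(files)
    }
```

This would make the order reproducible for anyone regenerating the embeddings. It does not recover the order of embeddings that were already generated from an unsorted listing.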

Sophon-0 commented Jul 6, 2024

Is it possible to provide a script/function that returns the embedding given the cell type and the tissue as input?
