pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects #171

Open
Quentin-bioinfo opened this issue Sep 19, 2024 · 4 comments

@Quentin-bioinfo

Describe the bug
After creating a cisTopic object for each sample and merging them, I want to call cistopic_obj.add_cell_data(cell_data, split_pattern='-').

I get an error saying "Reindexing only valid with uniquely valued Index objects".

But there are no duplicated index values in my cell data annotation. When I preprocessed the scRNA-seq data, I used:
matrix.obs_names_make_unique()
and I also tried adding:
matrix.obs_names = [f"{idx}_{sample}" for idx in matrix.obs_names]
before concatenating all the samples into the object I used to cluster the cells and generate the cell data annotation.

I don't see where this error could come from.

To Reproduce


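import os

import polars as pl
from pycisTopic.cistopic_class import create_cistopic_object_from_fragments

# Build one cisTopic object per sample
cistopic_obj_list = []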
for sample_id in fragments_dict:
    sample_metrics = pl.read_parquet(
        os.path.join(pycistopic_qc_output_dir, f'{sample_id}.fragments_stats_per_cb.parquet')
    ).to_pandas().set_index("CB").loc[ sample_id_to_barcodes_passing_filters[sample_id] ]
    cistopic_obj = create_cistopic_object_from_fragments(
        path_to_fragments = fragments_dict[sample_id],
        path_to_regions = path_to_regions,
        path_to_blacklist = path_to_blacklist,
        metrics = sample_metrics,
        valid_bc = sample_id_to_barcodes_passing_filters[sample_id],
        n_cpu = 1,
        project = sample_id,
        split_pattern = '-'
    )
    cistopic_obj_list.append(cistopic_obj)



cistopic_obj = cistopic_obj_list[0]
cistopic_obj.merge(cistopic_obj_list[1:])


import pickle

# Save the merged object; the with-block closes the file handle
with open(os.path.join(out_dir, "cistopic_obj.pkl"), "wb") as f:
    pickle.dump(cistopic_obj, f)

import pandas as pd
cell_data = pd.read_csv('../Data/scanpy/cell_annotation_data.csv', index_col = 0)
cistopic_obj.add_cell_data(cell_data, split_pattern='-')
with open(os.path.join(out_dir, "cistopic_obj.pkl"), "wb") as f:
    pickle.dump(cistopic_obj, f)

Error output

Traceback (most recent call last):
  File "Notebook/pycistonic.py", line 77, in <module>
    cistopic_obj.add_cell_data(cell_data, split_pattern='-')
  File "/home/user/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cistopic_class.py", line 136, in add_cell_data
    new_cell_data = pd.concat([obj_cell_data, cell_data], axis=1, sort=False)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/util/_decorators.py", line 317, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 382, in concat
    return op.get_result()
           ^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/reshape/concat.py", line 613, in get_result
    indexers[ax] = obj_labels.get_indexer(new_labels)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3902, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
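
For reference, the same exception can usually be reproduced with pandas alone. The sketch below uses invented barcodes and column names as hypothetical stand-ins for the two frames passed to pd.concat in the traceback; it is not pycisTopic's code, and duplicated_labels is a hypothetical helper. It illustrates that the error can be triggered by duplicate labels in either frame, including the object's own cell data, and not only by the annotation table. Whether that is what happens in this issue is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the object's cell data: the same barcode
# appears twice, as could happen if two samples share a barcode.
obj_cell_data = pd.DataFrame(
    {"n_fragments": [100, 200, 300]},
    index=["AAAC-1", "AAAC-1", "TTTG-1"],
)

# Hypothetical annotation table with a unique index.
cell_data = pd.DataFrame(
    {"cell_type": ["B", "T", "NK"]},
    index=["AAAC-1", "TTTG-1", "GGGC-1"],
)


def duplicated_labels(frame):
    """Return the sorted index labels that occur more than once in `frame`."""
    idx = frame.index
    return sorted(set(idx[idx.duplicated(keep=False)]))


try:
    pd.concat([obj_cell_data, cell_data], axis=1, sort=False)
    concat_failed = False
except (pd.errors.InvalidIndexError, ValueError) as err:
    # The exact exception type and message depend on the pandas version.
    concat_failed = True
    print(type(err).__name__, err)

# Check both sides, not only the annotation table.
print("object side:", duplicated_labels(obj_cell_data))
print("annotation side:", duplicated_labels(cell_data))
```

Running the same check on the real frames (the merged object's cell data and the CSV annotation) would show which side, if any, carries repeated labels.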


@yojetsharma

yojetsharma commented Oct 3, 2024

I posted about a similar issue. The workaround for that was omitting split_pattern. Unfortunately, that leads to NaNs in my sample_id and cell_type columns.
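
NaNs of this kind are what pd.concat with axis=1 produces when the two indexes share no labels. The sketch below shows that effect in pandas only, with invented barcodes. The "___" separator and the sample_id column are assumptions made for illustration, so the real labels on both sides should be inspected before renaming anything.

```python
import pandas as pd

# Hypothetical labels: the object side carries a sample suffix,
# the annotation side does not.
obj_cell_data = pd.DataFrame(
    {"n_fragments": [100, 200]},
    index=["AAAC-1___s1", "TTTG-1___s1"],
)
cell_data = pd.DataFrame(
    {"cell_type": ["B", "T"], "sample_id": ["s1", "s1"]},
    index=["AAAC-1", "TTTG-1"],
)

# No shared labels, so the outer join pads every annotation column with NaN.
shared = obj_cell_data.index.intersection(cell_data.index)
merged = pd.concat([obj_cell_data, cell_data], axis=1, sort=False)
print(len(shared))
print(merged.loc[obj_cell_data.index, "cell_type"].isna().all())

# Rebuild the annotation index in the (assumed) barcode___sample form.
aligned = cell_data.copy()
aligned.index = [
    f"{barcode}___{sample}"
    for barcode, sample in zip(aligned.index, aligned["sample_id"])
]
aligned_merged = pd.concat([obj_cell_data, aligned], axis=1, sort=False)
print(aligned_merged)
```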

@yojetsharma

Could this be due to pandas 2.0?

@Quentin-bioinfo
Author

I think the tutorial isn't clear on that part. But I was able to move forward by adding the metadata to each individual cisTopic object and only then merging them.
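
A minimal sketch of that order of operations follows. It is not the author's actual code. It uses only the add_cell_data and merge calls shown in the issue, and assumes that each object exposes its sample name as a project attribute and that the annotation table has a sample_id column. The helper name annotate_then_merge and the sample_column parameter are hypothetical.

```python
import pandas as pd


def annotate_then_merge(
    cistopic_obj_list,
    cell_data: pd.DataFrame,
    sample_column: str = "sample_id",  # assumed column name
    split_pattern: str = "-",
):
    """Add per-sample metadata to each object, then merge them in place."""
    if not cistopic_obj_list:
        raise ValueError("cistopic_obj_list is empty")

    for obj in cistopic_obj_list:
        # Keep only the annotation rows belonging to this sample
        # (assumes obj.project holds the sample name).
        per_sample = cell_data[cell_data[sample_column] == obj.project]
        obj.add_cell_data(per_sample, split_pattern=split_pattern)

    # Same merge pattern as in the issue: the first object absorbs the rest.
    merged = cistopic_obj_list[0]
    merged.merge(cistopic_obj_list[1:])
    return merged
```

Annotating before merging means each add_cell_data call only sees barcodes from one sample, so barcodes repeated across samples cannot collide at that step.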

@yojetsharma

Can you share the code showing how you did it?
