size of res_df doesn't equal size of clr pixel table? #1

@gfudenberg

from https://github.com/open2c/open2c_vignettes/blob/main/sparse_eigendecomp.ipynb

It is unclear why this code:

# collect obs/exp for chunks of pixel table (in memory for 1Mb cooler)
results = []
for oe_chunk in obs_over_exp_generator(
        clr,
        expected_df,
        view_df=hg38_arms,
        expected_column_name='expected',
        oe_column_name='oe',
        chunksize=1_000_000,
    ):
    results.append(oe_chunk)
# concat chunks into single DataFrame - res_df - is a new pixel table - sparse matrix
res_df = pd.concat(results, ignore_index=True)

produces a res_df with a different number of rows than the original clr pixel table.
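
To check which pixels went missing, one option is to compare the two tables on (bin1_id, bin2_id). The snippet below is only a sketch on toy data: `pixels` stands in for the original pixel table (e.g. `clr.pixels()[:]`), `res_df` for the concatenated result, and the rows that are absent here are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the original pixel table (hypothetical values).
pixels = pd.DataFrame({
    "bin1_id": [0, 0, 1, 1, 2],
    "bin2_id": [0, 1, 1, 2, 2],
    "count":   [5, 3, 7, 2, 4],
})

# Toy stand-in for res_df: every pixel touching bin 1 is absent.
res_df = pd.DataFrame({
    "bin1_id": [0, 2],
    "bin2_id": [0, 2],
    "count":   [5, 4],
    "oe":      [1.1, 0.9],
})

# A left merge with indicator=True flags pixels that have no match in res_df.
merged = pixels.merge(
    res_df[["bin1_id", "bin2_id", "oe"]],
    on=["bin1_id", "bin2_id"],
    how="left",
    indicator=True,
)
missing = merged.loc[merged["_merge"] == "left_only", ["bin1_id", "bin2_id", "count"]]

# Bins that appear in the original table but never in res_df.
bins_in_pixels = set(pixels["bin1_id"]) | set(pixels["bin2_id"])
bins_in_res = set(res_df["bin1_id"]) | set(res_df["bin2_id"])
absent_bins = sorted(bins_in_pixels - bins_in_res)

print(f"{len(pixels)} pixels in clr, {len(res_df)} in res_df, {len(missing)} missing")
print("bins absent from res_df:", absent_bins)
```

If the absent bins turn out to be the ones with NaN balancing weights, that would support the bad-bins explanation; if not, something else is filtering pixels.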

@sergpolly: this is most likely because bad_bins are dropped.

  • Q: should we drop NaNs when making res_df, or not?
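
To make the two options concrete, here is a toy comparison of keeping vs. dropping NaN pixels. This is not how `obs_over_exp_generator` is implemented: the weights, the per-diagonal expected values, and the way `oe` is computed are all assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Toy pixel table (hypothetical values).
pixels = pd.DataFrame({
    "bin1_id": [0, 0, 1, 1, 2],
    "bin2_id": [0, 1, 1, 2, 2],
    "count":   [5, 3, 7, 2, 4],
})

# Hypothetical per-bin balancing weights; NaN marks a bad bin (bin 1 here).
weights = np.array([1.0, np.nan, 0.5])

# Hypothetical expected value per diagonal (distance in bins).
expected_by_dist = {0: 4.0, 1: 2.0}

b1 = pixels["bin1_id"].to_numpy()
b2 = pixels["bin2_id"].to_numpy()
balanced = pixels["count"].to_numpy() * weights[b1] * weights[b2]
expected = (pixels["bin2_id"] - pixels["bin1_id"]).map(expected_by_dist).to_numpy()

# Option A: keep NaNs, so rows stay aligned 1:1 with the original pixel table.
kept = pixels.assign(oe=balanced / expected)

# Option B: drop NaNs, so the table is smaller but no longer matches clr in shape.
dropped = kept.dropna(subset=["oe"]).reset_index(drop=True)

print("keep NaNs:", kept.shape, "| drop NaNs:", dropped.shape)
```

Keeping the NaNs makes it straightforward to attach `oe` back to the pixel table as an extra column, while dropping them saves memory and gives the size mismatch reported above.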

cc @nvictus
