[BulkRNASeq] Handling Technical Replicates #32

Open
Tracked by #29
J-81 opened this issue Jun 10, 2023 · 2 comments

J-81 commented Jun 10, 2023

Description

Workflow should handle technical replicates appropriately.

Approaches

DESeq2 provides a collapseReplicates function that sums counts across all samples sharing the same level of a grouping factor.
The rationale has two major points:

  1. Summing, as opposed to averaging, preserves the expected Poisson distribution of the counts (the sum of independent Poisson counts is itself Poisson).
  2. DESeq2 is designed to normalize for library size differences. Summing technical replicates is akin to having a higher sequencing depth for a sample.
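
The first point can be written out explicitly. The following is a sketch under the assumption that, for a given gene, each of the n technical replicate counts is an independent Poisson draw; the symbols K_j, lambda_j, S and n are introduced here for illustration only.

```latex
% Assumption: K_j is the count for one gene in technical replicate j.
K_j \sim \mathrm{Pois}(\lambda_j), \quad j = 1, \dots, n \ \text{(independent)}

% Summing keeps the Poisson form, so mean and variance stay equal:
S = \sum_{j=1}^{n} K_j \sim \mathrm{Pois}\Big(\sum_{j=1}^{n} \lambda_j\Big),
\qquad \mathbb{E}[S] = \operatorname{Var}[S] = \sum_{j=1}^{n} \lambda_j

% Averaging rescales the variance by 1/n^2, so the result is underdispersed:
\bar{K} = \frac{S}{n}, \qquad
\mathbb{E}[\bar{K}] = \frac{1}{n}\sum_{j=1}^{n} \lambda_j, \qquad
\operatorname{Var}[\bar{K}] = \frac{1}{n^{2}}\sum_{j=1}^{n} \lambda_j
  = \frac{\mathbb{E}[\bar{K}]}{n}
```

For n > 1 the averaged value has variance smaller than its mean, so it no longer behaves like a raw count.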

Suggested Implementation

Encode Technical Replicate Groups in the Runsheet

Encode technical replicates as a column in the runsheet, using one integer per technical replicate group.
Eventually, this technical replicate column should be derived automatically from ISA archive metadata. In the meantime, a workflow user should be able to supply a two-column CSV mapping each sample name to its technical replicate group, which will be incorporated into the runsheet.
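
As a rough illustration of the interim approach, the sketch below merges a two-column mapping into runsheet rows using only the Python standard library. The column names `Sample Name` and `tech_rep_group`, the function name, and the toy data are all assumptions made for this example and not the workflow's actual runsheet schema.

```python
import csv
import io


def add_tech_rep_groups(runsheet_rows, mapping_rows,
                        sample_col="Sample Name",
                        group_col="tech_rep_group"):
    """Return runsheet rows with a technical replicate group column added.

    Both column names are hypothetical placeholders.
    """
    # Map sample name -> integer technical replicate group.
    groups = {row[sample_col]: int(row[group_col]) for row in mapping_rows}

    # Fail loudly if any runsheet sample lacks a group assignment.
    missing = [row[sample_col] for row in runsheet_rows
               if row[sample_col] not in groups]
    if missing:
        raise ValueError(f"No technical replicate group for: {missing}")

    return [{**row, group_col: groups[row[sample_col]]}
            for row in runsheet_rows]


# Toy inputs standing in for the runsheet and the user-supplied mapping.
runsheet_csv = "Sample Name,condition\nS1_a,FLT\nS1_b,FLT\nS2_a,GC\n"
mapping_csv = "Sample Name,tech_rep_group\nS1_a,1\nS1_b,1\nS2_a,2\n"

runsheet = list(csv.DictReader(io.StringIO(runsheet_csv)))
mapping = list(csv.DictReader(io.StringIO(mapping_csv)))
merged = add_tech_rep_groups(runsheet, mapping)
```

Raising on unmapped samples is one possible design choice; assigning each unmapped sample its own group would be an equally plausible alternative.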

Use the Technical Replicate Group Column in the Runsheet for DESeq2 collapseReplicates

https://rdrr.io/bioc/DESeq2/man/collapseReplicates.html

Validation Plan

  1. Validate that the approach gives reasonable results, as follows:

Run the following approaches:

  • NF_RCP-F_1.0.3 (i.e. no technical replicate handling)
  • collapseReplicates (summed tech. replicates)
  • median replicates
  • mean replicates
  • filter to first replicate only (drop others)
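
The four replicate-handling approaches listed above can be sketched in a few lines. This is a plain-Python illustration of the arithmetic only, not the DESeq2 implementation; the function name, data layout, and toy counts are assumptions for this example.

```python
from statistics import mean, median


def collapse_tech_reps(counts, groups, method="sum"):
    """Collapse technical replicates within each group.

    counts: dict of sample name -> list of per-gene counts
    groups: dict of sample name -> technical replicate group id
    method: "sum", "median", "mean", or "first"
    Returns a dict of group id -> list of collapsed per-gene values.
    """
    reducers = {
        "sum": sum,
        "median": median,
        "mean": mean,
        "first": lambda values: values[0],
    }
    reduce_fn = reducers[method]

    # Collect the samples belonging to each group, preserving input order.
    by_group = {}
    for sample, group in groups.items():
        by_group.setdefault(group, []).append(counts[sample])

    # zip(*replicates) yields, per gene, the counts across replicates.
    return {
        group: [reduce_fn(list(per_gene)) for per_gene in zip(*replicates)]
        for group, replicates in by_group.items()
    }


# Toy data: three technical replicates of S1, one sample for S2.
counts = {
    "S1_a": [10, 0, 5],
    "S1_b": [14, 2, 5],
    "S1_c": [12, 1, 8],
    "S2_a": [7, 3, 9],
}
groups = {"S1_a": 1, "S1_b": 1, "S1_c": 1, "S2_a": 2}
summed = collapse_tech_reps(counts, groups, "sum")
```

Note that the mean and median can produce non-integer values, which would need rounding before being passed to DESeq2, since it expects integer counts.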

Assessment Metrics:

  • DGE results
  2. Regression Test Criteria
  • Core tests should run without change in outcomes (since core tests don't include any technical replicates)

J-81 commented Jun 10, 2023

Implementation Steps

  • Runsheet generation now ingests optional technical replicate group table
  • DESeq2 script updated to use tech. rep group data for four handling approaches

@J-81 J-81 mentioned this issue Jun 10, 2023
3 tasks

J-81 commented Jun 14, 2023

Additional considerations:

  • How to handle technical replicates on multiple levels (e.g. multiple tissue cuts and multiple library preps for same biological sample)
  • Group statistics and sample counts
    • Should these be computed before or after collapsing replicates?
    • The group statistics code, as currently written, needs reworking if it is to run after replicates are collapsed
