Will Casazza edited this page Oct 22, 2017 · 7 revisions

What We're Thinking of Building

Features

A tool that is capable of:

  • Importing results from different fusion-calling tools (or converting the results of a set of tools into a common format)
    • Initial idea: have team members write simple parsers to convert tool output formats to a common standard (most likely BEDPE)
  • From the common file format, aggregating results from different tools into a consensus set
    • Initial idea: require some level of minimum overlap between features
    • Handling duplicate or 'synonym' calls might be difficult here
  • Augmenting the consensus set with additional information from the RNA-Seq data set or existing tools
    • Initial idea: add gene expression data for candidate fusion partners
    • Initial idea: use Oncofuse?
  • Importing information from existing databases
    • Initial idea: start from a dump of CIViC, filtered for fusions
    • Initial idea: use ReCount to provide a view of how common the fusion junction is in different data sets
  • Review interface
    • Some kind of dashboard or web view for 'reviewers' to view evidence associated with particular fusions
  • Visualization
    • TBD
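As a rough illustration of the first two features (conversion to a common format, then consensus by overlap), here is a minimal sketch. It is written in Python for brevity rather than the tidyverse-style R mentioned under Implementation, and it is not the project's actual design. The `Breakpoint` record, the `slop` and `min_tools` parameters, and the tool names used in the usage note are all hypothetical. Calls are clustered greedily against the first member of each cluster, and reciprocal or 'synonym' calls are not handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Breakpoint:
    """One fusion call reduced to two 1-based breakpoint positions (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    tool: str  # name of the caller that produced this call


def to_bedpe(bp: Breakpoint, name: str = ".") -> str:
    """Render a call as a 10-column BEDPE line.

    Each 1-based position becomes a 0-based, half-open 1 bp interval.
    Score and strands are left as '.' because this sketch does not track them.
    """
    fields = [bp.chrom1, bp.pos1 - 1, bp.pos1,
              bp.chrom2, bp.pos2 - 1, bp.pos2,
              name, ".", ".", "."]
    return "\t".join(str(f) for f in fields)


def same_fusion(a: Breakpoint, b: Breakpoint, slop: int) -> bool:
    """True if both breakpoints lie on the same chromosomes within `slop` bp."""
    return (a.chrom1 == b.chrom1 and a.chrom2 == b.chrom2
            and abs(a.pos1 - b.pos1) <= slop
            and abs(a.pos2 - b.pos2) <= slop)


def consensus(calls: List[Breakpoint], slop: int = 10,
              min_tools: int = 2) -> List[List[Breakpoint]]:
    """Group nearby calls and keep groups supported by at least `min_tools` distinct tools."""
    clusters: List[List[Breakpoint]] = []
    ordered = sorted(calls, key=lambda c: (c.chrom1, c.pos1, c.chrom2, c.pos2))
    for call in ordered:
        for cluster in clusters:
            # Compare against the first member only (greedy, order-dependent).
            if same_fusion(cluster[0], call, slop):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return [c for c in clusters if len({x.tool for x in c}) >= min_tools]
```

For example, with made-up coordinates, calls at chr9:1000/chr22:2000 from "toolA" and chr9:1003/chr22:1998 from "toolB" fall into one cluster under the default `slop` of 10, while a call reported by only one tool is dropped. A real implementation would also need to decide how to treat strand, swapped partner order, and gene-level rather than coordinate-level matching.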

NOTE that we will almost certainly not be able to tackle all aspects of this project during HackSeq 2017. We'll meet at the outset of the project to refine the scope and pick stories to work on during the Hackathon.

Implementation

  • Currently, this repository is set up as an R package. The initial idea was to implement most of the components in 'tidyverse'-style R, but this is not a hard requirement (it might not make sense for some parts)

Initial Data Sets

  • See data_sets
  • Small test sets from Chimeraviz
  • Synthetic test set containing 9 known fusions (from FusionCatcher)
  • AML cell line RNA-Seq data (3 replicates)