Overview

Developed by BioTuring (www.bioturing.com), mdup is a tool that preprocesses cloud-read data (reads that carry a barcode). mdup performs the following steps:

Remove duplicate reads, non-primary reads, secondary alignments, and unmapped reads.
Detect molecules by clustering reads that share the same barcode into groups.
Report statistics on sequencing and GEM performance.

Two reads are considered duplicates if they share the same mapped position, mapped target, CIGAR, and mate information (if paired-end).
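The duplicate rule above can be pictured with a small Python sketch. It is only an illustration, not mdup's C implementation: the `Read` record, its field names, the choice to represent mate information as mate target plus mate position, and the policy of keeping the first read seen are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical record type for illustration; the field names are assumptions,
# not mdup's internal structures.
Read = namedtuple("Read", "name target pos cigar mate_target mate_pos paired")

def duplicate_key(read):
    """Build the tuple that two reads must share to count as duplicates."""
    key = (read.target, read.pos, read.cigar)
    if read.paired:
        # Mate information only takes part in the key for paired-end reads.
        key += (read.mate_target, read.mate_pos)
    return key

def drop_duplicates(reads):
    """Keep the first read seen for each key and skip later ones (assumed policy)."""
    seen = set()
    kept = []
    for read in reads:
        key = duplicate_key(read)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept

# Made-up reads: r2 repeats r1's key, r3 differs only in its CIGAR string.
reads = [
    Read("r1", "chr1", 100, "50M", "chr1", 300, True),
    Read("r2", "chr1", 100, "50M", "chr1", 300, True),
    Read("r3", "chr1", 100, "48M2S", "chr1", 300, True),
]
kept_names = [r.name for r in drop_duplicates(reads)]
print(kept_names)
```

Using a tuple as a set key keeps the comparison to a single hash lookup per read, which is why the sketch builds one key instead of comparing reads pairwise.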

Install

git clone https://github.com/kspham/mdup.git
cd mdup
bash build.sh

Usage

mdup takes a BAM file as input. The BAM file must be sorted by coordinate and indexed. Using BWA to align the cloud-read data to the reference is recommended. Every alignment record must carry a BX:Z: tag holding the barcode.
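Before running mdup, it can help to confirm that every record really has a barcode. The sketch below is a hypothetical pre-check, not part of mdup: it works on SAM text lines (for example the output of `samtools view -h in.bam`), and the function names and sample records are invented for illustration.

```python
def barcode_of(sam_line):
    """Return the BX:Z: barcode of one SAM text record, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory in SAM; optional tags follow them.
    for tag in fields[11:]:
        if tag.startswith("BX:Z:"):
            return tag[len("BX:Z:"):]
    return None

def count_missing_barcodes(sam_lines):
    """Count alignment records (non-header lines) that lack a BX:Z: tag."""
    return sum(
        1 for line in sam_lines
        if not line.startswith("@") and barcode_of(line) is None
    )

# Made-up records for illustration only.
sample = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*\tBX:Z:ACGTACGTACGTACGT-1",
    "r2\t0\tchr1\t150\t60\t50M\t*\t0\t0\t*\t*\tNM:i:0",
]
missing = count_missing_barcodes(sample)
print(missing)
```

A non-zero count means some records would not satisfy the BX:Z: requirement stated above.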

mdup generates the following files in the output directory:

output.bam : new BAM file with the unneeded reads removed.
molecule.tsv : information on all detected molecules.
summary.inf : statistics on sequencing and GEM performance.
plot.html : plots of selected metrics from the statistics.

./mdup [option] in.bam

Optional arguments:
  -t INT                number of threads [default: 1]
  -o DIR                output directory [default: "./mdup_out/"]
  -g FILE               reference file used to generate the BAM file (for better stats)
  -n INT                minimum number of reads required for a molecule [default: 4]
  -l INT                minimum length required for a molecule [default: 1000]
  -k                    do not mark duplicates
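To make the effect of -n and -l concrete, here is a rough Python sketch of molecule filtering. It rests on several assumptions that this README does not confirm: each barcode forms at most one molecule per target, molecule length is measured from the leftmost read start to the rightmost read end, and a molecule must pass both thresholds to be kept. The function names and the sample reads are invented for illustration.

```python
from collections import defaultdict

def molecules_by_barcode(reads):
    """Group (barcode, target, start, end) reads by barcode and target."""
    groups = defaultdict(list)
    for barcode, target, start, end in reads:
        groups[(barcode, target)].append((start, end))
    return groups

def filter_molecules(groups, min_reads=4, min_length=1000):
    """Keep groups that meet both thresholds (defaults mirror -n and -l)."""
    kept = {}
    for key, spans in groups.items():
        # Assumed length definition: leftmost start to rightmost end.
        length = max(end for _, end in spans) - min(start for start, _ in spans)
        if len(spans) >= min_reads and length >= min_length:
            kept[key] = (len(spans), length)
    return kept

# Made-up reads: BX1 passes, BX2 has too few reads, BX3 is too short.
reads = [
    ("BX1", "chr1", 1000, 1150), ("BX1", "chr1", 5000, 5150),
    ("BX1", "chr1", 9000, 9150), ("BX1", "chr1", 20000, 20150),
    ("BX2", "chr1", 1000, 1150), ("BX2", "chr1", 1200, 1350),
    ("BX3", "chr2", 100, 250), ("BX3", "chr2", 200, 350),
    ("BX3", "chr2", 300, 450), ("BX3", "chr2", 400, 550),
]
kept = filter_molecules(molecules_by_barcode(reads))
print(kept)
```

Raising either threshold removes more candidate molecules, so the defaults of 4 reads and 1000 bases act as a floor on what is reported.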

Contacts

Please report any issues directly to the GitHub issue tracker.