Jared Simpson edited this page Nov 25, 2013 · 9 revisions

Since v0.10.10, sga includes a quality control and data exploration module, preqc. It estimates sequence coverage, per-base error rates, genome size, heterozygosity, and repeat content. A full description can be found in this announcement post on the sga mailing list:

https://groups.google.com/forum/#!msg/sga-users/95dTwpJCARU/oKoq54EZqKwJ

A preprint of the preqc manuscript is available on arXiv:

http://arxiv.org/abs/1307.8026

To generate a preqc report for your data, run these four commands:

sga preprocess *.fastq > mygenome.fastq
sga index -a ropebwt --no-reverse -t 4 mygenome.fastq
sga preqc -t 8 mygenome.fastq > mygenome.preqc
sga-preqc-report.py mygenome.preqc sga/src/examples/*.preqc
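
If you run this for several samples, it can help to generate the four commands from a prefix and review them before executing. The following is a minimal sketch, not part of sga: the function name `preqc_plan`, its two arguments, and the defaults are hypothetical, and it assumes a single thread count for both `sga index` and `sga preqc` (the commands above use 4 and 8). It only prints the commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: print the four preqc commands for a given output prefix and
# thread count so they can be reviewed before running.
# preqc_plan and its arguments are hypothetical helpers for illustration;
# only the sga command lines themselves come from the steps above.

preqc_plan() {
    prefix="${1:-mygenome}"   # base name for the .fastq and .preqc outputs
    threads="${2:-4}"         # assumption: same -t value for index and preqc

    # Globs inside double quotes are printed literally, not expanded here.
    printf '%s\n' \
        "sga preprocess *.fastq > ${prefix}.fastq" \
        "sga index -a ropebwt --no-reverse -t ${threads} ${prefix}.fastq" \
        "sga preqc -t ${threads} ${prefix}.fastq > ${prefix}.preqc" \
        "sga-preqc-report.py ${prefix}.preqc sga/src/examples/*.preqc"
}

preqc_plan mygenome 4
```

To actually run the printed commands, one option is to save them to a file and execute it with `sh -e`, which stops at the first command that fails, for example `preqc_plan mygenome 4 > run_preqc.sh && sh -e run_preqc.sh`.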