|
39 | 39 | and cancer tumor/normal pairings. However, rapidly changing best |
40 | 40 | practice approaches in alignment and variant calling, coupled with |
41 | 41 | large data sizes, make it a challenge to develop scalable, accurate |
42 | | -pipelines. Coordinated community development overcomes these |
| 42 | +pipelines that can remain up to date. Coordinated community development overcomes these |
43 | 43 | challenges by sharing testing and updates across groups relying on the |
44 | 44 | same infrastructure. |
45 | 45 |
|
@@ -69,7 +69,7 @@ \section*{Introduction} |
69 | 69 | mechanism to assess variant quality and interfaces with downstream tools for |
70 | 70 | variant analysis. Practically, it installs with a single command on multiple |
71 | 71 | computing architectures, scales to large whole genome analyses, and is community |
72 | | -developed. The goal is to provide a platform for moving from raw sequencing data |
| 72 | +developed. The goal is to provide a robust platform for moving from raw sequencing data |
73 | 73 | to high-quality variant calls that evolves as algorithms and sequencing |
74 | 74 | technologies change. |
75 | 75 |
|
@@ -116,7 +116,8 @@ \section*{Introduction} |
116 | 116 |
|
117 | 117 | \item Community developed: Due to the focus on solving the problems |
118 | 118 | of setting up and maintaining a complex analysis pipeline, multiple |
119 | | - sequencing centers and research laboratories use bcbio-nextgen. We |
| 119 | + sequencing centers and research laboratories use bcbio-nextgen <<<SUCH AS |
| 120 | + AND REFER TO A TABLE OF THE SITES AT WHICH IT IS EMPLOYED TOGETHER WITH THE ARCHITECTURES>>>>. We |
120 | 121 | actively encourage contributors to the code base and make it easy to |
121 | 122 | get started with a fully automated installer and updater that |
122 | 123 | prepares all third party software and reference genomes. |
@@ -213,9 +214,9 @@ \section*{Validation} |
213 | 214 | calling without recalibration and realignment, both HaplotypeCaller and |
214 | 215 | FreeBayes perform as well or better without these steps. |
215 | 216 |
|
216 | | -The main benefit of validation is to enables experiments that quantitatively |
| 217 | +The main benefit of validation is to enable experiments that quantitatively |
217 | 218 | assess widely held approaches. We expect best practices to change with new |
218 | | -releases and algorithms, and the automated assessment mechanism allows |
| 219 | +releases and algorithms. The automated assessment mechanism allows |
219 | 220 | bcbio-nextgen to track and adapt to continuously improving tools. |
220 | 221 |
|
221 | 222 | \FloatBarrier |
@@ -265,7 +266,7 @@ \section*{Scaling} |
265 | 266 | memory usage and disk IO to maximize the throughput of multiple simultaneous |
266 | 267 | processes. An input configuration file specifies available memory usage for |
267 | 268 | programs that allow memory restrictions, and expected memory usage for those |
268 | | -that do not. These inputs allow an accurate estimate of memory consumption and |
| 269 | +that do not. These inputs allow for an accurate estimate of memory consumption and |
269 | 270 | bcbio-nextgen avoids overscheduling jobs relative to available memory on each |
270 | 271 | machine. Similarly, simultaneous disk IO on shared filesystems is a common |
271 | 272 | bottleneck during processing. bcbio-nextgen minimizes this by use of streaming |