Internal parameter default setting (#12)
Showing 17 changed files with 261 additions and 218 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,37 +1,6 @@ | ||
#### REQUIRED #### | ||
bam: "/path/to/bam" # path to the alignments file (BAM/CRAM format) | ||
fai: "/path/to/fai" # path to the referene FASTA FAI file | ||
fai: "/path/to/fai" # path to the reference FASTA FAI file | ||
#### OPTIONAL #### | ||
n_cpus: 1 # number of CPUs to use (parallelized by chromosome) | ||
chr_names: null # list of chromosomes to process: null (all) or a specific list e.g. ["chr1", "chr21"] | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
min_qual_score: 50 # minimum SV confidence/quality score to output | ||
#### FIXED (do not modify) #### | ||
bam_type: "SHORT" | ||
min_sv_len: 4000 | ||
signal_set: "SHORT" | ||
signal_set_origin: "SHORT" | ||
signal_vmax: {"RD": 600, "RD_LOW": 800, "RD_CLIPPED": 600, "SM": 200, "SR_RP": 600, "LR": 600, "LLRR": 100, "RL": 100, "LLRR_VS_LR": 1} | ||
signal_mapq: {"RD": 20, "RD_LOW": 0, "RD_CLIPPED": 20, "SM": 20, "SR_RP": 0, "LR": 0, "LLRR": 1, "RL": 1, "LLRR_VS_LR": 1} | ||
blacklist_bed: null | ||
bed: null | ||
bin_size: 750 | ||
interval_size: 150000 | ||
step_size: 50000 | ||
shift_size: null | ||
min_pair_support: 2 | ||
min_pair_distance: 4000 | ||
max_pair_distance: 1000000 | ||
scan_target_intervals: True | ||
bins_per_block: 8000 | ||
stream: True | ||
min_refine_buffer: 2000 | ||
refine_buffer_frac_size: 5 | ||
refine_pair_dist_frac_size: 2 | ||
refine_bp_kernels: [0, 50, 500] | ||
refine_min_support: 2 | ||
heatmap_dim: 1000 | ||
image_dim: 256 | ||
class_set: "BASIC5ZYG" | ||
num_keypoints: 1 | ||
bbox_padding: 0 | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) |
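The diff above removes the `FIXED (do not modify)` block from the user-facing config, which suggests those values are now supplied as internal defaults. A minimal sketch of that pattern, assuming a plain dictionary overlay; `INTERNAL_DEFAULTS`, `REQUIRED_KEYS`, and `resolve_config` are hypothetical names for illustration and are not taken from the Cue code base:

```python
# Hypothetical internal defaults; the values are copied from the config shown above.
INTERNAL_DEFAULTS = {
    "bin_size": 750,
    "interval_size": 150000,
    "step_size": 50000,
    "min_sv_len": 4000,
    "min_qual_score": 50,
    "logging_level": "INFO",
}

# Parameters the user must always provide (the REQUIRED section of the config).
REQUIRED_KEYS = ("bam", "fai")


def resolve_config(user_config):
    """Overlay user-provided settings on top of the internal defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in user_config]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    resolved = dict(INTERNAL_DEFAULTS)  # copy so the defaults stay untouched
    resolved.update(user_config)        # user values win over defaults
    return resolved


config = resolve_config(
    {"bam": "/path/to/bam", "fai": "/path/to/fai", "min_qual_score": 70}
)
```

With this layout a user file only needs the required paths plus any values it wants to override; everything else falls back to the defaults.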
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,8 @@ | ||
#### REQUIRED #### | ||
model_path: "/path/to/model" # path to the pretrained Cue model | ||
#### OPTIONAL #### | ||
gpu_ids: [] # list of GPU ids to use for calling -- a CPU will be used if empty | ||
gpu_ids: [] # list of GPU ids to use for calling (default: CPU(s) will be used if empty) | ||
n_jobs_per_gpu: 1 # how many parallel jobs to launch on the same GPU | ||
report_interval: 10 # frequency (in number of batches) for reporting training stats and image predictions | ||
pretrained_refinenn_path: null # path to the pretrained keypoint refinement model | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
#### FIXED (do not modify) #### | ||
image_dim: 256 | ||
class_set: "BASIC5ZYG" | ||
signal_set: "SHORT" | ||
num_keypoints: 1 | ||
model_architecture: "HG" | ||
batch_size: 16 | ||
sigma: 10 | ||
stride: 4 | ||
heatmap_peak_threshold: 0.4 | ||
n_cpus: 1 # number of CPUs to use for calling if no GPUs are listed | ||
report_interval: 100 # frequency (in number of batches) for reporting image predictions | ||
batch_size: 16 # number of images per batch |
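The comments on `gpu_ids`, `n_jobs_per_gpu`, and `n_cpus` describe a fallback: GPUs are used when listed, otherwise CPUs. A rough sketch of that selection logic, under the assumption that each job is assigned one device label; `select_devices` and the `"cuda:<id>"` / `"cpu"` labels are illustrative choices, not the tool's actual implementation:

```python
def select_devices(gpu_ids, n_jobs_per_gpu=1, n_cpus=1):
    """Return one device label per parallel job.

    If gpu_ids is non-empty, launch n_jobs_per_gpu jobs on each listed GPU;
    otherwise fall back to n_cpus CPU jobs.
    """
    if gpu_ids:
        return [
            f"cuda:{gpu_id}"
            for gpu_id in gpu_ids
            for _ in range(n_jobs_per_gpu)
        ]
    return ["cpu"] * n_cpus


cpu_jobs = select_devices(gpu_ids=[], n_cpus=2)
gpu_jobs = select_devices(gpu_ids=[0, 1], n_jobs_per_gpu=2)
```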
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
#### REQUIRED #### | ||
bam: "/path/to/bam" # path to the alignments file (BAM/CRAM format) | ||
fai: "/path/to/fai" # path to the reference FASTA FAI file | ||
#### OPTIONAL #### | ||
chr_names: null # list of chromosomes to process: null (all) or a specific list e.g. ["chr1", "chr21"] | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
min_sv_len: 4000 # minimum length of SVs to output | ||
min_qual_score: 50 # minimum SV confidence/quality score to output | ||
blacklist_bed: null # blacklist intervals to filter out SVs | ||
min_pair_support: 2 # minimum number of discordant read pairs to retain an interval pair in targeted mode | ||
min_pair_distance: 4000 # minimum discordant read-pair distance | ||
max_pair_distance: 1000000 # maximum discordant read-pair distance | ||
refine_disable: False # disable SV refinement | ||
bin_size: 750 # size of index bins (in bps) | ||
interval_size: 150000 # size of genome intervals on each axis (in bps) | ||
step_size: 50000 # sliding-window step size in interval generation | ||
stream: True # set to True to enable streaming during targeted indexing (to reduce RAM requirements) | ||
bins_per_block: 8000 # streaming block size | ||
signal_vmax: {"RD": 600, "RD_LOW": 800, "RD_CLIPPED": 600, "SM": 200, "SR_RP": 600, "LR": 600, "LLRR": 100, "RL": 100, "LLRR_VS_LR": 1} | ||
signal_mapq: {"RD": 20, "RD_LOW": 0, "RD_CLIPPED": 20, "SM": 20, "SR_RP": 0, "LR": 0, "LLRR": 1, "RL": 1, "LLRR_VS_LR": 1} |
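To make the relationship between `interval_size`, `step_size`, and `bin_size` concrete, here is a small sketch of sliding-window interval generation. It assumes windows start at position 0 and that the final windows are clipped at the chromosome end; how Cue actually handles chromosome boundaries is not stated in this diff, and `sliding_intervals` is a hypothetical helper:

```python
def sliding_intervals(chr_len, interval_size=150000, step_size=50000):
    """Yield (start, end) windows of up to interval_size bp, advancing by step_size bp."""
    start = 0
    while start < chr_len:
        # Assumption: windows that run past the chromosome end are clipped.
        yield start, min(start + interval_size, chr_len)
        start += step_size


intervals = list(sliding_intervals(chr_len=300000))

# With the default values, a full interval spans 150000 // 750 = 200 index bins.
bins_per_interval = 150000 // 750
```

Because `step_size` is smaller than `interval_size`, consecutive windows overlap, so a given locus is covered by more than one interval.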
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#### REQUIRED #### | ||
bam: "/path/to/bam" # path to the alignments BAM or CRAM file | ||
bed: "/path/to/gt_bed_or_vcf" # path to the ground truth BED or VCF file | ||
fai: "/path/to/fai" # path to the reference FASTA FAI file | ||
#### OPTIONAL #### | ||
n_cpus: 1 # number of CPUs (parallelized by chromosome) | ||
chr_names: null # list of chromosomes to process: null (all) or a specific list e.g. ["chr1", "chr21"] | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
store_img: True # store generated images and labels | ||
allow_empty: False # set to True to include images that don't overlap any SVs | ||
scan_target_intervals: False # set to True to keep only interval pairs with discordant read pairs | ||
stream: False # set to True to enable streaming during targeted indexing (to reduce RAM requirements) | ||
bins_per_block: 8000 # streaming block size | ||
min_pair_support: 2 # minimum number of discordant read pairs to retain an interval pair in targeted mode | ||
min_pair_distance: 4000 # minimum discordant read-pair distance | ||
max_pair_distance: 1000000 # maximum discordant read-pair distance | ||
shift_size: [0, 75000, 150000] # y-interval shifts (set to null for targeted interval pairs) | ||
bin_size: 750 # size of index bins (in bps) | ||
interval_size: 150000 # size of genome intervals on each axis (in bps) | ||
step_size: 50000 # sliding-window step size in interval generation | ||
signal_vmax: {"RD": 600, "RD_LOW": 800, "RD_CLIPPED": 600, "SM": 200, "SR_RP": 600, "LR": 600, "LLRR": 100, "RL": 100, "LLRR_VS_LR": 1} | ||
signal_mapq: {"RD": 20, "RD_LOW": 0, "RD_CLIPPED": 20, "SM": 20, "SR_RP": 0, "LR": 0, "LLRR": 1, "RL": 1, "LLRR_VS_LR": 1} |
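The `shift_size` comment refers to y-interval shifts. One possible reading, sketched below, is that each x-axis interval is paired with y-axis intervals offset downstream by each shift value. This interpretation and the helper name `interval_pairs` are assumptions made for illustration and may not match the tool's behavior:

```python
def interval_pairs(x_start, interval_size=150000, shifts=(0, 75000, 150000)):
    """Pair one x-interval with y-intervals offset by each shift (assumed semantics)."""
    x_interval = (x_start, x_start + interval_size)
    pairs = []
    for shift in shifts:
        y_start = x_start + shift
        pairs.append((x_interval, (y_start, y_start + interval_size)))
    return pairs


pairs = interval_pairs(x_start=0)
```

Under this reading, a shift of 0 pairs an interval with itself, and a shift equal to `interval_size` pairs it with the adjacent, non-overlapping interval.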
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
#### REQUIRED #### | ||
dataset_dirs: ["/path/to/imageset"] | ||
num_epochs: 32 # number of epochs | ||
#### OPTIONAL #### | ||
batch_size: 16 # number of images per batch | ||
gpu_ids: [] # id of the GPU to use for training | ||
pretrained_model: null | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
report_interval: 50 # frequency (in number of batches) for reporting training stats and predictions | ||
model_checkpoint_interval: 10000 # how often to checkpoint the model as it trains | ||
validation_ratio: 0.1 # fraction of the data to use for validation | ||
plot_confidence_maps: False # output the predicted confidence maps | ||
learning_rate: 0.0001 | ||
learning_rate_decay_interval: 5 | ||
learning_rate_decay_factor: 1 | ||
sigma: 10 | ||
stride: 4 | ||
heatmap_peak_threshold: 0.4 |
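The three `learning_rate*` parameters are left without comments in the config. A common way such parameters are combined is a step decay schedule, sketched below under the assumption that the rate is multiplied by `learning_rate_decay_factor` once every `learning_rate_decay_interval` epochs. Whether Cue applies exactly this rule is not confirmed by the diff, and `learning_rate_at` is a hypothetical helper:

```python
def learning_rate_at(epoch, base_lr=0.0001, decay_interval=5, decay_factor=1):
    """Step decay (assumed): multiply base_lr by decay_factor every decay_interval epochs."""
    return base_lr * decay_factor ** (epoch // decay_interval)


# With the default factor of 1 the rate never changes.
constant_lr = learning_rate_at(epoch=12)

# A factor below 1 would shrink the rate at epochs 5, 10, 15, ...
decayed_lr = learning_rate_at(epoch=10, decay_factor=0.5)
```

Under this assumption, the default `learning_rate_decay_factor: 1` amounts to training with a constant learning rate.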
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,27 +1,16 @@ | ||
#### REQUIRED #### | ||
bam: "/path/to/bam" # path to the alignments BAM or CRAM file | ||
bed: "/path/to/gt_bed_or_vcf" # path to the ground truth BED or VCF file | ||
fai: "/path/to/fai" # path to the referene FASTA FAI file | ||
fai: "/path/to/fai" # path to the reference FASTA FAI file | ||
#### OPTIONAL #### | ||
n_cpus: 1 # number of CPUs (parallelized by chromosome) | ||
chr_names: null # list of chromosomes to process: null (all) or a specific list e.g. ["chr1", "chr21"] | ||
allow_empty: False | ||
empty_annotation: False | ||
#### FIXED #### | ||
scan_target_intervals: False | ||
stream: False | ||
store_img: True | ||
bam_type: "SHORT" | ||
signal_set: "SHORT" | ||
signal_set_origin: "SHORT" | ||
class_set: "BASIC5ZYG" | ||
signal_vmax: {"RD": 600, "RD_LOW": 800, "RD_CLIPPED": 600, "SM": 200, "SR_RP": 600, "LR": 600, "LLRR": 100, "RL": 100, "LLRR_VS_LR": 1} | ||
signal_mapq: {"RD": 20, "RD_LOW": 0, "RD_CLIPPED": 20, "SM": 20, "SR_RP": 0, "LR": 0, "LLRR": 1, "RL": 1, "LLRR_VS_LR": 1} | ||
bin_size: 750 | ||
interval_size: 150000 | ||
step_size: 50000 | ||
shift_size: [0, 75000, 150000] | ||
heatmap_dim: 1000 | ||
image_dim: 256 | ||
num_keypoints: 1 | ||
bbox_padding: 0 | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
store_img: True # store generated images and labels | ||
allow_empty: False # set to True to include images that don't overlap any SVs | ||
scan_target_intervals: False # set to True to keep only interval pairs with discordant read pairs | ||
stream: False # set to True to enable streaming during targeted indexing (to reduce RAM requirements) | ||
min_pair_support: 2 # minimum number of discordant read pairs to retain an interval pair in targeted mode | ||
min_pair_distance: 4000 # minimum discordant read-pair distance | ||
max_pair_distance: 1000000 # maximum discordant read-pair distance | ||
shift_size: [0, 75000, 150000] # y-interval shifts (set to null for targeted interval pairs) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,26 +1,11 @@ | ||
#### REQUIRED #### | ||
dataset_dirs: ["/path/to/imageset"] | ||
num_epochs: 32 # number of epochs | ||
#### OPTIONAL #### | ||
gpu_ids: [] | ||
batch_size: 16 | ||
num_epochs: 32 | ||
batch_size: 16 # number of images per batch | ||
gpu_ids: [] # id of the GPU to use for training | ||
pretrained_model: null | ||
logging_level: "INFO" | ||
report_interval: 50 | ||
model_checkpoint_interval: 10000 | ||
plot_confidence_maps: False | ||
validation_ratio: 0.1 | ||
#### FIXED #### | ||
n_jobs_per_gpu: 1 | ||
signal_set: "SHORT" | ||
signal_set_origin: "SHORT" | ||
class_set: "BASIC5ZYG" | ||
image_dim: 256 | ||
num_keypoints: 1 | ||
model_architecture: "HG" | ||
learning_rate: 0.0001 | ||
learning_rate_decay_interval: 5 | ||
learning_rate_decay_factor: 1 | ||
sigma: 10 | ||
stride: 4 | ||
heatmap_peak_threshold: 0.4 | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
report_interval: 50 # frequency (in number of batches) for reporting training stats and predictions | ||
model_checkpoint_interval: 10000 # how often to checkpoint the model as it trains | ||
validation_ratio: 0.1 # fraction of the data to use for validation |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,3 @@ | ||
#### REQUIRED #### | ||
model_path: "../data/demo/models/cue.pt" # path to the pretrained Cue model | ||
#### OPTIONAL #### | ||
gpu_ids: [] # list of GPU ids to use for calling -- a CPU will be used if empty | ||
n_jobs_per_gpu: 1 # how many parallel jobs to launch on the same GPU | ||
report_interval: 5 # frequency (in number of batches) for reporting training stats and image predictions | ||
pretrained_refinenn_path: null # path to the pretrained keypoint refinement model | ||
logging_level: "INFO" # verbosity level (set to "ERROR" to reduce logging volume) | ||
#### FIXED (do not modify) #### | ||
image_dim: 256 | ||
class_set: "BASIC5ZYG" | ||
signal_set: "SHORT" | ||
num_keypoints: 1 | ||
model_architecture: "HG" | ||
batch_size: 16 | ||
sigma: 10 | ||
stride: 4 | ||
heatmap_peak_threshold: 0.4 | ||
n_cpus: 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
__version__ = "v0.2.2" |