Add pick_count option - number of results to return by pick #1733

Draft: wants to merge 2 commits into base: main
2 changes: 2 additions & 0 deletions modules/Bio/EnsEMBL/VEP/Config.pm
@@ -115,6 +115,7 @@ our @VEP_PARAMS = (
   'flag_pick_allele_gene', # flag one con per gene, allele
   'flag_gencode_primary', # flag gencode primary transcripts
   'pick_order=s', # define the order of categories used by the --*pick* flags
+  'pick_count=i', # number of transcripts returned by the --*pick* flags
   'buffer_size=i', # number of variations to read in before analysis
   'failed=i', # include failed variations when finding existing
   'gp', # read coords from GP part of INFO column in VCF (probably only relevant to 1KG)
@@ -303,6 +304,7 @@ our %DEFAULTS = (
   core_type => 'core',
   polyphen_analysis => 'humvar',
   pick_order => [qw(mane_select mane_plus_clinical canonical appris tsl biotype ccds rank length ensembl refseq )],
+  pick_count => 1,
   terminal_width => 48,
   vcf_info_field => 'CSQ',
   ucsc_data_root => 'http://hgdownload.cse.ucsc.edu/goldenpath/',
16 changes: 6 additions & 10 deletions modules/Bio/EnsEMBL/VEP/OutputFactory.pm
@@ -150,6 +150,7 @@ sub new {
     output_format
     no_escape
     pick_order
+    pick_count
     allele_number
     show_ref_allele
     use_transcript_ref
@@ -769,7 +770,7 @@ sub pick_worst_VariationFeatureOverlapAllele {
     push @vfoa_info, $info;
   }
   if(scalar @vfoa_info) {
-    my @order = @{$self->{pick_order}};
+    my @order = reverse @{$self->{pick_order}};
     my $picked;

     # go through each category in order
@@ -793,17 +794,12 @@ sub pick_worst_VariationFeatureOverlapAllele {
       # now add to @tmp those vfoas that have the same value of $cat as $picked
       push @tmp, shift @vfoa_info while @vfoa_info && $vfoa_info[0]->{$cat} eq $picked->{$cat};

-      # if there was only one, return
-      return $picked->{vfoa} if scalar @tmp == 1;
-
-      # otherwise shrink the array to just those that had the lowest
-      # this gives fewer to sort on the next round
-      @vfoa_info = @tmp;
-
+      # if there are sufficient transcripts, return
+      return map {$_->{vfoa}} @tmp if scalar @tmp == $self->pick_count;
     }

-    # probably shouldn't get here, but if we do, return the first
-    return $vfoa_info[0]->{vfoa};
+    # return the first pick_count transcripts
+    return map {$_->{vfoa}} @vfoa_info[0..($self->pick_count-1)];
   }

   return undef;
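
For comparison, here is a minimal standalone sketch of the general idea behind the change: rank candidates by an ordered list of categories, then return the first N. It is not the code in this PR and not part of VEP. `pick_top`, the record layout, and the sample scores are all hypothetical, and it assumes a lower score is better, as in the sort above. It sorts once with a multi-key comparator rather than re-sorting per category, and caps N at the number of candidates so the slice cannot run past the end of the array.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of VEP: return the top $count records,
# ranked by each category in @$order (most important first, lower is better).
sub pick_top {
  my ($records, $order, $count) = @_;

  # Compare on each category in turn; the first non-zero comparison decides.
  my @sorted = sort {
    my $cmp = 0;
    for my $cat (@$order) {
      $cmp = $a->{$cat} <=> $b->{$cat};
      last if $cmp;
    }
    $cmp;
  } @$records;

  # Cap at the number of records so the slice never yields undef entries.
  my $n = $count < @sorted ? $count : scalar @sorted;
  return @sorted[0 .. $n - 1];
}

# Made-up scores for illustration only.
my @records = (
  { id => 'T1', canonical => 1, tsl => 2 },
  { id => 'T2', canonical => 0, tsl => 3 },
  { id => 'T3', canonical => 0, tsl => 1 },
);

my @top = pick_top(\@records, [qw(canonical tsl)], 2);
print join(',', map { $_->{id} } @top), "\n";   # prints "T3,T2"
```

With a count of 1 this reduces to a single best pick, which matches the `pick_count => 1` default added in Config.pm.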