Scripts useful for working with bulk FlyBase data locally.
extract_seq_from_fasta.pl - Extract the longest sequences, unique sequences, or sequences for specific IDs from the FlyBase FASTA files.
flybase_id_to_fasta.py - Extract FASTA sequences for lists of FlyBase IDs from gzipped FlyBase FASTA files.
python3 flybase_id_to_fasta.py --fasta /path/to/flybase.fasta.gz cluster1_ids.txt cluster2_ids.txt
Each output FASTA file is written to the same directory as its input ID file, with the same name but with the extension .fasta.
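The kind of ID-based filtering described above can be sketched as follows. This is a minimal illustration, not the code in flybase_id_to_fasta.py: the function names (record_id, read_id_list, extract_records), the demo IDs, and the demo headers are all made up, and it assumes the ID is the first whitespace-delimited token of each FASTA header line.

```python
import gzip
import io


def record_id(header):
    """Return the first token after '>' (assumed here to be the FlyBase ID)."""
    parts = header[1:].split()
    return parts[0] if parts else ""


def read_id_list(text):
    """Parse one ID per line, ignoring blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def extract_records(handle, wanted_ids):
    """Yield (header, sequence) for every record whose ID is in wanted_ids."""
    header, seq_lines = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record.
            if header is not None and record_id(header) in wanted_ids:
                yield header, "".join(seq_lines)
            header, seq_lines = line, []
        elif header is not None:
            seq_lines.append(line)
    # Flush the final record in the file.
    if header is not None and record_id(header) in wanted_ids:
        yield header, "".join(seq_lines)


# Demo with made-up records, gzipped in memory to stand in for a real file.
demo_fasta = (
    ">FBtr0000001 type=mRNA;\n"
    "ATGC\n"
    "GGTA\n"
    ">FBtr0000002 type=mRNA;\n"
    "TTTT\n"
)
compressed = gzip.compress(demo_fasta.encode("ascii"))
wanted = read_id_list("FBtr0000002\n\n")

with gzip.open(io.BytesIO(compressed), "rt") as handle:
    hits = list(extract_records(handle, wanted))

for header, seq in hits:
    print(header)
    print(seq)
```

Streaming the gzipped file line by line keeps memory use low, which matters for the larger FlyBase FASTA files.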
problem_case_filter.pl - Script for removing complicated biological corner cases from GFF files that can sometimes cause issues with various analysis tools.
dmel_r5_to_r6_converter.pl - Convert D. melanogaster coordinates from genome assembly release 5 to release 6.
symbol_to_id_lookup.py - Script for converting symbols (current or old) into their current FlyBase IDs. This script currently only handles Dmel genes and transcripts but could be easily modified to handle other species or data types.
fbgn_updater.py - Script for updating FBgn IDs to their current FlyBase IDs.
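The general idea of mapping old IDs to current ones can be sketched as below. This is not the logic of fbgn_updater.py: the two-column table layout (current ID, then comma-separated former IDs), the function names, and the IDs themselves are hypothetical and chosen only for illustration.

```python
import csv
import io


def build_id_map(tsv_handle):
    """Map every known ID (current or former) to its current ID.

    Assumed layout: column 1 is the current ID, column 2 is a
    comma-separated list of former IDs. Lines starting with '#' are skipped.
    """
    id_map = {}
    for row in csv.reader(tsv_handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        current = row[0].strip()
        former = row[1] if len(row) > 1 else ""
        # A current ID always maps to itself.
        id_map[current] = current
        for old in former.split(","):
            old = old.strip()
            # Keep the first mapping seen; a real tool would need to
            # report IDs that map to more than one current ID.
            if old:
                id_map.setdefault(old, current)
    return id_map


def update_ids(ids, id_map):
    """Return (input_id, current_id_or_None) pairs, preserving input order."""
    return [(i, id_map.get(i)) for i in ids]


# Demo table with made-up IDs.
demo_table = (
    "# current_id\tformer_ids\n"
    "FBgn9990001\tFBgn9990002,FBgn9990003\n"
    "FBgn9990004\t\n"
)
id_map = build_id_map(io.StringIO(demo_table))
updated = update_ids(["FBgn9990003", "FBgn9990004", "FBgn9990099"], id_map)

for old, new in updated:
    print(old, new if new is not None else "not found", sep="\t")
```

Returning None for unknown IDs, rather than dropping them, lets the caller see which inputs could not be updated.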
constructs_by_gene.py - Simple example script for querying the FlyBase GraphQL API to retrieve construct information for one or more genes.