bulkfile-scripts

Scripts useful for working with bulk FlyBase data locally.

Scripts

FASTA

extract_seq_from_fasta.pl - Extract longest, unique, and specific IDs from the FlyBase FASTA files.

flybase_id_to_fasta.py - Extract FASTA sequences for lists of FlyBase IDs from gzipped FlyBase FASTA files.

python3 flybase_id_to_fasta.py --fasta /path/to/flybase.fasta.gz cluster1_ids.txt cluster2_ids.txt

One FASTA file is written per input ID file, in the same directory and with the same name as that ID file, but with the extension .fasta.
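The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the code in flybase_id_to_fasta.py: the function names are hypothetical, and it assumes that each ID file holds one FlyBase ID per line, that the ID is the first whitespace-delimited token of each FASTA header, and that cluster1_ids.txt maps to cluster1_ids.fasta.

```python
import gzip
from pathlib import Path


def read_ids(id_file):
    """Read one ID per line, skipping blank lines (assumed file layout)."""
    with open(id_file) as handle:
        return {line.strip() for line in handle if line.strip()}


def record_id(header):
    """Return the first token after '>' (assumed to be the FlyBase ID)."""
    tokens = header[1:].split()
    return tokens[0] if tokens else ""


def extract_records(fasta_handle, wanted_ids):
    """Yield (header, sequence) for each record whose ID is in wanted_ids."""
    header, chunks = None, []
    for line in fasta_handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record.
            if header is not None and record_id(header) in wanted_ids:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    # Emit the final record, which has no following header to close it.
    if header is not None and record_id(header) in wanted_ids:
        yield header, "".join(chunks)


def write_fasta_for_ids(fasta_gz, id_file):
    """Write matching records next to id_file, using a .fasta extension."""
    out_path = Path(id_file).with_suffix(".fasta")
    wanted = read_ids(id_file)
    with gzip.open(fasta_gz, "rt") as fasta, open(out_path, "w") as out:
        for header, sequence in extract_records(fasta, wanted):
            out.write(f"{header}\n{sequence}\n")
    return out_path
```

Calling write_fasta_for_ids once per ID file mirrors the command above, which takes one FASTA file and any number of ID lists. The gzipped FASTA is streamed line by line, so it never has to be decompressed to disk.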

GFF

problem_case_filter.pl - Remove complicated biological corner cases from GFF files that can cause problems for some analysis tools.

Assembly

dmel_r5_to_r6_converter.pl - Convert D. melanogaster coordinates from genome assembly release 5 to release 6.

Symbols

symbol_to_id_lookup.py - Convert symbols (current or old) to their current FlyBase IDs. The script currently handles only Dmel genes and transcripts, but could easily be modified to handle other species or data types.
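The idea behind a symbol lookup can be sketched as below. This is an illustration only, not the logic of symbol_to_id_lookup.py: the row layout (current ID, current symbol, pipe-separated old symbols), the function names, and the IDs and symbols in the usage lines are all placeholders.

```python
def build_symbol_index(rows):
    """Map every current symbol and old symbol to the set of IDs that use it.

    rows: iterable of (current_id, current_symbol, synonyms), where synonyms
    is a pipe-separated string of old symbols (assumed layout).
    """
    index = {}
    for current_id, current_symbol, synonyms in rows:
        index.setdefault(current_symbol, set()).add(current_id)
        for synonym in filter(None, synonyms.split("|")):
            index.setdefault(synonym, set()).add(current_id)
    return index


def lookup(symbols, index):
    """Return {symbol: sorted list of matching IDs}; empty list if unknown."""
    return {symbol: sorted(index.get(symbol, ())) for symbol in symbols}


# Placeholder rows, not real FlyBase records.
example_rows = [
    ("FBgn0000001", "geneA", "oldA|A1"),
    ("FBgn0000002", "geneB", "oldA"),
]
example_index = build_symbol_index(example_rows)
example_hits = lookup(["geneA", "oldA", "missing"], example_index)
```

The index stores a set of IDs per symbol because an old symbol may have been attached to more than one record over time, so a lookup can legitimately return several candidates.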

IDs

fbgn_updater.py - Update FBgn IDs to their current FlyBase IDs.
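As a rough sketch of what updating IDs involves, the snippet below maps retired (secondary) FBgn IDs to current ones. It is not taken from fbgn_updater.py: the input layout (a current ID paired with a list of its secondary IDs), the function names, and the IDs in the usage lines are placeholders.

```python
def build_id_map(records):
    """Build {any_known_id: current_id} from (current_id, secondary_ids) pairs."""
    id_map = {}
    for current_id, secondary_ids in records:
        # A current ID always maps to itself, even if it was seen earlier
        # as a secondary ID of another record.
        id_map[current_id] = current_id
        for old_id in secondary_ids:
            # Do not overwrite an existing entry for this old ID.
            id_map.setdefault(old_id, current_id)
    return id_map


def update_ids(ids, id_map):
    """Return (input_id, current_id or None) for each input ID, in order."""
    return [(fbgn, id_map.get(fbgn)) for fbgn in ids]


# Placeholder records, not real FlyBase data.
example_map = build_id_map([
    ("FBgn0000010", ["FBgn0000003", "FBgn0000004"]),
    ("FBgn0000020", []),
])
example_updates = update_ids(["FBgn0000003", "FBgn0000020", "FBgn9999999"], example_map)
```

Returning None for unknown IDs keeps the output aligned with the input list, so IDs that could not be updated are easy to spot.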

GraphQL

constructs_by_gene.py - Simple example script for querying the FlyBase GraphQL API to retrieve construct information for one or more genes.
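For readers new to GraphQL, the general shape of such a request can be sketched as below. Everything specific here is an assumption: the endpoint URL is a placeholder, and the query text, field names, and variable name are hypothetical and do not describe the actual FlyBase schema. See constructs_by_gene.py for the real query.

```python
import json
import urllib.request

# Placeholder endpoint (assumption): replace with the real GraphQL API URL.
GRAPHQL_URL = "https://example.org/graphql"

# Hypothetical query (assumption): field and argument names are illustrative.
QUERY = """
query ConstructsByGene($ids: [String!]!) {
  genes(ids: $ids) {
    id
    constructs { id symbol }
  }
}
"""


def build_request(gene_ids, url=GRAPHQL_URL):
    """Build (but do not send) a GraphQL POST request for the given gene IDs."""
    payload = json.dumps({"query": QUERY, "variables": {"ids": list(gene_ids)}})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the gene IDs as a GraphQL variable, rather than formatting them into the query string, lets one query text serve one or many genes. With a real endpoint, the request would be sent with urllib.request.urlopen and the JSON response decoded with json.load.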

Changelog

v0.1.0 - 08/13/2020

Added GraphQL example
