Gecco convert #506
Changes from 15 commits
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -451,21 +451,30 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation | |||||||||||||
|
|
||||||||||||||
| [deepBGC](https://github.com/Merck/deepbgc) detects BGCs in bacterial and fungal genomes using deep learning. DeepBGC employs a Bidirectional Long Short-Term Memory Recurrent Neural Network and a word2vec-like vector embedding of Pfam protein domains. Product class and activity of detected BGCs are predicted using a Random Forest classifier. | ||||||||||||||
|
|
||||||||||||||
| #### GECCO | ||||||||||||||
| #### GECCO & GECCO CONVERT | ||||||||||||||
|
||||||||||||||
|
|
||||||||||||||
| <details markdown="1"> | ||||||||||||||
| <summary>Output files</summary> | ||||||||||||||
|
|
||||||||||||||
| - `gecco/` | ||||||||||||||
| - **GECCO** | ||||||||||||||
| - `*.genes.tsv`: TSV file containing detected/predicted genes with BGC probability scores | ||||||||||||||
| - `*.features.tsv`: TSV file containing identified domains | ||||||||||||||
| - `*.clusters.tsv`: TSV file containing coordinates of predicted clusters and BGC types | ||||||||||||||
| - `*_cluster_*.gbk`: GenBank file (if clusters were found) containing sequence with annotations; one file per GECCO hit | ||||||||||||||
|
|
||||||||||||||
| </details> | ||||||||||||||
| - **GECCO CONVERT** | ||||||||||||||
jfy133 marked this conversation as resolved.
|
||||||||||||||
| - `*.gff`: GFF3 converted cluster tables containing the position and metadata for all the predicted clusters | ||||||||||||||
| - `*.region*.gbk`: Converted and aliased GenBank files so that they can be loaded by BiG-SLiCE | ||||||||||||||
| - `*.faa`: Amino-acid FASTA converted GenBank files of all the proteins in a cluster | ||||||||||||||
| - `*.fna`: Nucleotide sequence FASTA converted GenBank files from the cluster | ||||||||||||||
| **ONLY IF --run_gecco_convert** | ||||||||||||||
|
||||||||||||||
| **ONLY IF --run_gecco_convert** | |
| - **GECCO CONVERT** | |
| - `*.gff`: GFF3 converted cluster tables containing the position and metadata for all the predicted clusters (only if `--bgc_gecco_runconvert`) | |
| - `*.region*.gbk`: Converted and aliased GenBank files so that they can be loaded by BiG-SLiCE (only if `--bgc_gecco_runconvert`) | |
| - `*.faa`: Amino-acid FASTA converted GenBank files of all the proteins in a cluster (only if `--bgc_gecco_runconvert`) | |
| - `*.fna`: Nucleotide sequence FASTA converted GenBank files from the cluster (only if `--bgc_gecco_runconvert`) |
For the change of parameter name, see comment below.
| [GECCO CONVERT](https://gecco.embl.de) is an option in GECCO that converts its output into formats such as GFF3, GenBank, or FASTA for further analysis. | |
| The additional GFF3, GenBank, and FASTA files produced with `--bgc_gecco_runconvert` can be useful for further analysis of the BGC hits. |
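
To check which of the output files listed above a run actually produced, something like the following sketch may help. It only uses the glob patterns from the documentation change; the function name, the dictionary keys, and the assumption that everything sits somewhere under a single `gecco/` directory are illustrative and not part of the pipeline.

```python
from pathlib import Path

# Glob patterns copied from the output descriptions in this PR.
# The dictionary keys are hypothetical labels chosen for this sketch.
GECCO_PATTERNS = {
    "genes": "*.genes.tsv",
    "features": "*.features.tsv",
    "clusters": "*.clusters.tsv",
    "cluster_genbank": "*_cluster_*.gbk",
    # The four patterns below are only expected with --bgc_gecco_runconvert.
    "convert_gff": "*.gff",
    "convert_bigslice_genbank": "*.region*.gbk",
    "convert_protein_fasta": "*.faa",
    "convert_nucleotide_fasta": "*.fna",
}


def collect_gecco_outputs(gecco_dir):
    """Return a mapping of label -> sorted file names found under gecco_dir."""
    root = Path(gecco_dir)
    return {
        label: sorted(path.name for path in root.rglob(pattern))
        for label, pattern in GECCO_PATTERNS.items()
    }
```

An empty list for the `convert_*` keys would simply suggest that the conversion step was not enabled for that run.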