Merged
30 commits
fa9db01
Merge pull request #497 from nf-core/dev
jfy133 Oct 4, 2025
5a2afa6
Integrating gecco_convert into funcscan
SkyLexS Nov 2, 2025
0012b8a
Rearranging the code
SkyLexS Nov 2, 2025
518a7a1
fixing
SkyLexS Nov 2, 2025
f636670
fixing typos
SkyLexS Nov 2, 2025
0fd918f
fixing some error
SkyLexS Nov 3, 2025
a1bcca7
polishing the integration
SkyLexS Nov 4, 2025
8e4f163
fixing warnings
SkyLexS Nov 4, 2025
f5ff043
Merge branch 'nf-core:master' into gecco_convert_dis
SkyLexS Nov 11, 2025
686b2ff
Update for the schema and the function calling
SkyLexS Nov 11, 2025
ed55447
updated output.md
SkyLexS Nov 17, 2025
242592d
added tests
SkyLexS Nov 18, 2025
a5d0119
fixed allOf section for gecco convert
SkyLexS Nov 19, 2025
ecfdfd3
redid the output to include gecco convert
SkyLexS Nov 23, 2025
81b1160
updated changelog.md and readme.md
SkyLexS Nov 25, 2025
43a2964
implemented the changes
SkyLexS Nov 25, 2025
2e62888
removing the unwanted code in the modules config
SkyLexS Nov 25, 2025
c9244b7
fixing lint
SkyLexS Nov 25, 2025
000326b
removed unwanted comment
SkyLexS Dec 1, 2025
d6a9dcb
Implemented schema modification
SkyLexS Dec 4, 2025
8104d87
Linting
SkyLexS Dec 4, 2025
ecd72c7
updated test configs
SkyLexS Dec 8, 2025
fb9a4ff
Linting
SkyLexS Dec 8, 2025
df1e3ac
Update docs/output.md
jfy133 Dec 17, 2025
a9a561a
Fix convert format options to match gecco itself, add validation chec…
jfy133 Dec 17, 2025
47a9254
Update tests
jfy133 Dec 17, 2025
e789973
Merge branch 'gecco_convert_dis' of github.com:SkyLexS/funcscan into …
jfy133 Dec 17, 2025
1ec8cee
Update BAKTA test with final test profile
jfy133 Dec 17, 2025
a06d281
Merge branch 'dev' into gecco_convert_dis
jfy133 Dec 18, 2025
de6c87a
Update conf/modules.config
jfy133 Dec 18, 2025
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- [#500](https://github.com/nf-core/funcscan/pull/500) Updated pipeline template to nf-core/tools version 3.4.1 (by @jfy133)
- [#508](https://github.com/nf-core/funcscan/pull/508) Added support for antiSMASH's --clusterhmmer, --fullhmmer, and --tigrfam options (❤️ to @yusukepockyby for requesting, @jfy133)
- [#506](https://github.com/nf-core/funcscan/pull/506) Added support for GECCO convert to generate additional files useful for downstream analysis (by @SkyLexS)

### `Fixed`

2 changes: 1 addition & 1 deletion README.md
@@ -92,7 +92,7 @@ nf-core/funcscan was originally written by Jasmin Frangenberg, Anan Ibrahim, Lou

We thank the following people for their extensive assistance in the development of this pipeline:

Adam Talbot, Alexandru Mizeranschi, Hugo Tavares, Júlia Mir Pedrol, Martin Klapper, Mehrdad Jaberi, Robert Syme, Rosa Herbst, Vedanth Ramji, @Microbion.
Adam Talbot, Alexandru Mizeranschi, Hugo Tavares, Júlia Mir Pedrol, Martin Klapper, Mehrdad Jaberi, Robert Syme, Rosa Herbst, Vedanth Ramji, @Microbion, Dediu Octavian-Codrin.

## Contributions and Support

8 changes: 8 additions & 0 deletions conf/modules.config
@@ -532,6 +532,14 @@ process {
].join(' ').trim()
}

withName: GECCO_CONVERT {
publishDir = [
path: { "${params.outdir}/bgc/gecco/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}

withName: HAMRONIZATION_ABRICATE {
publishDir = [
path: { "${params.outdir}/arg/hamronization/abricate" },
6 changes: 5 additions & 1 deletion conf/test_bgc_bakta.config
@@ -23,7 +23,7 @@ params {
config_profile_description = 'Minimal test dataset to check BGC workflow function'

// Input data
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_hits.csv'
bgc_antismash_db = params.pipelines_testdata_base_path + 'funcscan/databases/antismash_trimmed_8_0_1.tar.gz'

annotation_tool = 'bakta'
@@ -33,6 +33,10 @@ params {
run_amp_screening = false
run_bgc_screening = true

bgc_gecco_runconvert = true
bgc_gecco_convertmode = 'gbk'
bgc_gecco_convertformat = 'bigslice'

bgc_run_hmmsearch = true
bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm'
}
6 changes: 5 additions & 1 deletion conf/test_bgc_prokka.config
@@ -23,7 +23,7 @@ params {
config_profile_description = 'Minimal test dataset to check BGC workflow function'

// Input data
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_hits.csv'
bgc_antismash_db = params.pipelines_testdata_base_path + 'funcscan/databases/antismash_trimmed_8_0_1.tar.gz'

annotation_tool = 'prokka'
@@ -32,6 +32,10 @@ params {
run_amp_screening = false
run_bgc_screening = true

bgc_gecco_runconvert = true
bgc_gecco_convertmode = 'gbk'
bgc_gecco_convertformat = 'fna'

bgc_run_hmmsearch = true
bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm'
}
6 changes: 5 additions & 1 deletion conf/test_bgc_pyrodigal.config
@@ -23,7 +23,7 @@ params {
config_profile_description = 'Minimal test dataset to check BGC workflow function'

// Input data
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_reduced.csv'
input = params.pipelines_testdata_base_path + 'funcscan/samplesheet_hits.csv'
bgc_antismash_db = params.pipelines_testdata_base_path + 'funcscan/databases/antismash_trimmed_8_0_1.tar.gz'

annotation_tool = 'pyrodigal'
@@ -32,6 +32,10 @@ params {
run_amp_screening = false
run_bgc_screening = true

bgc_gecco_runconvert = true
bgc_gecco_convertmode = 'clusters'
bgc_gecco_convertformat = 'gff'

bgc_run_hmmsearch = true
bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm'
}
2 changes: 2 additions & 0 deletions conf/test_preannotated_bgc.config
@@ -32,6 +32,8 @@ params {
run_amp_screening = false
run_bgc_screening = true

bgc_gecco_runconvert = true

bgc_run_hmmsearch = true
bgc_hmmsearch_models = 'https://raw.githubusercontent.com/antismash/antismash/fd61de057e082fbf071732ac64b8b2e8883de32f/antismash/detection/hmm_detection/data/ToyB.hmm'
}
10 changes: 8 additions & 2 deletions docs/output.md
@@ -457,15 +457,21 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation
<summary>Output files</summary>

- `gecco/`
- **GECCO**
- `*.genes.tsv`: TSV file containing detected/predicted genes with BGC probability scores
- `*.features.tsv`: TSV file containing identified domains
- `*.clusters.tsv`: TSV file containing coordinates of predicted clusters and BGC types
- `*_cluster_*.gbk`: GenBank file (if clusters were found) containing sequence with annotations; one file per GECCO hit

</details>
- `*.gff`: Cluster tables converted to GFF3, containing the positions and metadata of all predicted clusters (only if `--bgc_gecco_runconvert --bgc_gecco_convertmode clusters --bgc_gecco_convertformat gff`)
- `*.region*.gbk`: GenBank files converted and aliased so that they can be loaded by BiG-SLiCE (only if `--bgc_gecco_runconvert --bgc_gecco_convertmode gbk --bgc_gecco_convertformat bigslice`)
- `*.faa`: GenBank files converted to amino-acid FASTA, containing all the proteins in a cluster (only if `--bgc_gecco_runconvert --bgc_gecco_convertmode gbk --bgc_gecco_convertformat faa`)
- `*.fna`: GenBank files converted to nucleotide FASTA, containing the sequence of the cluster (only if `--bgc_gecco_runconvert --bgc_gecco_convertmode gbk --bgc_gecco_convertformat fna`)
</details>

[GECCO](https://gecco.embl.de) is a fast and scalable method for identifying putative novel Biosynthetic Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields (CRFs).

The additional GFF3, GenBank, or FASTA files produced with `--bgc_gecco_runconvert` can be useful for further downstream analysis of the BGC hits.
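
As a minimal sketch of how the conversion might be switched on outside the test profiles, the three parameters introduced in this PR could be set in a custom Nextflow config. The file name `gecco_convert.config` is hypothetical and only for illustration; the parameter names and the `gbk`/`faa` pairing are taken from the test configs and output descriptions in this diff.

```groovy
// gecco_convert.config (hypothetical file name, not part of the PR)
params {
    // GECCO convert only makes sense when BGC screening is enabled
    run_bgc_screening       = true

    // Turn on the optional GECCO convert step
    bgc_gecco_runconvert    = true

    // Convert the per-cluster GenBank files to amino-acid FASTA (*.faa);
    // the other pairs listed in the output docs are
    // 'gbk' + 'bigslice', 'gbk' + 'fna', and 'clusters' + 'gff'
    bgc_gecco_convertmode   = 'gbk'
    bgc_gecco_convertformat = 'faa'
}
```

Such a file would typically be passed to a run with Nextflow's `-c` option. According to the `GECCO_CONVERT` entry in `conf/modules.config`, the converted files are published under `<outdir>/bgc/gecco/<sample id>/`.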

### Summary tools

[AMPcombi](#ampcombi), [hAMRonization](#hamronization), [comBGC](#combgc), [MultiQC](#multiqc), [pipeline information](#pipeline-information), [argNorm](#argnorm).
7 changes: 7 additions & 0 deletions modules/nf-core/gecco/convert/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

56 changes: 56 additions & 0 deletions modules/nf-core/gecco/convert/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

118 changes: 118 additions & 0 deletions modules/nf-core/gecco/convert/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
