End to end workflow stops after PhaGCN #49

bernt-matthias · 2024-12-19T22:02:54Z

I have a run that exits with exit code 0 (indicating success):

PhaBOX2 is running with: 1 threads!
Running program: PhaMer (virus identification)
[1/7] filtering the length of contigs...
[2/7] calling genes with prodigal...
[3/7] running all-against-all alignment...
[4/7] converting sequences to sentences for language model...
[5/7] Predicting the viruses...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:01<00:00, 11.22it/s]
[6/7] summarizing the results...
[7/7] writing the results...
Run time: 258.42 seconds

PhaMer finished! please check the results in output/final_prediction
Running program: PhaGCN (taxonomy classification)
[1/8] reusing existing filtered contigs...
PhaGCN finished! please check the results in output/final_prediction/phagcn_prediction.tsv

I guess it's because output/filtered_contigs.fa is empty and one of the exit() calls here or here is called.

I'm wondering whether the workflow should continue, i.e. whether the exits should be return statements, or whether there should be an exit(1) instead. I ran the PhaTYP step separately and it gave me some results, so continuing the workflow might be of interest.

There seem to be more exit() calls in the code that might be better as exit(1).
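To make the difference concrete, here is a minimal sketch (not the PhaBOX2 code). It assumes a step that finds nothing to process and compares the status codes of a bare exit and exit(1), then shows a runner whose steps return instead of exiting. The names `exit_code`, `run_steps`, `empty_step` and `later_step` are made up for illustration, and `sys.exit` stands in for the built-in `exit()`.

```python
import subprocess
import sys

# Hypothetical snippets standing in for a step that has nothing to process.
SILENT_EXIT = "import sys; print('no contigs left'); sys.exit()"
FAILING_EXIT = "import sys; print('no contigs left'); sys.exit(1)"


def exit_code(snippet):
    """Run a Python snippet in a child process and return its exit status."""
    result = subprocess.run([sys.executable, "-c", snippet], capture_output=True)
    return result.returncode


def run_steps(steps):
    """Run every step in order and record whether each one produced output.

    Steps return a bool instead of calling exit(), so the runner keeps control
    and later steps still run after an empty one.
    """
    return [(step.__name__, step()) for step in steps]


def empty_step():
    return False  # nothing to do, but the process is not terminated


def later_step():
    return True


if __name__ == "__main__":
    print(exit_code(SILENT_EXIT))   # a bare exit reports success to the caller
    print(exit_code(FAILING_EXIT))  # exit(1) reports failure to the caller
    print(run_steps([empty_step, later_step]))
```

With a bare exit the calling shell or workflow manager cannot tell an early stop from a normal finish. A non-zero status or a returned flag makes the early stop visible.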


KennthShang · 2024-12-20T03:37:21Z

In the "end-to-end" design, if PhaMer judges all the sequences to be non-viral, then in principle they should not be passed to any of the other tools. This is because the downstream methods have no "negative control" and may give unreliable predictions. It's like an ML/DL model that cannot handle an out-of-distribution input but still assigns it an in-distribution label.
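As a toy illustration of that last point (not the model used by any PhaBOX2 tool): a classifier with a softmax output and no "none of the above" class always returns one of its known labels, whatever the input. The labels and logits below are made up.

```python
import math

# Hypothetical label set; note that there is no "not a virus" class.
LABELS = ["family_A", "family_B", "family_C"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the highest-scoring label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


if __name__ == "__main__":
    # Nearly flat scores, as an unfamiliar input might produce:
    # the model still has to name one of its known labels.
    print(predict([0.1, 0.3, 0.2]))
```

Without a rejection class or a confidence threshold, nothing in the output distinguishes a non-viral input from a viral one.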

However, if all the sequences have been confirmed as viruses experimentally, or identified as viruses by other methods, I suppose users should skip the PhaMer step.

Alternatively, I could provide a version that lets users choose whether to run PhaMer. Looking forward to your advice.

Best,
Jiayu

bernt-matthias · 2024-12-20T07:31:07Z

Thanks for the explanation. I guess my main problem was that, for a new user, it's hard to see from the log output that there is a problem.

Maybe just add a log message that tells the user that PhaMer detected no viruses.

An option to skip PhaMer could also be a good idea.
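Roughly what I have in mind, as a sketch only: it is not based on the actual PhaBOX2 code. The flag name `--skip-identification`, the helpers `has_sequences` and `plan`, and the message wording are all made up for illustration. Only the path output/filtered_contigs.fa comes from my run.

```python
import argparse
import logging
import os

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
log = logging.getLogger("pipeline")


def has_sequences(fasta_path):
    """Return True if the file exists and contains at least one FASTA header."""
    if not os.path.isfile(fasta_path):
        return False
    with open(fasta_path) as handle:
        return any(line.startswith(">") for line in handle)


def build_parser():
    parser = argparse.ArgumentParser(description="illustrative pipeline wrapper")
    # Hypothetical flag name, not the tool's real command-line interface.
    parser.add_argument("--skip-identification", action="store_true",
                        help="treat all input contigs as viral")
    parser.add_argument("--filtered", default="output/filtered_contigs.fa",
                        help="contigs kept after virus identification")
    return parser


def plan(args):
    """Decide whether downstream steps should run, and log the reason."""
    if args.skip_identification:
        log.info("virus identification skipped; all contigs passed downstream")
        return True
    if not has_sequences(args.filtered):
        log.warning("no viral contigs in %s; downstream steps skipped", args.filtered)
        return False
    return True


if __name__ == "__main__":
    plan(build_parser().parse_args())
```

The important part is the warning: an empty result is then visible in the log instead of looking like a normal finish.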

KennthShang · 2024-12-26T08:38:55Z

Thanks for the suggestions. I added a new log message that reports when no viruses are detected.

I also added a new option, --skip, so users can decide whether they would like to skip PhaMer.

Wish you a nice holiday.

Best,
Jiayu
