no taxid_trees file #20
Comments
I'll look into it.
Thanks for sending the link to the taxid_trees files. It's still not clear which of these should be used, though. The value in place in the supplied
You'll need to redo the GA_pre_scan step. That phase pre-fetches taxa to build a curated database.
Let me know if this works.
OK, thanks. I tried this again and unfortunately got the same result. Here's the console output, minus the initialization checks:
Thanks.
Your issue is something else: in that URL, there's choco_h3_family/genus/order. Those are the same Chocophlan DB, just clustered by taxa level. There are workarounds. You could put a gene database inside /temp/smallMetaproOutput/GA_pre_scan/final_results, bwa-index it, and move on. Or you could make sure the config points to one of the clusters.
Thanks for your message, and sorry for the delay in circling back to this. I made sure the config file points to path/to/outputFile/choco_h3_family for
I ran into the same issue: the taxid_trees files are missing. An additional note: the pipeline that downloads the databases does not download the kaiju database. Thank you in advance.
The server was down due to a power outage on April 14th. It's back up and running now.
I am writing because the pipeline is stuck in the same place that hughit32 commented on Feb 19. I can confirm that the ga_collect_db.sh script built by the pipeline has /project/j/jparkin/Lab_Databases/family_llbs in the sixth argument position. Any suggestions on how to alter the Config.ini file to help us get past this point in the pipeline?
Edit: upon running again, I see this in my output: "source_taxa_db no inner section found. using default /project/j/jparkin/Lab_Databases/family_llbs". How/where do I point to the source_taxa_db inner section?
Edit2: Once again, as soon as I give up and post on GitHub, I figure something out. For those interested, I edited Config.ini to give source_taxa_db a path to the Choco family group folder in my libraries. It got me past the roadblock, but I'm not sure whether all I get is family classifications from here on out. The Chocophlan database is broken into three folders. Do we need to run separately to get class and genus?
Thanks,
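For anyone else who sees the "no inner section found" message, a quick check of the config before launching a run may save some time. This is only a minimal sketch and not part of MetaPro. It assumes that source_taxa_db sits under a [Databases] section of Config.ini (the section name is an assumption; adjust it to wherever your config keeps its database paths), and check_source_taxa_db is a hypothetical helper name.

```python
import configparser
from pathlib import Path


def check_source_taxa_db(config_path, section="Databases", key="source_taxa_db"):
    """Return (value, problem). problem is None when the entry looks usable.

    The section and key names are assumptions taken from this thread,
    not from MetaPro's documentation.
    """
    # interpolation=None so that a '%' in a path is read literally
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        return None, f"could not read {config_path}"
    if not parser.has_option(section, key):
        # This is the situation in which the pipeline reports using its default path
        return None, f"[{section}] has no '{key}' entry"
    value = parser.get(section, key).strip()
    if not Path(value).is_dir():
        return value, f"'{value}' is not an existing directory"
    return value, None
```

Calling check_source_taxa_db("Config.ini") returns the configured path together with either None or a short description of what looks wrong, so a missing key or a mistyped folder shows up before the GA_pre_scan step runs.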
Can you please paste your config?
Here is the relevant portion of the config.ini: [Databases]
The pipeline is now stuck at the Diamond step. Diamond jobs get submitted, but the pipeline keeps getting killed for some reason before they finish. I suspect it is a memory issue, or my HPC job scheduler getting overwhelmed with 40 Diamond jobs, but I really don't know yet. Thought I would throw it out there in case you had any ideas.
Diamond is notoriously slow. MetaPro does its best to push through as much as a cluster node will allow it, but it's all at the mercy of the specs of your compute environment, and your data.
Yes, I throttled the number of Diamond jobs submitted back to 5 at a time, and the pipeline continued without issues. There is probably room to push it a little higher. Thanks,
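The comment above doesn't say how the throttling was done, so the following is only an illustration of the general idea of capping concurrent jobs, not MetaPro's own mechanism or setting. It is a minimal sketch: run_throttled is a hypothetical helper, and each command is a placeholder argv list rather than a real Diamond invocation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_throttled(commands, max_parallel=5):
    """Run each command (an argv list) with at most max_parallel running at once.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(argv):
        # check=False: report a failing job through its exit code instead of raising
        return subprocess.run(argv, check=False).returncode

    # The pool size is the throttle: no more than max_parallel jobs run concurrently
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

With 40 queued commands and max_parallel=5, a new job starts only when one of the five running jobs finishes, which keeps peak memory use closer to five jobs' worth than forty.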
There's supposed to be a memory analyzer, but it's not perfect (it measures memory usage in discrete timeslices, to make sure your system doesn't OOM). Will revisit when there's time.
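To make "discrete timeslices" concrete, here is a rough sketch of polling available memory at fixed intervals. It is not MetaPro's analyzer. It assumes a Linux host where /proc/meminfo exists, and all function names are hypothetical.

```python
import time


def parse_mem_available_kb(meminfo_text):
    """Extract the MemAvailable value (in kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def read_proc_meminfo():
    """Read the kernel's memory summary (Linux only)."""
    with open("/proc/meminfo") as handle:
        return handle.read()


def sample_available_memory(samples, interval_s, read_text=read_proc_meminfo):
    """Take `samples` readings of available memory, `interval_s` seconds apart."""
    readings = []
    for i in range(samples):
        readings.append(parse_mem_available_kb(read_text()))
        if i < samples - 1:
            time.sleep(interval_s)  # anything that happens between samples is not seen
    return readings
```

The sleep between readings is where this kind of monitor falls short: a job that allocates a lot of memory between two samples can still trigger an OOM kill before the next reading is taken.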
MetaPro is missing the taxid_trees file, or whatever file is supposed to be found in that folder. The lib_downloader.py script seems to have completed successfully, but no file or folder with that name was created. Can you tell me where this file can be found?
Thank you!