Passing large numbers of files to createsetdb #5

Open
SDmetagenomics opened this issue Oct 30, 2024 · 0 comments

I would like to run spacedust on a plasmid database. The database consists of ~60k individual files, each representing a separate plasmid "genome". However, when I run the following command:

$ spacedust createsetdb /individual_faa/*.faa SpacedustDB tmp --threads 18

bash: /shared/software/bin/spacedust: Argument list too long

I receive a bash error that the argument list is too long. I have tried a number of workarounds, such as passing an environment variable that contains all the file names, but to no avail.
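For context, a minimal sketch of where this limit comes from. The error is raised when a new process is started: the system caps the combined size of the arguments and the environment handed to it, which is also why moving the file names into an environment variable does not help. The glob expansion itself is not the problem, since shell builtins do not start a new process. The temporary directory and file names below are placeholders for illustration, not the real plasmid data.

```shell
# Placeholder directory standing in for /individual_faa (assumption for this demo)
demo_dir=$(mktemp -d)
for i in 1 2 3; do
    printf '>protein_%s\nMKV\n' "$i" > "$demo_dir/plasmid_$i.faa"
done

# Cap, in bytes, on the combined size of arguments and environment for a new process
getconf ARG_MAX

# "set --" is a shell builtin, so the glob is expanded without starting a process
# and the cap above does not apply to it
set -- "$demo_dir"/*.faa
n_files=$#
echo "matched $n_files files"
```

With the real ~60k files, the same expansion handed to an external program such as spacedust is what exceeds the cap.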

It would be useful if, instead of a file glob (*), spacedust createsetdb could take a single input file listing the paths to each of the .faa files needed for database creation. Alternatively, I could create databases in batches and combine them, but I am not sure whether that is supported. Finally, if you have any other suggestions I would be forever grateful.
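As an illustration of the two inputs described above, here is a minimal sketch of producing a path list and fixed-size batches without ever building a long argument list. It only covers generating the inputs: whether createsetdb can read such a list, or whether per-batch databases can be combined, is the open question of this issue and is not assumed here. The temporary directory, file names, and batch size are placeholders.

```shell
# Placeholder directory standing in for /individual_faa (assumption for this demo)
demo_dir=$(mktemp -d)
mkdir "$demo_dir/faa"
for i in 1 2 3 4 5; do
    printf '>protein_%s\nMKV\n' "$i" > "$demo_dir/faa/plasmid_$i.faa"
done

# find reads the directory itself, so no glob is expanded onto a command line
find "$demo_dir/faa" -type f -name '*.faa' | sort > "$demo_dir/faa_paths.txt"

# Optionally split the list into batches of 2 paths each (batch_aa, batch_ab, ...)
split -l 2 "$demo_dir/faa_paths.txt" "$demo_dir/batch_"

# Count the listed paths and the batch files that were written
n_paths=$(( $(wc -l < "$demo_dir/faa_paths.txt") ))
n_batches=$(( $(find "$demo_dir" -maxdepth 1 -type f -name 'batch_*' | wc -l) ))
echo "$n_paths paths in $n_batches batches"
```

For the real data, the batch size would be chosen so that each batch stays well under the limit reported by getconf ARG_MAX.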

In terms of the total number of proteins, these plasmid "genomes" should be quite similar to the 9000 genomes you ran in the spacedust paper, since plasmids are much smaller. So I think it should be computationally manageable; the only trouble is getting all the files in :-)

My Environment

  • Linux
  • Using the statically compiled spacedust executable for the AVX2 instruction set