issue with parsnp/1.5.4 #91

Open
lanorvege opened this issue Mar 11, 2021 · 3 comments

@lanorvege

Hi, I'm currently trying to use parsnp v1.5.4 on a Slurm cluster (with RAxML v8.2.12, PhiPack/1.0, harvest-tools/1.3, FastTree/2.1.11), and I keep getting an error when running parsnp on my 13 bacterial genomes. Any idea how to fix this?
Here is what I see in my terminal:

srun -c 4 parsnp -d test_parsnp/ -r! -o test_parsnp/output_parsnp -v -c -p 32 -P 128000
|--Parsnp 1.5.4--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest
13:56:05 - INFO -


SETTINGS:
|-refgenome: autopick
|-genomes:
test_parsnp/EERA844_lgtfilt.fasta
test_parsnp/Ec046_lgtfilt.fasta
...12 more file(s)...
test_parsnp/EERA890_lgtfilt.fasta
test_parsnp/Ec125_lgtfilt.fasta
|-aligner: muscle
|-outdir: test_parsnp/output_parsnp
|-OS: Linux
|-threads: 32


13:56:05 - INFO - <>
13:56:05 - INFO - No genbank file provided for reference annotations, skipping..
13:56:05 - DEBUG - Sorting reference replicons
13:56:05 - DEBUG - Writing .ini file
13:56:05 - INFO - Running Parsnp multi-MUM search and libMUSCLE aligner...
13:56:05 - DEBUG - /opt/gensoft/exe/parsnp/1.5.4/bin/parsnp_core test_parsnp/output_parsnp/parsnpAligner.ini
14:03:22 - CRITICAL - The following command failed:
>>$ /opt/gensoft/exe/parsnp/1.5.4/bin/parsnp_core test_parsnp/output_parsnp/parsnpAligner.ini
Please veryify input data and restart Parsnp.
If the problem persists please contact the Parsnp development team.

  STDOUT:
  0

Ec046_lgtfilt.fasta.ref,Len:5089403,GC:50.7909
...
Finished processing input sequences, elapsed time: 3 seconds

             compressed suffix graph construction elapsed time: 0 seconds

             MUM anchor search elapsed time: 10 seconds

             compressed suffix graph construction elapsed time: 0 seconds

...
Finished recursive MUM search, elapsed time: 1 seconds

    Finished filtering spurious matches, elapsed time: 0 seconds

    LCBs created, elapsed time: 0 seconds

  STDERR:

parsnpAligner:: rapid whole genome SNP typing


ParSNP: Preparing to construct global multiple alignment framework

Preparing to verify and process input sequences...
Searching for initial MUM anchors...

    Constructing compressed suffix graph...
    Performing initial search for exact matches in the sequences...

...
Performing recursive MUM search between MUM anchors...
Filtering spurious matches...
Creating and verifying final LCBs...
Writing output files & aligning LCBs...

*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported

*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported

*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported

srun: error: task 0: Exited with exit code 2

@valery-shap

valery-shap commented Jan 13, 2022

Hello, @lanorvege
I have the same issue. Have you solved it?
Valery
Update: the issue was solved by setting -p to a much lower value than the number of CPUs on the node.
The node has 94 CPUs; I set 30 CPUs for parsnp instead of all of them.
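
For anyone scripting this workaround, here is a minimal Python sketch that picks a value for -p no larger than the CPUs Slurm allocated to the task, with an upper cap. The function name `pick_thread_count` and the default cap of 30 are assumptions for illustration; 30 is simply the value that worked in this report, not a documented parsnp limit. `SLURM_CPUS_PER_TASK` is only set when the job requests `--cpus-per-task` (`-c`).

```python
import os


def pick_thread_count(cap=30, env=None, cpu_count=None):
    """Return a thread count to pass to parsnp's -p option.

    Prefers the Slurm allocation (SLURM_CPUS_PER_TASK) when present,
    otherwise falls back to the machine's CPU count, and never exceeds
    `cap`. The cap is a hypothetical safety margin, not a parsnp rule.
    """
    env = os.environ if env is None else env
    allocated = env.get("SLURM_CPUS_PER_TASK")
    if allocated is not None:
        available = int(allocated)
    elif cpu_count is not None:
        available = cpu_count
    else:
        available = os.cpu_count() or 1
    return max(1, min(available, cap))


if __name__ == "__main__":
    # Print the value so a job script can capture it for -p.
    print(pick_thread_count())
```

Under `srun -c 4`, as in the original command, this would return 4 rather than the 32 that was passed to -p there.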

@bkille
Contributor

bkille commented Jan 13, 2022

@valery-shap to clarify, you were observing the

*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported

issue as well, but resolved it by lowering the number of threads used?

@valery-shap

valery-shap commented Jan 13, 2022

No, I don't have this issue anymore; the run finished successfully.
It seems that all of my problems (except one) were caused by this.
The errors I saw:

  1. The reference had 5 contigs: 1 chromosome and 4 plasmids.
    I got:
Traceback (most recent call last):
  File "../bin/parsnp", line 1328, in <module>
    if block_spos < chr_spos:
TypeError: '<' not supported between instances of 'int' and 'list'

I changed the reference to a single contig containing only the chromosome
and got a different error:

mkdir: cannot create directory ‘../blocks/’: File exists
10 seqs, max length 59, avg length 59

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:

and another like this:

mkdir: cannot create directory /blocks/’: File exists
 10 seqs, max length 127, avg  length 127

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 
*** buffer overflow detected ***: /miniconda3/envs/parsnp/bin/bin/parsnp_core terminated

and there were variants with the CLUSTER_6 error too:

mkdir: cannot create directory ‘/blocks/’: File exists
 10 seqs, max length 59, avg  length 59

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

Alignment not completed, cannot save.

*** ERROR ***  TreeFromSeqVect_UPGMA, CLUSTER_6 not supported

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 
*** buffer overflow detected ***: /home/miniconda3/envs/parsnp/bin/bin/parsnp_core terminated

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found: 

I found the files containing K, M, and Y symbols and removed them,
but I still got this error.
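
To find which input files contain such symbols, a small scan along these lines may help. This is a sketch, not part of parsnp: the function name `count_non_acgt` is hypothetical, and treating everything outside A, C, G, T as "non-standard" is an assumption (pass `allowed="ACGTN"` if N should be tolerated).

```python
from collections import Counter
import sys


def count_non_acgt(fasta_lines, allowed="ACGT"):
    """Count characters outside `allowed` for each FASTA record.

    Returns {record_name: {char: count}} and omits records that
    contain only allowed characters. Comparison is case-insensitive.
    """
    allowed_set = set(allowed.upper())
    counts = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            counts[current] = Counter()
        elif current is not None:
            counts[current].update(
                ch for ch in line.upper() if ch not in allowed_set
            )
    return {name: dict(c) for name, c in counts.items() if c}


if __name__ == "__main__":
    # Usage: python scan_fasta.py genomes_dir/*.fasta
    for path in sys.argv[1:]:
        with open(path) as handle:
            for name, chars in count_non_acgt(handle).items():
                print(path, name, chars)
```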

Then I made a test directory with 5 clean genomes and ran parsnp without Slurm on another server, directly in a terminal, with 30 CPUs,
and it worked!
Then I added the genome with the N symbol, and it worked too.
Then I set 70 CPUs (all the CPUs of that server) and got this error:
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
I saw the same error on the Slurm server when I set -p to all the CPUs of the node.

Now I have the final output folder with all the output files, using the changed single-chromosome reference and including all genomes (with K, Y, W symbols). I can see those genomes in the parsnp log file (the Len and GC lines), but I couldn't find them on the tree.
It all looks very strange.
About a year ago I used parsnp on a laptop with a hundred plasmids (only sequences of one specific plasmid), and it worked excellently!
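
For reference, the single-contig reference described above can be produced with a sketch like the following. It assumes the chromosome is the longest record in the multi-FASTA file, which should be checked for each genome; the helper names `read_fasta`, `longest_record`, and `format_fasta` are hypothetical and not part of parsnp.

```python
import sys


def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    records = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            records.append([line[1:], []])
        elif line and records:
            records[-1][1].append(line)
    return [(header, "".join(parts)) for header, parts in records]


def longest_record(lines):
    """Return the (header, sequence) pair with the longest sequence.

    Assumption: the chromosome is longer than any plasmid in the file.
    """
    records = read_fasta(lines)
    if not records:
        raise ValueError("no FASTA records found")
    return max(records, key=lambda rec: len(rec[1]))


def format_fasta(header, seq, width=60):
    """Format one record as FASTA text wrapped at `width` columns."""
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{header}\n{body}\n"


if __name__ == "__main__":
    # Usage: python longest_contig.py ref_multi.fasta > ref.fasta
    with open(sys.argv[1]) as handle:
        header, seq = longest_record(handle)
    sys.stdout.write(format_fasta(header, seq))
```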

At the beginning I also had this issue on the Slurm server, which has 750 GB of RAM:

*** MAX MEMORY 4 MB EXCEEDED***
Memory allocated so far 16004 MB, physical RAM 680 MB
Use -maxmb <n> option to increase limit, where <n> is in MB.

parsnp has no such flag.
The command I used throughout was:
parsnp -r ref.fasta -d genomes_dir -o output_dir -c -x -p <different values>
I also tried setting -P, but in the end it worked without it.
The parsnp version is 1.5.6, installed from conda.

Valery
