--fit-model error #321

Closed
luciagrami opened this issue Aug 16, 2024 · 3 comments · May be fixed by #322
Labels
bug Something isn't working

Comments

@luciagrami

Hello,
I am using poppunk 2.7.0 with pp-sketchlib v2.1.4.

I am having issues when running fit model:

poppunk --fit-model dbscan --ref-db TBdb --output TBdb_hdbscan

Output:
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v2.1.4
sketchlib: /atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/pp_sketchlib.cpython-311-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 35 threads
Mode: Fitting dbscan model to reference database

Fitting HDBSCAN model using a CPU
Fitting HDBSCAN model using a CPU
Fitting HDBSCAN model using a CPU
Fitting HDBSCAN model using a CPU
Assigning distances with DBSCAN model
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1026/1026 [00:26<00:00, 38.10it/s]
Fit summary:
Number of clusters 21
Number of datapoints 100000
Number of assignments 81096

Scaled component means
[0.65383488 0.22816747]
[0.38883853 0.14202529]
[0.09366661 0.00975697]
[2.56513596e-01 4.57124497e-06]
[0.22594947 0. ]
[4.91949245e-02 4.69420002e-06]
[7.73994699e-02 2.12680433e-07]
[1.55685291e-01 3.80115409e-07]
[8.13707039e-02 9.56804570e-07]
[0.08370336 0. ]
[8.88103917e-02 5.63386891e-07]
[0.0864341 0. ]
[0.12166263 0. ]
[1.23963781e-01 1.00648538e-06]
[0.09071088 0. ]
[0.09283514 0. ]
[1.10260807e-01 9.44051010e-07]
[1.17498189e-01 5.60512490e-08]
[0.0949481 0. ]
[9.98577848e-02 8.35967001e-07]
[1.04917660e-01 6.65403661e-07]

Network summary:
Components 1547
Density 0.0163
Transitivity 0.6403
Mean betweenness 0.3620
Weighted-mean betweenness 0.1349
Score 0.6299
Score (w/ betweenness) 0.4019
Score (w/ weighted-betweenness) 0.5449
Traceback (most recent call last):
  File "/atlas/apps/miniconda3/envs/pp_env/bin/poppunk", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/PopPUNK/main.py", line 668, in main
    isolateClustering = {fit_type: printClusters(genomeNetwork,
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/PopPUNK/network.py", line 1520, in printClusters
    unword = next(unword_generator)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/PopPUNK/unwords.py", line 31, in gen_unword
    word += "".join(syllable()[0]())
                    ^^^^^^^^^^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/PopPUNK/unwords.py", line 20, in <lambda>
    cv = lambda: consonant() + vowel()
                 ^^^^^^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/site-packages/PopPUNK/unwords.py", line 19, in <lambda>
    consonant = lambda: random.sample(consonants, 1)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/atlas/apps/miniconda3/envs/pp_env/lib/python3.11/random.py", line 439, in sample
    raise TypeError("Population must be a sequence. "
TypeError: Population must be a sequence. For dicts or sets, use sorted(d).

The error occurs when trying different models. Any suggestions? Thanks!

johnlees added a commit that referenced this issue Aug 19, 2024
@johnlees
Member

Hopefully a fix for this in #322, which should appear in v2.7.1.

I'm not sure why that has suddenly started happening, or why it doesn't happen in the tests. That also makes it hard for me to verify that this fixes the issue!

Can you change the following line in PopPUNK/unwords.py:
https://github.com/bacpop/PopPUNK/pull/322/files

To do this you'll need to clone the repository, and run poppunk with python poppunk-runner.py instead of just poppunk.

@johnlees
Member

i.e. on line 16 add sorted() around the right hand side
consonants = sorted(set(string.ascii_lowercase) - set(vowels) - set(trouble))
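
For anyone who wants to check the change outside PopPUNK, here is a minimal standalone sketch of the failure and the fix. It relies on a documented Python change: passing a set to `random.sample()` was deprecated in 3.9 and raises `TypeError` from 3.11, which matches the `python3.11` paths in the traceback above. The `vowels` and `trouble` values and the `sample_one` helper are illustrative assumptions, not copied from `unwords.py`.

```python
import random
import string

# Assumed illustrative values; the real lists live in PopPUNK/unwords.py
vowels = ["a", "e", "i", "o", "u"]
trouble = ["q", "x", "y"]

# Original form: a set, which random.sample() rejects on Python 3.11+
consonant_set = set(string.ascii_lowercase) - set(vowels) - set(trouble)

# Fixed form from the comment above: sorted() turns the set into a list
consonants = sorted(consonant_set)


def sample_one(population):
    """Hypothetical helper: sample one item, or return None on TypeError."""
    try:
        return random.sample(population, 1)
    except TypeError:
        return None


set_result = sample_one(consonant_set)  # None on Python 3.11 and later
list_result = sample_one(consonants)    # a one-element list on any version

print("sampling from the set:", set_result)
print("sampling from the sorted list:", list_result)
```

On Python 3.9 and 3.10 the set version still runs, with a deprecation warning, so a test environment on an older interpreter would not hit the error. That is one possible reason the tests pass, though nothing in this thread confirms it.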

@johnlees
Member

johnlees commented Nov 7, 2024

Fixed in v2.7.1

@johnlees johnlees closed this as completed Nov 7, 2024