Help with issues running poppunk in HPC environment #193
Comments
This is always a tricky one. Potentially we've got a memory-related bug; these sometimes cause a crash and sometimes don't. First, it would be helpful to get a minimal example.
Depending on those results, I will hopefully be able to rerun with the same settings, and look at a memory debugger. |
Hi John,
Yes, same issue with --create-db running 10 or 2 files.
However, --sketch seemed to work fine with 3000ish files (I stopped it manually as I was limited to 1 thread and it was going to take a while!).
It also worked fine to completion with 10 files, as did --query DB with 10 files. I think it would have been fine with the larger dataset too, but I am currently unable to get more than a 1-thread node due to congestion on the HPC!
So it seems specific to the --create-db command.
Carmen
From: John Lees ***@***.***>
Sent: 10 November 2021 12:09
To: johnlees/PopPUNK ***@***.***>
Cc: Carmen Sheppard ***@***.***>; Author ***@***.***>
Subject: Re: [johnlees/PopPUNK] Help with issues running poppunk in HPC environment (Issue #193)
This is always a tricky one. Potentially we've got a memory-related bug; these sometimes cause a crash and sometimes don't. First, it would be helpful to get a minimal example.
1. Can you try running with two files and then ten files in files.txt and see if you get the same issue?
2. Does running poppunk_sketch to a) create the sketch database and b) create the distance matrix work? Both with the full set, and the smaller set of ten.
Depending on those results, I will hopefully be able to rerun with the same settings, and look at a memory debugger.
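A minimal sketch of how the two-file and ten-file inputs could be prepared, assuming `files.txt` is the plain-text list passed to `--r-files` with one sample per line. The helper names `write_subset` and `create_db_command` and the output name `poppunk_db_2` are hypothetical; the command only reuses the flags quoted in this issue and is built, not executed.

```python
import shlex
from pathlib import Path


def write_subset(src, n, dest=None):
    """Copy the first n non-empty lines of src into a new list file."""
    src = Path(src)
    lines = [ln for ln in src.read_text().splitlines() if ln.strip()]
    # Default name: files.txt -> files_2.txt (naming scheme is an assumption)
    dest = Path(dest) if dest else src.with_name(f"{src.stem}_{n}{src.suffix}")
    dest.write_text("\n".join(lines[:n]) + "\n")
    return dest


def create_db_command(r_files, output, threads=1):
    """Build the --create-db call from this issue as an argument list."""
    return [
        "poppunk", "--create-db",
        "--output", str(output),
        "--r-files", str(r_files),
        "--threads", str(threads),
    ]


def show_command(r_files, output):
    """Return the command as a shell-quoted string for copy and paste."""
    return shlex.join(create_db_command(r_files, output))
```

For example, `show_command(write_subset("files.txt", 2), "poppunk_db_2")` writes `files_2.txt` and returns the command line to run against it; repeating with `n=10` gives the second test case.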
|
Ok that's useful. Would it be possible to share those two files? If not, I'll see if I can reproduce in a debugger with two test files of my own. |
Ok, I'll email you the assemblies directly. It doesn't seem to matter which files I try, though; I've attempted to run different datasets with the same issues. |
Hi Carmen, I've tried to reproduce this bug, but I am unable to. I've also run through with valgrind (which lets you check for segfaults) and that doesn't turn anything up during the call to sketching made by You should at least see:
or similar when the code runs. I think any error before that is due to one of the dependencies. I have a vague memory of there being some possible issues with the
If that prints some output, but then eventually fails on graph_tool being missing, then that's the culprit. |
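A segfault inside a compiled dependency kills the interpreter before Python can print a traceback, which would be consistent with the lack of output here. A minimal sketch of one way to find the offending import, assuming a POSIX system: import each candidate module in its own subprocess and inspect the exit status. `try_import` is a hypothetical helper, not part of PopPUNK, and the module list is an assumption drawn from this thread and the package list in the issue (`pp_sketchlib` is assumed to be the import name of pp-sketchlib).

```python
import subprocess
import sys


def try_import(module, python=sys.executable):
    """Import `module` in a fresh interpreter and classify the result."""
    proc = subprocess.run(
        [python, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return "ok", ""
    if proc.returncode < 0:
        # On POSIX a negative return code means the child was killed by
        # that signal number (a segmentation fault is SIGSEGV).
        return "killed", f"signal {-proc.returncode}"
    # Ordinary Python failure: report the last line of the traceback
    lines = proc.stderr.strip().splitlines()
    return "error", lines[-1] if lines else ""


if __name__ == "__main__":
    # Candidate modules are an assumption; adjust to the environment
    for name in ["numpy", "scipy", "h5py", "pp_sketchlib", "graph_tool"]:
        status, detail = try_import(name)
        print(f"{name:15s} {status} {detail}")
```

Running each import in a separate process means a crash in one module does not stop the remaining checks, and a "killed" status points at the module to investigate.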
Hi John. Yes, it looks like graph-tool is the problem:
Is there a previous fix for this issue? |
Ok, glad that we seem to have nailed it down. Unfortunately I don't write or support graph tool so the help I can provide is more limited, but some suggestions I have:
and then uploading
|
Ok, I tried versions of graph-tool in conda back to 2.35 with the same fault. The earliest version in conda-forge, 2.29, then caused other incompatibilities, so I didn't try that. We'll think about the other suggestions you made. Thanks very much for your help! |
@nickjcroucher asked me to raise a formal issue when I mentioned to him
Versions
Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
apscheduler 3.8.1 py39hf3d152e_0 conda-forge
at-spi2-atk 2.38.0 h0630a04_3 conda-forge
at-spi2-core 2.40.3 h0630a04_0 conda-forge
atk-1.0 2.36.0 h3371d22_4 conda-forge
boost 1.74.0 py39h5472131_3 conda-forge
boost-cpp 1.74.0 h312852a_4 conda-forge
brotlipy 0.7.0 py39h3811e60_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2021.10.8 ha878542_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cairo 1.16.0 h6cf1ce9_1008 conda-forge
cairomm 1.12.2 ha770c72_3 conda-forge
cairomm-1.0 1.12.2 h56b4340_3 conda-forge
certifi 2021.10.8 py39hf3d152e_1 conda-forge
cffi 1.14.6 py39h4bc2ebd_1 conda-forge
chardet 4.0.0 py39hf3d152e_1 conda-forge
charset-normalizer 2.0.0 pyhd8ed1ab_0 conda-forge
click 8.0.3 py39hf3d152e_0 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
cryptography 35.0.0 py39h95dcef6_1 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
dataclasses 0.8 pyhc8e2a94_3 conda-forge
dbus 1.13.6 h48d8840_2 conda-forge
dendropy 4.5.2 pyh3252c3a_0 bioconda
epoxy 1.5.9 h7f98852_0 conda-forge
expat 2.4.1 h9c3ff4c_0 conda-forge
flask 2.0.2 pyhd8ed1ab_0 conda-forge
flask-apscheduler 1.12.2 pyhd8ed1ab_1 conda-forge
flask-cors 3.0.10 pyhd8ed1ab_0 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.13.1 hba837de_1005 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fribidi 1.0.10 h36c2ea0_0 conda-forge
gdk-pixbuf 2.42.6 h04a7f16_0 conda-forge
gettext 0.19.8.1 h73d1719_1008 conda-forge
glib 2.70.0 h780b84a_1 conda-forge
glib-tools 2.70.0 h780b84a_1 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
graph-tool 2.43 py39hc4320a7_0 conda-forge
graph-tool-base 2.43 py39h8160539_0 conda-forge
graphite2 1.3.13 h58526e2_1001 conda-forge
gtk3 3.24.29 h8c9bf5d_3 conda-forge
gunicorn 20.1.0 py39hf3d152e_0 conda-forge
h5py 3.2.1 nompi_py39h98ba4bc_100 conda-forge
harfbuzz 3.0.0 h83ec7ef_1 conda-forge
hdbscan 0.8.27 py39hce5d2b2_0 conda-forge
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
hicolor-icon-theme 0.17 ha770c72_2 conda-forge
icu 68.2 h9c3ff4c_0 conda-forge
idna 3.1 pyhd3deb0d_0 conda-forge
itsdangerous 2.0.1 pyhd8ed1ab_0 conda-forge
jbig 2.1 h7f98852_2003 conda-forge
jinja2 3.0.2 pyhd8ed1ab_0 conda-forge
joblib 1.1.0 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
kiwisolver 1.3.2 py39h1a9c180_0 conda-forge
krb5 1.19.2 hcc1bbae_2 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 12_linux64_openblas conda-forge
libcblas 3.9.0 12_linux64_openblas conda-forge
libcups 2.3.3 hf5a7f15_0 conda-forge
libcurl 7.79.1 h2574ce0_1 conda-forge
libdeflate 1.8 h7f98852_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h9c3ff4c_4 conda-forge
libgcc-ng 11.2.0 h1d223b6_11 conda-forge
libgfortran-ng 11.2.0 h69a702a_11 conda-forge
libgfortran5 11.2.0 h5c6108e_11 conda-forge
libgirepository 1.70.0 hb520f89_0 conda-forge
libglib 2.70.0 h174f98d_1 conda-forge
libgomp 11.2.0 h1d223b6_11 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 12_linux64_openblas conda-forge
libnghttp2 1.43.0 h812cca2_1 conda-forge
libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
librsvg 2.52.3 hc3c00ef_0 conda-forge
libssh2 1.10.0 ha56f1ee_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge
libtiff 4.3.0 h6f004c6_2 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.1 h7f98852_0 conda-forge
libxcb 1.13 h7f98852_1003 conda-forge
libxml2 2.9.12 h72842e0_0 conda-forge
libzlib 1.2.11 h36c2ea0_1013 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markupsafe 2.0.1 py39h3811e60_0 conda-forge
matplotlib-base 3.4.3 py39h2fa2bec_1 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
networkx 2.6.3 pyhd8ed1ab_1 conda-forge
numpy 1.21.3 py39hdbf815f_0 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openblas 0.3.18 pthreads_h4748800_0 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1l h7f98852_0 conda-forge
pandas 1.3.4 py39hde0f152_0 conda-forge
pango 1.48.10 h54213e6_2 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pillow 8.3.2 py39ha612740_0 conda-forge
pip 21.3.1 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
poppunk 2.4.0 py39h7f0572b_0 bioconda
pp-sketchlib 1.7.4 py39hdefe18a_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pycairo 1.20.1 py39hedcb9fc_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pygobject 3.42.0 py39ha6f447c_0 conda-forge
pyopenssl 21.0.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.4 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 py39hf3d152e_3 conda-forge
python 3.9.7 hb7a2778_3_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-tzdata 2021.5 pyhd8ed1ab_0 conda-forge
python_abi 3.9 2_cp39 conda-forge
pytz 2021.3 pyhd8ed1ab_0 conda-forge
pytz-deprecation-shim 0.1.0.post0 py39hf3d152e_0 conda-forge
rapidnj 2.3.2 h7d875b9_1 bioconda
readline 8.1 h46c0cb4_0 conda-forge
requests 2.26.0 pyhd8ed1ab_0 conda-forge
scikit-learn 1.0.1 py39h7c5d8c9_1 conda-forge
scipy 1.7.1 py39hee8e79c_0 conda-forge
setuptools 58.4.0 py39hf3d152e_1 conda-forge
sigcpp-2.0 2.10.7 h9c3ff4c_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sparsehash 2.0.4 h9c3ff4c_0 conda-forge
sqlite 3.36.0 h9cd32fc_2 conda-forge
threadpoolctl 3.0.0 pyh8a188c0_0 conda-forge
tk 8.6.11 h27826a3_1 conda-forge
tornado 6.1 py39h3811e60_1 conda-forge
tqdm 4.62.3 pyhd8ed1ab_0 conda-forge
tzdata 2021e he74cb21_0 conda-forge
tzlocal 2.1 pyh9f0ad1d_0 conda-forge
urllib3 1.26.7 pyhd8ed1ab_0 conda-forge
werkzeug 2.0.1 pyhd8ed1ab_0 conda-forge
wheel 0.37.0 pyhd8ed1ab_1 conda-forge
xorg-compositeproto 0.4.2 h7f98852_1001 conda-forge
xorg-damageproto 1.2.1 h7f98852_1002 conda-forge
xorg-fixesproto 5.0 h7f98852_1002 conda-forge
xorg-inputproto 2.3.2 h7f98852_1002 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.6.12 h36c2ea0_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxaw 1.0.14 h7f98852_0 conda-forge
xorg-libxcomposite 0.4.5 h7f98852_0 conda-forge
xorg-libxcursor 1.2.0 h516909a_0 conda-forge
xorg-libxdamage 1.1.5 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxfixes 5.0.3 h516909a_1004 conda-forge
xorg-libxi 1.7.10 h516909a_0 conda-forge
xorg-libxinerama 1.1.4 h9c3ff4c_1001 conda-forge
xorg-libxmu 1.1.3 h516909a_0 conda-forge
xorg-libxpm 3.5.13 h516909a_0 conda-forge
xorg-libxrandr 1.5.2 h516909a_1 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-libxt 1.1.5 h516909a_1003 conda-forge
xorg-libxtst 1.2.3 h516909a_1002 conda-forge
xorg-randrproto 1.5.0 h7f98852_1001 conda-forge
xorg-recordproto 1.14.2 h7f98852_1002 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-util-macros 1.19.3 h7f98852_0 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h36c2ea0_1013 conda-forge
zstandard 0.16.0 py39h3811e60_0 conda-forge
zstd 1.5.0 ha95c52a_0 conda-forge
Command used and output returned
poppunk --create-db --output poppunk/poppunk_db --r-files files.txt --threads 1
Segmentation fault (core dumped)
Describe the bug
I cannot run poppunk --create-db in the HPC environment. After a few seconds it reports a segmentation fault (core dumped) and does not give any other output. Other commands run successfully.
I have deleted and recreated the poppunk environment on the cluster and tried starting afresh.
I can run poppunk fine on my laptop. I have checked the versions of the software on the HPC against the same software on my laptop from conda env list, and then manually installed any versions that were different, but this didn't change the error. Now everything in my laptop poppunk env is the same as in the HPC poppunk env.
The HPC is running an older version of conda, 4.9.2 (laptop: 4.10.3), which I cannot update. I'm wondering if there's something that --create-db calls that causes the issue on the HPC?
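The manual comparison of the two environments could be automated along these lines. This is only a sketch, assuming the output of `conda list` has been saved to a text file on each machine (the file names `laptop.txt` and `hpc.txt` in the usage note are hypothetical). It compares the build string as well as the version, because two installs with the same version number can still be different builds.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from `conda list` output."""
    pkgs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines conda prints
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        build = fields[2] if len(fields) > 2 else ""
        pkgs[fields[0]] = (fields[1], build)
    return pkgs


def diff_envs(a, b):
    """Return packages whose (version, build) differ or exist on one side."""
    diffs = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diffs[name] = (a.get(name), b.get(name))
    return diffs
```

With the two saved files read into strings, `diff_envs(parse_conda_list(laptop_text), parse_conda_list(hpc_text))` returns an empty dict when the environments match; a package missing on one side shows up with `None` for that side.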