Assistance with HyperQueue PBS Allocation on a server #805

Open
KosarHooshmand opened this issue Jan 30, 2025 · 1 comment

@KosarHooshmand

I am currently attempting to run a Nextflow workflow using HyperQueue on a server but have encountered an issue with job allocation.

Specifically, I ran the following commands for allocation:

hq alloc add pbs --time-limit 12:00:00 --max-worker-count 2 --cpus 48 -- -P na1 -q normalsr -lncpus=104,mem=500G,storage=scratch/wq2+gdata/gp58+gdata/wq2
hq alloc add pbs --time-limit 48h -- -P wq2 -q normalsr -l walltime=48:00:00,ncpus=104,mem=100GB,jobfs=100GB,storage=gdata/gp58+gdata/wq2+scratch/wq2

The error message I received for both was:

Error: Received error: "Could not submit allocation: qsub execution failed\n\nCaused by:\n    Exit code: 32\n    Stderr: qsub: Error: The system doesn't support the use of \"-l select\". Please use \"-l ncpus\" and \"-l mem\" instead.\n    Stdout: " 

It seems like there may be a misconfiguration in how I am specifying the allocation parameters, but I cannot pinpoint the issue.

I would greatly appreciate your guidance in resolving this.

@Kobzol
Collaborator

Kobzol commented Jan 30, 2025

Hi, it's not a misconfiguration on your side. HyperQueue uses -lselect to tell PBS how many nodes should be requested in an allocation. However, it looks like your specific PBS version/instance does not accept -lselect, and instead wants different parameters (hence the error message).

This is a rather common issue with PBS: it exists in dozens of flavors and versions that differ slightly in their behavior and, sadly, also in their CLI parameters. HyperQueue can't know the specifics of every PBS installation, so it tries some generic CLI parameters, which don't work here.

How do you start PBS allocations manually on the cluster? Is -lmem required? Can you specify the number of nodes to spawn, rather than specifying the number of CPUs?
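
For reference, a manual submission that avoids -lselect might look roughly like the sketch below. It only reuses the -P, -q and -l values from the second command in the question. The script name job.sh is a placeholder, and whether this cluster accepts exactly this form is an assumption. The snippet only prints the command so it can be inspected before anything is submitted.

```shell
# Dry run: assemble a manual qsub command and print it without submitting.
# All values are copied from the second `hq alloc add pbs` command above;
# job.sh is a hypothetical placeholder for the actual job script.
project="wq2"
queue="normalsr"
resources="walltime=48:00:00,ncpus=104,mem=100GB,jobfs=100GB,storage=gdata/gp58+gdata/wq2+scratch/wq2"

# Resources are requested with -l ncpus/mem only, as the qsub error suggests.
manual_cmd="qsub -P $project -q $queue -l $resources job.sh"

echo "$manual_cmd"
```

If a command of this shape is what works on the cluster, posting it here would show which parameters the PBS instance expects in place of -lselect.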
