Stuck at waitForEvents #10015

Open
pereverges opened this issue Jul 16, 2024 · 1 comment
@pereverges

Describe the bug

The code gets stuck in worker.waitForEvents(). I do not understand how this is possible if the call is non-blocking. Moreover, calling worker.close() while it is blocked still does not make execution continue to the following line. Any idea why this happens?

Steps to Reproduce

  • Command line
  • UCX version 1.16

Setup and versions

  • Ubuntu 16.4 + CPU architecture (x86_64)

Additional information (depending on the issue)

  • "UCXListener" #16 prio=5 os_prio=0 tid=0x00007fb390001000 nid=0x1c850 runnable [0x00007fb3ccb78000]
    java.lang.Thread.State: RUNNABLE
    at org.openucx.jucx.ucp.UcpWorker.waitWorkerNative(Native Method)
    at org.openucx.jucx.ucp.UcpWorker.waitForEvents(UcpWorker.java:170)
    at es.bsc.comm.ucx.UCXListener.run(UCXListener.java:142)
    while (!this.stop) {
        try {
            try {
                if (worker.progress() == 0) {
                    LOGGER.info("Waiting...");
                    worker.waitForEvents();
                    LOGGER.info("After Waiting...");
                }
            } catch (Exception e) {
                System.out.println("UCX: ERROR " + e);
                LOGGER.info("UCX: ERROR " + e);
                // worker.cancelRequest(null);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        LOGGER.info("RUN EVENTS " + this.stop);
    }
pereverges added the Bug label Jul 16, 2024
@yosefe
Contributor

yosefe commented Jul 16, 2024

waitForEvents is calling ucp_worker_wait, which is a blocking function call. Can you try calling worker.signal() to make it exit?
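
For illustration, the suggestion could be sketched as below: set the stop flag first, then signal the worker from another thread so the blocked wait returns and the loop re-checks the flag. This is a minimal sketch, not the JUCX implementation. The `Worker` interface and `FakeWorker` class are hypothetical stand-ins that model only the three calls named in this thread (`progress()`, `waitForEvents()`, `signal()`); `Listener`, `shutdown()` and `demo()` are likewise invented for the example.

```java
import java.util.concurrent.Semaphore;

class ListenerShutdownSketch {

    /** Hypothetical stand-in for the worker calls discussed in this issue. */
    interface Worker {
        int progress();        // returns the number of events handled
        void waitForEvents();  // blocks until an event or a signal arrives
        void signal();         // wakes a thread blocked in waitForEvents()
    }

    /** Fake worker that never receives events; only signal() ends a wait. */
    static class FakeWorker implements Worker {
        private final Semaphore wakeups = new Semaphore(0);

        public int progress() { return 0; }
        public void waitForEvents() { wakeups.acquireUninterruptibly(); }
        public void signal() { wakeups.release(); }
    }

    /** Listener loop shaped like the one in the report. */
    static class Listener implements Runnable {
        private final Worker worker;
        private volatile boolean stop = false;

        Listener(Worker worker) { this.worker = worker; }

        public void run() {
            while (!stop) {
                if (worker.progress() == 0) {
                    worker.waitForEvents(); // blocking call
                }
            }
        }

        /** Called from another thread to end the loop. */
        void shutdown() {
            stop = true;      // set the flag before waking the thread
            worker.signal();  // unblock waitForEvents() so the flag is re-read
        }
    }

    /** Starts the listener, shuts it down, and reports whether it exited. */
    static boolean demo() {
        Listener listener = new Listener(new FakeWorker());
        Thread thread = new Thread(listener, "UCXListener");
        thread.setDaemon(true); // do not keep the JVM alive if the sketch misbehaves
        thread.start();
        try {
            Thread.sleep(100);  // give the listener time to block
            listener.shutdown();
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("listener exited: " + demo());
    }
}
```

With a real worker, the same ordering would presumably apply: set the flag, call worker.signal(), and close the worker only after the listener thread has left the loop.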
