HQ crashes #743

Open
chkabir opened this issue Aug 23, 2024 · 4 comments

chkabir commented Aug 23, 2024

Hi,

I was running an HQ server on the Oven node at metacentrum.cz. The Oven node is designed to let processes run for a long time, even past their walltime. However, in my last few runs the HQ server keeps crashing. Below I attach the relevant statements from the log file:

97: 0x557345bba500 - main
98: 0x154fcfff624a - __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
99: 0x154fcfff6305 - __libc_start_main_impl
at ./csu/../csu/libc-start.c:360:3
100: 0x557345ad4049 - <unknown>
101: 0x0 - <unknown>
Oops, HyperQueue has crashed. This is a bug, sorry for that.
If you would be so kind, please report this issue at the HQ issue tracker: https://github.com/It4innovations/hyperqueue/issues/new?title=HQ%20crashes
Please include the above error (starting from "thread ... panicked ...") and the stack backtrace in the issue contents, along with the following information:

HyperQueue version: v0.19.0

You can also re-run HyperQueue server (and its workers) with the RUST_LOG=hq=debug,tako=debug
environment variable, and attach the logs to the issue, to provide us more information.

Could you kindly look into this error?

Collaborator

Kobzol commented Aug 23, 2024

Hi, thanks for the report. It looks like you have cut out the most important part of the stack trace though (you only sent the lines starting at stack frame #97) :) Could you please include the whole stack trace? Thanks!

Author

chkabir commented Aug 23, 2024

Sorry about that. This is the whole stack trace:

thread 'main' panicked at crates/tako/src/internal/server/worker.rs:126:9:
assertion failed: self.sn_tasks.remove(&task.id)
stack backtrace:
0: 0x557345f0abf9 - std::backtrace_rs::backtrace::libunwind::trace::hbee8a7973eeb6c93
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
1: 0x557345f0abf9 - std::backtrace_rs::backtrace::trace_unsynchronized::hc8ac75eea3aa6899
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x557345f0abf9 - std::sys_common::backtrace::_print_fmt::hc7f3e3b5298b1083
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
3: 0x557345f0abf9 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hbb235daedd7c6190
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
4: 0x557345c55b60 - core::fmt::rt::Argument::fmt::h76c38a80d925a410
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
5: 0x557345c55b60 - core::fmt::write::h3ed6aeaa977c8e45
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
6: 0x557345ed387e - std::io::Write::write_fmt::h78b18af5775fedb5
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
7: 0x557345f0cc2e - std::sys_common::backtrace::_print::h5d645a07e0fcfdbb
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
8: 0x557345f0cc2e - std::sys_common::backtrace::print::h85035a511aafe7a8
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
9: 0x557345f0c4d7 - std::panicking::default_hook::{{closure}}::hcce8cea212785a25
10: 0x557345f0c0bf - std::panicking::default_hook::hf5fcb0f213fe709a
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
11: 0x557345bb9eeb - call<(&core::panic::panic_info::PanicInfo), (dyn core::ops::function::Fn<(&core::panic::panic_info::PanicInfo), Output=()> + core::marker::Send + core::marker::Sync), alloc::alloc::Global>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
12: 0x557345bb9eeb - {closure#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:360:9
13: 0x557345f0d21a - <alloc::boxed::Box<F,A> as core::ops::function::Fn>::call::hbc5ccf4eb663e1e5
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
14: 0x557345f0d21a - std::panicking::rust_panic_with_hook::h095fccf1dc9379ee
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:783:13
15: 0x557345f0cf68 - std::panicking::begin_panic_handler::{{closure}}::h032ba12139b353db
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:649:13
16: 0x557345f0cef6 - std::sys_common::backtrace::__rust_end_short_backtrace::h9259bc2ff8fd0f76
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
17: 0x557345f0ceef - rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
18: 0x557345a94074 - core::panicking::panic_fmt::h784f20a50eaab275
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
19: 0x557345a94242 - core::panicking::panic::hb837a5ebbbe5b188
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
20: 0x557345f1e426 - remove_sn_task
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/worker.rs:126:9
21: 0x557345b60477 - on_task_finished<tako::internal::server::comm::CommSender>
22: 0x557345b60477 - {async_fn#0}<futures_util::stream::stream::split::SplitStream<tokio_util::codec::framed::Framed<tokio::net::tcp::stream::TcpStream, tokio_util::codec::length_delimited::LengthDelimitedCodec>>>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:270:17
23: 0x557345b60477 - {closure#2}
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/macros/select.rs:524:49
24: 0x557345b60477 - poll<tako::internal::server::rpc::worker_rpc_loop::{async_fn#0}::__tokio_select_util::Out<core::result::Result<core::option::Option<tako::internal::messages::worker::WorkerStopReason>, tako::internal::common::error::DsError>, core::result::Result<(), std::io::error::Error>, tako::gateway::LostWorkerReason>, tako::internal::server::rpc::worker_rpc_loop::{async_fn#0}::{closure_env#2}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/future/poll_fn.rs:58:9
25: 0x557345b60477 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:212:18
26: 0x557345b87b4b - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:64:83
27: 0x557345b87b4b - {closure#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/core.rs:328:17
28: 0x557345b87b4b - with_mut<tokio::runtime::task::core::Stage<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}>, core::task::poll::Poll<()>, tokio::runtime::task::core::{impl#6}::poll::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/loom/std/unsafe_cell.rs:16:9
29: 0x557345b87b4b - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/core.rs:317:30
30: 0x557345b87b4b - {closure#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:485:19
31: 0x557345b87b4b - call_once<core::task::poll::Poll<()>, tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panic/unwind_safe.rs:272:9
32: 0x557345b87b4b - do_call<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>, core::task::poll::Poll<()>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
33: 0x557345b87b4b - try<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
34: 0x557345b87b4b - catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>, core::task::poll::Poll<()>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
35: 0x557345b87b4b - poll_future<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:473:18
36: 0x557345b87b4b - poll_inner<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:208:27
37: 0x557345b87b4b - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:153:15
38: 0x557345b87b4b - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/raw.rs:271:5
39: 0x557345f6f399 - poll
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/raw.rs:201:18
40: 0x557345f6f399 - run<alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/mod.rs:416:9
41: 0x557345f6f399 - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:676:68
42: 0x557345f6f399 - with_budget<(), tokio::task::local::{impl#4}::tick::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:107:5
43: 0x557345f6f399 - budget<(), tokio::task::local::{impl#4}::tick::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:73:5
44: 0x557345f6f399 - tick
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:676:31
45: 0x557345b59b7a - {closure#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:982:16
46: 0x557345b59b7a - {closure#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:730:13
47: 0x557345b59b7a - try_with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>, core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
48: 0x557345b59b7a - with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>, core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
49: 0x557345b59b7a - with<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:728:17
50: 0x557345b59b7a - poll<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:968:9
51: 0x557345b59b7a - {async_fn#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:635:19
52: 0x557345b59b7a - {async_fn#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}, core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/common/taskgroup.rs:15:36
53: 0x557345af3388 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:48:68
54: 0x557345af3388 - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/macros/select.rs:524:49
55: 0x557345af3388 - poll<tako::internal::server::start::server_start::{async_fn#0}::{async_block#0}::__tokio_select_util::Out<(), core::result::Result<(), tako::internal::common::error::DsError>>, tako::internal::server::start::server_start::{async_fn#0}::{async_block#0}::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/future/poll_fn.rs:58:9
56: 0x557345af3388 - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/start.rs:92:9
57: 0x557345af3388 - {closure#2}
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/macros/select.rs:524:49
58: 0x557345b0fd83 - poll<hyperqueue::server::backend::{impl#0}::start::{async_fn#0}::{async_block#1}::__tokio_select_util::Out<core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), tako::internal::common::error::DsError>>, hyperqueue::server::backend::{impl#0}::start::{async_fn#0}::{async_block#1}::{closure_env#2}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/future/poll_fn.rs:58:9
59: 0x557345b0fd83 - {async_block#1}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/backend.rs:165:13
60: 0x557345b0fd83 - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/macros/select.rs:524:49
61: 0x557345b0fd83 - poll<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block#4}::__tokio_select_util::Out<(), (), (), core::result::Result<(), hyperqueue::common::error::HqError>>, hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block#4}::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/future/poll_fn.rs:58:9
62: 0x557345b0fd83 - {async_block#4}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:253:22
63: 0x557345b0fd83 - {closure#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:978:42
64: 0x557345b0fd83 - {closure#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:730:13
65: 0x557345b0fd83 - try_with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>>, core::task::poll::Poll<core::result::Result<(), anyhow::Error>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
66: 0x557345b0fd83 - with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>>, core::task::poll::Poll<core::result::Result<(), anyhow::Error>>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
67: 0x557345b0fd83 - with<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:728:17
68: 0x557345b0fd83 - poll<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:968:9
69: 0x557345b0fd83 - {async_fn#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#4}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/task/local.rs:635:19
70: 0x557345b0fd83 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:368:30
71: 0x557345bb1405 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:71:49
72: 0x557345bb1405 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/commands/server.rs:159:43
73: 0x557345bb1405 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/commands/server.rs:115:69
74: 0x557345bb1405 - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:386:70
75: 0x557345b9f83d - poll<&mut hq::main::{async_block_env#0}>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/future/future.rs:124:9
76: 0x557345b9f83d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:57
77: 0x557345b9f83d - with_budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:107:5
78: 0x557345b9f83d - budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:73:5
79: 0x557345b9f83d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:25
80: 0x557345b9f83d - enter<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:404:19
81: 0x557345b9f83d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:658:36
82: 0x557345b9f83d - {closure#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:68
83: 0x557345b9f83d - set<tokio::runtime::scheduler::Context, tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/scoped.rs:40:9
84: 0x557345b9f83d - {closure#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:26
85: 0x557345b9f83d - try_with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
86: 0x557345b9f83d - with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
87: 0x557345b9f83d - set_scheduler<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:17
88: 0x557345b9f83d - enter<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:27
89: 0x557345b9f83d - block_on<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:646:19
90: 0x557345b9f83d - {closure#0}<hq::main::{async_block_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:175:28
91: 0x557345b9f83d - enter_runtime<tokio::runtime::scheduler::current_thread::{impl#0}::block_on::{closure_env#0}<hq::main::{async_block_env#0}>, core::result::Result<(), hyperqueue::common::error::HqError>>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/runtime.rs:65:16
92: 0x557345b9f83d - block_on<hq::main::{async_block_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:167:9
93: 0x557345b9f83d - block_on<hq::main::{async_block_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/runtime.rs:348:47
94: 0x557345b9f83d - main
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:456:5
95: 0x557345b2b203 - call_once<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, ()>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
96: 0x557345b2b203 - __rust_begin_short_backtrace<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), hyperqueue::common::error::HqError>>
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:155:18
97: 0x557345bba500 - main
98: 0x154fcfff624a - __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
99: 0x154fcfff6305 - __libc_start_main_impl
at ./csu/../csu/libc-start.c:360:3
100: 0x557345ad4049 - <unknown>
101: 0x0 - <unknown>
Oops, HyperQueue has crashed. This is a bug, sorry for that.
If you would be so kind, please report this issue at the HQ issue tracker: https://github.com/It4innovations/hyperqueue/issues/new?title=HQ%20crashes
Please include the above error (starting from "thread ... panicked ...") and the stack backtrace in the issue contents, along with the following information:

HyperQueue version: v0.19.0

You can also re-run HyperQueue server (and its workers) with the RUST_LOG=hq=debug,tako=debug
environment variable, and attach the logs to the issue, to provide us more information.

Collaborator

Kobzol commented Aug 23, 2024

Oops, that looks like some race condition, we will take a look.
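
For readers following along, here is a minimal Rust sketch of how an assertion of the shape `assert!(self.sn_tasks.remove(&task.id))` can fail. It assumes, hypothetically, that `sn_tasks` is a set of task IDs and that the same task gets reported as finished twice. The `Worker` struct, the `u64` ID type, and the `_strict`/`_checked` method names are illustrative only and are not HyperQueue's actual code; the real cause of this crash has not been established in this thread.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the server-side worker state.
/// Only the field named in the panic message is modeled here.
struct Worker {
    sn_tasks: HashSet<u64>,
}

impl Worker {
    /// Same shape as the failing assertion: panics if the task is not tracked.
    fn remove_sn_task_strict(&mut self, task_id: u64) {
        assert!(self.sn_tasks.remove(&task_id));
    }

    /// Tolerant variant (an assumption, not the project's fix):
    /// returns whether the task was present instead of panicking.
    fn remove_sn_task_checked(&mut self, task_id: u64) -> bool {
        self.sn_tasks.remove(&task_id)
    }
}

fn main() {
    let mut worker = Worker {
        sn_tasks: HashSet::from([1, 2]),
    };

    // First "task finished" event: the ID is tracked, so the assertion holds.
    worker.remove_sn_task_strict(1);

    // A second event for the same ID finds nothing to remove. `HashSet::remove`
    // returns false here, which is the condition the strict variant panics on.
    let removed_again = worker.remove_sn_task_checked(1);
    println!("second removal found the task: {removed_again}");
}
```

`HashSet::remove` returns `true` only when the value was present, so any ordering of events that removes the same ID twice, or removes an ID that was never inserted, would trip the strict assertion.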

If you can reproduce the error, could you please run the server with the following environment variable set: RUST_LOG=hq=debug,tako=debug hq server start, and then send us the full debug log if it crashes again? It would help us debug it.

It would also be great to know how you create workers (manually/autoalloc?) and what hq submit commands you are using.

Collaborator

Kobzol commented Sep 13, 2024

@chkabir Were you able to reproduce the issue and/or run HQ with more logging? :)
