Get thread and task backtraces before terminating a worker on timeout #157
…eout * Introduce `timeout_backtraces` to control timeout-triggered thread+task backtraces
Also ignore SIGUSR2 (used by Julia to pause a thread). Redirect GDB output to a file, `gdb.btall`; otherwise thread backtraces are dumped to the master process' `stdout`.
5d64274 to 34001f5
CI doesn't seem to have gdb so the tests aren't running
I think this is worth investigating a little before we merge this
also left a few minor suggestions
src/workers.jl
function trigger_backtraces(w::Worker, from::Symbol=:manual)
    if Sys.islinux()
        @debug "using GDB to get thread and task backtraces on worker $(w.pid) from $from"
        gdb_cmd = `gdb -ex "handle SIGSEGV noprint nostop pass" -ex "handle SIGUSR2 noprint nostop pass" -ex "set pagination 0" -ex "set logging overwrite on" -ex "set logging file gdb.btall" -ex "set logging redirect on" -ex "set logging enabled on" -ex "thread apply all bt" -ex "set logging enabled off" -ex "call jl_print_task_backtraces(1)" --batch -p $(w.pid)`
there is a lot going on here... i think we need a comment (maybe a long one)
perhaps we write this out over multiple lines so we can add comments about blocks of `-ex` commands, like
gdb_cmd = Cmd([
    "gdb",
    # comment here about SIGSEGV and SIGUSR2
    "-ex", "handle SIGSEGV noprint nostop pass",
    "-ex", "handle SIGUSR2 noprint nostop pass",
    "-ex", "set pagination 0",
    "-ex", "set logging overwrite on",
    "-ex", "set logging file gdb.btall",
    "-ex", "set logging redirect on",
    # comment here about getting traces for all threads
    "-ex", "set logging enabled on",
    "-ex", "thread apply all bt",
    "-ex", "set logging enabled off",
    # comment here...
    "-ex", "call jl_print_task_backtraces(1)",
    "--batch", "-p", "$(w.pid)"
])
e.g. i'm thinking a block like this needs explaining (probably not to a gdb connoisseur but to some of us less familiar with this wizardry):
-ex "set logging enabled on" -ex "thread apply all bt" -ex "set logging enabled off"
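As one way to address the request for comments, here is a sketch of the same invocation with each group of `-ex` commands annotated. The helper name `build_gdb_cmd` and the `logfile` keyword are hypothetical, not part of the PR; the GDB commands themselves are copied from the diff. The notes on what each GDB setting does reflect my reading of the GDB manual and the commit message, so treat them as a starting point rather than the final wording.

```julia
# Hypothetical helper (name and keyword are illustrative, not from the PR):
# assemble the GDB command used to dump thread and task backtraces for `pid`.
function build_gdb_cmd(pid::Integer; logfile::AbstractString="gdb.btall")
    return Cmd([
        "gdb",
        # Julia raises SIGSEGV internally during normal operation and, per the
        # commit message, uses SIGUSR2 to pause a thread. Tell GDB not to stop
        # or print on either signal and to pass it through to the worker.
        "-ex", "handle SIGSEGV noprint nostop pass",
        "-ex", "handle SIGUSR2 noprint nostop pass",
        # Never wait for a keypress between screens of output.
        "-ex", "set pagination 0",
        # Log to `logfile`, replacing any previous contents, and send output
        # only to the log rather than to GDB's own stdout as well.
        "-ex", "set logging overwrite on",
        "-ex", "set logging file $logfile",
        "-ex", "set logging redirect on",
        # Logging is switched on only around the thread dump, so the log
        # contains just the native backtrace of every OS thread.
        "-ex", "set logging enabled on",
        "-ex", "thread apply all bt",
        "-ex", "set logging enabled off",
        # Call into the Julia runtime inside the worker to print the task
        # backtraces; this output is produced by the worker, not by GDB.
        "-ex", "call jl_print_task_backtraces(1)",
        # Attach to the worker, run the commands above, then exit.
        "--batch", "-p", string(pid),
    ])
end
```

`build_gdb_cmd(w.pid)` would then stand in for the backtick literal. Note that `set logging enabled on` is the spelling used by newer GDB releases; older ones use `set logging on`.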
@testset "Backtraces timeout trigger" begin
    function gdb_available()
        try
            run(`gdb -ex "exit"`)
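The hunk above is cut off by the diff view. For comparison, a minimal sketch of an availability check that does not launch GDB at all, assuming it is enough to know that a `gdb` executable is on `PATH` (this is an alternative, not the PR's implementation):

```julia
# Sketch: report whether a `gdb` executable can be found on PATH.
# `Sys.which` returns the full path, or `nothing` if the program is missing.
gdb_available() = Sys.which("gdb") !== nothing
```

This only shows that the binary exists; it says nothing about whether GDB is allowed to attach to another process.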
is there a way to run CI on machines with `gdb` available? why don't the linux CI machines have it?
Yes, we need a step in the CI workflow that has:
- name: Install GDB
  run: sudo apt-get install -y gdb
But CI.yml is written in a platform-independent way and I haven't figured out how to add that step for Ubuntu only. If you know, that would be great.
yeah, i think that just needs an `if:` line like
- name: Install GDB
  if: matrix.os == 'ubuntu-latest'
  run: sudo apt-get install -y gdb
not certain on the syntax, might need to be
- name: Install GDB
  if: ${{ matrix.os }} == 'ubuntu-latest'
  run: sudo apt-get install -y gdb
or
- name: Install GDB
  if: ${{ matrix.os == 'ubuntu-latest' }}
  run: sudo apt-get install -y gdb
I added it, but the CI machine does not allow `gdb` to attach to the process:
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Inappropriate ioctl for device.
So I've backed it out. 🤷♂️
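The error message points at the Yama `ptrace_scope` setting. A test could check that setting up front and skip the GDB test when attaching is not expected to work. The sketch below is an assumption-laden illustration: the function name `ptrace_attach_allowed` is hypothetical, it treats only scope `0` as permitting a non-root `gdb -p` on a process that is not GDB's own descendant, and it assumes a missing file means Yama is not in use. Whether this setting is the actual cause of the CI failure is not established in this thread.

```julia
# Hypothetical helper: guess whether a non-root GDB can attach to another
# process owned by the same user, based on the Yama ptrace_scope value.
# Assumption: scope 0 allows it; any other value (or unreadable content) does not.
function ptrace_attach_allowed(path::AbstractString="/proc/sys/kernel/yama/ptrace_scope")
    # No such file: assume Yama is not active and classic ptrace rules apply.
    isfile(path) || return true
    scope = tryparse(Int, strip(read(path, String)))
    return scope === 0
end
```

In the test file this could be combined with the availability check, e.g. `if gdb_available() && ptrace_attach_allowed()`, to guard the testset.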
Co-authored-by: Nick Robinson <[email protected]>
Thread backtraces are displayed on the GDB process' standard out rather than on the worker's standard out. Use an IOBuffer on the main process to capture the GDB process' output (instead of using a file) and dump its contents after the captured logs. Also add some comments per PR review comments.
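A minimal sketch of the IOBuffer approach described in this commit message, assuming the command's output is small enough to hold in memory. The function name `run_and_capture` is hypothetical and the PR's actual code may be structured differently.

```julia
# Sketch: run `cmd` and return everything it wrote to stdout and stderr.
function run_and_capture(cmd::Cmd)
    buf = IOBuffer()
    # ignorestatus: a non-zero exit code should not throw here, since the
    # goal is to collect whatever output the process managed to produce.
    run(pipeline(ignorestatus(cmd); stdout=buf, stderr=buf))
    return String(take!(buf))
end
```

The returned string can then be printed after the worker's captured logs, which keeps the thread backtraces next to the output of the worker they belong to instead of leaving them in a separate file.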
This reverts commit b923167.
0dce130 to d9bf1d3
Because it might be taking a long time.
1038ce5 to 4836727
After much testing, it turns out that In my latest (still running) test:
Whatever the cause, using |
Implements the alternative described in #105 as an option.
I think a similar incantation can be assembled for `lldb`, to support Mac. We could support FreeBSD/OpenBSD since they have `gdb`, but I don't have a machine that would let me test. Note that `Sys.isbsd()` returns `true` for Mac.
Closing comment.