-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate stderr from pyre.bin #863
base: main
Are you sure you want to change the base?
Conversation
@@ -192,12 +192,17 @@ def _run_check_command(command: Sequence[str]) -> CheckResult: | |||
result = subprocess.run( | |||
command, | |||
stdout=subprocess.PIPE, | |||
stderr=log_file.file, | |||
stderr=subprocess.PIPE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would make more sense to leave this as a file, but dump the file whenever the return code is nonzero.
This might be better if the binary were ever to dump a huge amount to stderr, plus it makes it possible to cat the file if you want to check for messages as the command is running, and it would be enough to fix the issue we just saw where we weren't seeing the glibc link error.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about I stream the logs to both the file and the python wrapper STDERR? I think it's always worth propagating the stderr as it makes debugging stuff much easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait I think I see what's going on here.
Pyre is piping the stdout, which will only contain type errors, but it is using the log file for stderr which includes all the ocaml logging as well as this backtrace.
So we definitely do want the log file, and we probably don't want to pipe the stderr directly because the python side is doing some prettifying (especially in interactive mode).
I think the problem must be that sometimes not all the logs actually get spit out before we get to line 224 in the case where the binary logs a stack trace and exits nonzero. The problem might be with the assumptions in this code around start.background_logging
and how eagerly it logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grievejia Do you remember how this works exactly?
I'm not an expert on context managers, but what we saw was that when the pyre binary crashed with a fairly low-level stack trace (a glibc version mismatch), Pyre was never displaying the stderr output.
I wonder if the problem might actually be in the client/log/log.py
:
if <big switch on log section from pyre>
...
else:
LOG.debug(line)
this may mean we are doing debug-level logging of stack traces and so they are invisible by default
@vthemelis and / or I can dig into this but wondering if you remember off the top of your head
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, indeed, that seems to be the case (which is why the same error comes up as Debug in this older issue).
Looks like if the log-line doesn't have the usual date-time SEVERITY message
format, pyre will be using LOG.Debug
to log this line or (I think) the severity of the previously logged line.
Even this though doesn't quite make sense as I can see DEBUG messages in the output provided in #860 yet the libc errors do not seem to have propagated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that thread joining is the underlying problem, but given that it's a context manager I think it may be that introducing a delay is the easiest fix. I'd be fine with adding a print
or time.sleep
statement anywhere that fixes the issue for now, with a comment noting the issue it's addressing (could include a link to this discussion).
I slightly prefer using a delay versus delete=False
because we probably don't want the backend file to persist if we can avoid it, it's really only there to pipe logs to the frontend. But it's a temp file that will get cleaned up eventually anyway, so maybe my concern is misplaced here.
@samwgoldman might have more ideas on how to join the threads in a more principled way, I'm very much not a threading expert. But I'd like to get a short-term fix in soon and we can improve on it later rather than block this PR, anything that fixes the dropped logs issue is great work and a fix we want sooner than later!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really prefer to avoid adding a sleep call. Can we confirm where the log is being dropped? Based on my reading of the code what should be happening:
- pyre.bin's stderr is redirected to a log file
- the log file is being "tail"ed by
log.log_tailer
and converted into an Iterable for each line - the
StreamLogger
prefixes each line and writes to stderr via theLOG
logger
I'm not super familiar with the code in (2) and (3), but honestly it looks overcomplicated. I think it's nice to have a separate thread to read from the file, but that thread should just readline
on the file descriptor until EOF
. When we join that thread, the join should wait for the thread to hit that EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, thinking more about this, I think we should move this code to use the lower-level subprocess.Popen
code, then read from stderr and stdout in two separate threads. This is what subprocess.communicate
does (and what subprocess.run
does in turn).
The thread reading from stderr can then output to the log file and transform+log to stderr. This is much nicer than redirecting to the file and then trying to also tail that file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a little more looking at this, I think the real issue isn't lack of a join, since we do join
. I think the real issue is that StreamLogger
has two modes it runs in, a best-effort tailing mode and a proper logging mode.
The tailing mode is used in some situations where we start a pyre daemon, e.g. start._run_in_background
, and the logger is currently optimized for that case by having self._should_stop_reading_stream
which gets set on __exit__
, so that whenever the main thread decides we're done we stop reading logs.
But the proper logging mode is what we want for shell commands like pyre check
, and when in that mode we really shouldn't be using _should_stop_reading_stream
at all, we should run to completion.
This suggests either using two classes or an attribute to differentiate the cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops I missed Sam's second comment.
Using Popen and two threads (or even just one thread, stdout is going to be used in the foreground process anyway) should work well here too, and then the existing file-based StreamLogger could be used only for deamonized logic like start._run_in_background
.
We can't completely rid of the file-based approach because of the need to use a file in the daemon case; the existing code is trying to use the same approach for both which is indeed questionable.
Pre-submission checklist
pip install -r requirements-dev.txt && pre-commit install
pre-commit run
Summary
Make it easier to diagnose issues like #860
Test Plan
Ran locally and got the full error message when running pyre