Toward a better signal handling #1705
Comments
The issue raised on #1667 wasn't entirely clear on what the problem was, but this explanation has given me a lot more context, so for starters: thank you. On pass through of signals, on that I'm agreed - though it raises some questions: If I, as a user of nodemon, start nodemon on a process, then hit ctrl-c - it should send a SIGINT signal to the subprocess but it should also actually send the signal to nodemon - and thus nodemon should quit. Going by your description of the issue, it suggests that if the subprocess doesn't quit, neither should nodemon - which feels…odd (though maybe entirely correct, IDK). On sending the SIGUSR2 to the subprocess, this is entirely configurable by the user. Since there's a limited number of signals, I had to pick one, and this one was … the best from the available choice. IIRC, sending a SIGHUP to the subprocess didn't always cause the subprocess to stop (or there was higher chance of the user-code intercepting it and handling), so I settled on SIGUSR2 - but I settled nearly a decade ago, so my memory is hazy on exactly why. Again, this is user configurable, so I don't see it as being a problem. Finally, you mention a timeout config. In what respect? It should default to a timeout before restarting only to make the default usage as simple as possible (i.e. run the thing, it works, focus on dev), but the user can configure a delay before sending the restart signal (currently). But you might be thinking of another use for timeout? |
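For reference, a child script that wants to clean up before a SIGUSR2-triggered restart can handle the signal once, do its cleanup, then re-send the same signal to itself so the process actually terminates. The following is a minimal sketch along the lines of the pattern in the nodemon README; `installRestartHook` and `cleanup` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper, not part of nodemon.
// `proc` is expected to look like Node's global `process` object.
// `signal` defaults to SIGUSR2, the default restart signal discussed above.
function installRestartHook(proc, cleanup, signal = 'SIGUSR2') {
  // `once` removes the listener after the first delivery, so the re-sent
  // signal below falls through to the default action (terminate).
  proc.once(signal, () => {
    cleanup(() => proc.kill(proc.pid, signal));
  });
}

// Possible usage in a server script (assumes `server` exists):
// installRestartHook(process, (done) => server.close(done));
```

If the restart signal is configured to something else, the same helper can be called with that signal name as the third argument.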
From documentation:
PS: Thank you for your quick response! |
|
@concatime
|
I'm trying to get my head fully around the problem this is trying to address - but it seems that it is either: about the signal forwarding from nodemon to the sub-process, or it's about all signals but gently brushing over how signals are handled in a restart request (this is the bulk of the complexity in nodemon, so expect surprises here). On this example:
Currently nodemon will listen for a SIGHUP to perform a triggered reload (as per unix daemons), but with the scheme above it would cause nodemon to exit. Re - forwarding signals, this is the current logic with the exception of There's also logic that states (in nodemon) that for the full process tree, each sub-process (of the user code) is also sent the signal (which is why you're seeing the spammy waiting message - which I think should be moved to verbose messaging instead of the standard log). Only after those sub-sub-processes have quit, is the final signal sent to the user process - but this only happens on linux - on a mac the OS handles it differently - which is nice and confusing. Currently there's no timeout for this - it simply waits until all the user sub-processes have finished. I'm sure there's more to consider (like Windows is entirely ignored in this logic, understandably because windows is mostly handled through a "simple" On last point, it seems that the * Side note: I suspect, but am not 100% certain, that sending a |
Let’s start with
However, the second role applies only to daemonized processes because daemons are detached from their terminal. So the system will never send this signal to them. In our case, nodemon is NOT a daemon but rather a foreground process, so it should not interpret Now,
Also, after thinking about it, it’s useless for nodemon to listen to @remy, I understand. Changing the One last thing. The command “rs” should be renamed to “rl” (reload). Strictly speaking, a restart is sending SIGTERM, SIGKILL after timeout if child does not exit, then start the child (systemd). Version 2:
|
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automatically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
So… Firstly, there's no real reason to justify changing the Coming back to the Which begs the question: how do you now programmatically reload the subprocess whilst nodemon is running? On |
I've also documented the current state of nodemon and how signals are handled when sent both to nodemon and to the subprocess: https://docs.google.com/spreadsheets/d/1gFxwZqv1cBPJpiqeVtYh0MIC30RJlx6g0VF-2B8pQDE/edit?usp=sharing |
This doesn't cover the common case where the subprocess of nodemon then spawns its own processes (with |
This is possibly the better justification for a change to the system (as tmux is a common tool):
|
I've just closed #1667 to continue the conversation here, and want to thank @axxie for starting this initial discussion. I hope it's clear, but my hesitation is based entirely on trying to keep a stable system (nodemon is downloaded some 500,000 times a day and used in over a million repositories - so I want to be careful about these changes - something I've learnt from being burnt in the past!). After documenting the current state of signal handling and specifically seeing the tmux example (which I think I missed originally), I'm moving towards the following:
The code that tries to shutdown the sub-subprocesses needs review, currently this is the situation:
This spamming logic isn't for signal handling but because subprocesses would get stuck (particularly when there was a broad tree of spawned processes) and nodemon had no way to know whether the spawned process was ignoring the signal or if it was stuck somehow. A simple search shows that nodemon leaving a process running after exit or after it thinks it has restarted the subprocess is very much an issue for devs - and it's hard to get a good handle on the problem because it ranges from Windows to linux to alpine (where there's no -- Finally, on reflection, I'm not sure this logic is right:
It means that if the subprocess intentionally handles Final thought and question: should this be a Potential solution: go back to the original issue that caused the change, and understand the context it was created in. |
The SIGINT in the
That was my first proposition. Good choice as long as |
Is that normal behaviour though - in general? (I hadn't read it as a second SIGINT which makes sense reading now). Are there any examples of this type of system (double tap to end). nodemon -> "no demon"? Or "node mon"? Or "nodemon" (like pokemon)? 😆 |
I don’t remember where I’ve seen the two consecutive SIGINT, but I know it’s quite useful to debug a program not handling correctly SIGINT or to test/bypass the graceful shutdown. Say you interpret SIGINT as graceful shutdown (just like SIGTERM), but during the shutdown, you have a bug or blocking code. You can then forcefully quit by sending another SIGINT (^c) and fix your code. |
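The double ^C behaviour described here could be sketched roughly as follows in Node.js. `createInterruptHandler`, `gracefulShutdown` and `forceQuit` are hypothetical names for illustration only; this is not how nodemon currently behaves.

```javascript
// Hypothetical sketch: the first interrupt starts a graceful shutdown,
// any later interrupt forces the process to quit.
function createInterruptHandler(gracefulShutdown, forceQuit) {
  let shuttingDown = false;
  return function onInterrupt() {
    if (shuttingDown) {
      // Graceful shutdown is stuck or too slow: bail out now.
      forceQuit();
      return;
    }
    shuttingDown = true;
    gracefulShutdown();
  };
}

// Possible usage (assumes a `closeServer` function exists):
// process.on('SIGINT', createInterruptHandler(
//   closeServer,
//   () => process.exit(130) // 128 + 2, the conventional exit code for SIGINT
// ));
```

Keeping the state inside a closure means the handler can be exercised without sending real signals.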
This issue has been automatically marked as idle and stale because it hasn't had any recent activity. It will be automatically closed if no further activity occurs. If you think this is wrong, or the problem still persists, just pop a reply in the comments and @remy will (try!) to follow up. |
I just read through this issue out of pure interest. You are both very knowledgeable and well-spoken. Keep up the good work! @remy @concatime |
Just a data point here: I remember using a program (though I can't remember what it is) that printed out something like "attempting to shut down gracefully, press Ctrl+C again to force quit" when I pressed Ctrl+C. And I have used software before that didn't seem to correctly shut down when I press Ctrl+C, and in those cases, I usually mash Ctrl+C until the software seems to listen (I can't be the only one who does this). So IMO the suggestion of first sending SIGINT (on first Ctrl+C), then SIGKILL (on second or maybe even third Ctrl+C) makes complete sense. |
@bduffany Docker Compose does this, and IMO it's a great escape hatch for early development, before you have nailed down your process handling. And anyway, I got to this issue because I too was wondering why my signal handling logic wasn't working behind |
nodemon -v: 2.0.3
node -v: v13.13.0
npx nodemon
This is not a bug, but rather a discussion on how to handle signals.
Heavily related to #1667.
Actual behaviour
Currently, it’s quite a mess ( ͡° ͜ʖ ͡°).
Basics
to close, to end, to exit, to quit, to stop, to terminate, to shut down, a bit confusing isn’t it?
Let me quote the official documentation.
SIGTERM:
SIGQUIT:
SIGHUP:
Expected behaviour
This is rather debatable, but here I go.
When nodemon receives SIGINT or SIGHUP, it should pass the signal to the child process, and if the process does not quit after X seconds, nodemon should ignore the signal. This way, the child process is free to do whatever it wants (e.g. reload config or gracefully quit). If the child stops before the timeout, nodemon should also stop. This should solve #1661. Also, if another SIGINT is received before the timeout, SIGKILL is sent to the child. This way, the user can forcefully quit with two ^C.
When nodemon receives SIGTERM, it should pass the signal to the child process, and if the process does not quit after X seconds, send SIGKILL to the child, then quit nodemon. This is the behaviour of systemd and it simplifies integration with the system’s init.
When nodemon receives SIGQUIT, it should immediately SIGKILL the child, then quit nodemon. This way, no cleaning is done by the child.
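To make the three rules above easier to compare, here is a minimal sketch of them as a pure decision function. Everything in it is hypothetical: `planSignal`, the shape of the returned object and the `pending` argument are invented for illustration, and the sketch reads "SIGINT received before timeout" as "a second SIGINT while the first is still pending".

```javascript
// Hypothetical sketch of the proposed policy; not nodemon's actual code.
// signal:    the signal nodemon just received
// pending:   the signal nodemon is still waiting on, or null
// timeoutMs: stands in for the "X seconds" of the proposal
// Returns what to forward to the child, how long to wait, and what to do
// if the child is still alive when the wait is over.
function planSignal(signal, pending, timeoutMs) {
  // Second ^C while the first is still pending: force-kill the child.
  if (signal === 'SIGINT' && pending === 'SIGINT') {
    return { forward: 'SIGKILL', waitMs: 0, onTimeout: 'none' };
  }
  switch (signal) {
    case 'SIGINT':
    case 'SIGHUP':
      // Forward, and let nodemon carry on if the child chooses to stay alive.
      return { forward: signal, waitMs: timeoutMs, onTimeout: 'ignore' };
    case 'SIGTERM':
      // Forward, then escalate to SIGKILL and quit, as systemd does.
      return { forward: 'SIGTERM', waitMs: timeoutMs, onTimeout: 'SIGKILL' };
    case 'SIGQUIT':
      // No grace period: the child gets no chance to clean up.
      return { forward: 'SIGKILL', waitMs: 0, onTimeout: 'none' };
    default:
      return null; // signals not covered by the proposal
  }
}
```

Keeping the policy separate from the code that actually calls `kill` would make it possible to test each case without spawning processes.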
Also, the restart should be interpreted more as a reload. This way, we can specify a reloadSignal (set to SIGHUP by default) which is sent each time a change is detected by watch or the user types restart. nodemon should not use (implicitly at least) SIGUSR2 since it’s a user-defined condition (aka. reserved for the child process for other tasks).
We should also add a timeout config to set the value used by nodemon.
WDYT?
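A minimal sketch of how the proposed reloadSignal and timeout options might be resolved and used. None of this exists in nodemon today: `proposedDefaults`, `resolveSignalOptions` and `reloadChild` are hypothetical names, and the 10-second timeout is an arbitrary placeholder since the proposal leaves X unspecified.

```javascript
// Hypothetical defaults for the two options proposed above.
const proposedDefaults = {
  reloadSignal: 'SIGHUP', // default suggested in the proposal
  timeout: 10,            // seconds; placeholder value, not from the proposal
};

// Merge user-supplied options over the defaults.
function resolveSignalOptions(userOptions = {}) {
  return { ...proposedDefaults, ...userOptions };
}

// `child` is anything with a kill(signal) method, such as the ChildProcess
// object returned by child_process.spawn().
function reloadChild(child, options) {
  return child.kill(options.reloadSignal);
}
```

With this shape, a user who prefers the current behaviour could pass `{ reloadSignal: 'SIGUSR2' }` and nothing else would need to change.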