Inside-out guest mode #1652

Open
wants to merge 13 commits into
base: main

oremanj
Member

@oremanj oremanj commented Jun 28, 2020

This PR adds become_guest_for(), which moves an existing Trio run to be the guest of a newly-started host loop. When the host loop exits, the Trio run continues on, as just a normal Trio run again.

I'm planning to support this in trio-asyncio. It also just seems like a good tool to have in our toolbox, for cases where you want to wrap an existing some-other-loop framework in a Trio run so you can do some Trio things in it. For example, we seem to regularly get people in gitter asking about fastapi, and I think this might be a useful building block for a user-friendly story there.

@oremanj oremanj requested a review from njsmith June 28, 2020 10:27
@codecov

codecov bot commented Jun 28, 2020

Codecov Report

Merging #1652 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           master    #1652    +/-   ##
========================================
  Coverage   99.61%   99.62%            
========================================
  Files         115      115            
  Lines       14445    14765   +320     
  Branches     1106     1138    +32     
========================================
+ Hits        14389    14709   +320     
  Misses         41       41            
  Partials       15       15            
| Impacted Files | Coverage Δ |
| --- | --- |
| trio/_core/__init__.py | 100.00% <ø> (ø) |
| trio/lowlevel.py | 100.00% <ø> (ø) |
| trio/_core/_run.py | 100.00% <100.00%> (ø) |
| trio/_core/_thread_cache.py | 100.00% <100.00%> (ø) |
| trio/_core/_traps.py | 100.00% <100.00%> (ø) |
| trio/_core/tests/test_guest_mode.py | 100.00% <100.00%> (ø) |
| trio/_core/tests/test_thread_cache.py | 100.00% <100.00%> (ø) |

Member

@njsmith njsmith left a comment

This is kind of ridiculous, but also kind of ridiculously cool. There is a bit more implementation complexity than I really love, but it seems basically clean and correct AFAICT.

I'd like to understand the use cases a bit better, for when you'd use this versus regular guest mode.

You mentioned wanting to use this for trio-asyncio – what are you thinking there?

Do we have some good examples of when each mode is appropriate?

If we're serious that trio.run pretty much has to live as long as the host loop, then should inside-out mode be the only mode?

try:
    deliver(result)
except KillThisThread:
    return
Member

Is there a deadlock hazard here too?

Member Author

Yep, this is just as dangerous as it was to call sys.exit() or raise an exception previously. It's used only in test_race_between_idle_exit_and_job_assignment, which used to call sys.exit().

@oremanj
Member Author

oremanj commented Jul 6, 2020

> I'd like to understand the use cases a bit better, for when you'd use this versus regular guest mode.

With regular guest mode, the Trio run is nested entirely within the foreign loop's lifetime. Trio starts after the foreign loop does, and must end before. If the foreign loop has a concept of cancellation, then you might need to write code to cancel the Trio run when the foreign loop is cancelled, and make sure the foreign loop doesn't exit until the Trio run does. (For example, on Qt this involves hooking up 'lastWindowClosed' to 'main_cancel_scope.cancel', and disabling 'quitOnLastWindowClosed'.)

With inside-out guest mode, the foreign loop is nested entirely within the Trio run's lifetime. The foreign loop starts after the Trio run does, and the Trio run won't end until the foreign loop has stopped. You might need to write code to propagate a Trio cancellation into the foreign loop.

If you want to run a foreign loop briefly inside a larger Trio program, only inside-out guest mode is suitable. If you want to run a Trio run briefly inside a larger foreign-loop program, only regular guest mode is suitable. (One example of the latter might be IPython in autoawait-trio mode, to avoid blocking its own asyncio loop.) But in most cases the two lifetimes are approximately the same.

I think one of the biggest benefits of inside-out guest mode is that you don't have to figure out how to keep the foreign loop running after it "wants to" stop, if the Trio run isn't done yet. Imagine you have some asyncio-based application with an opaque synchronous entry point. Maybe that entry point calls loop.run_until_complete() on several different coroutines in a row, and then calls loop.close() at the end. Suppose you want to be able to use Trio in some functions called by that application. With regular guest mode, it's challenging (maybe even impossible) to make sure the Trio run completes before the loop is destroyed. With inside-out guest mode, you can get by with as little as

import asyncio

import trio

def entrypoint_without_trio(): ...  # from the existing app

def entrypoint_with_trio():
    def run_child_host(resume_trio_as_guest):
        # This assumes the app uses the existing asyncio event loop.
        # If it creates a new loop, you have to arrange for resume_trio_as_guest
        # to be called after that loop exists. That might be doable with an on-startup
        # hook or similar provided by your app.
        loop = asyncio.get_event_loop()
        resume_trio_as_guest(
            run_sync_soon_threadsafe=loop.call_soon_threadsafe,
            run_sync_soon_not_threadsafe=loop.call_soon,
        )
        return entrypoint_without_trio()
    return trio.run(
        trio.lowlevel.become_guest_for,
        run_child_host,
        lambda _: None,  # no cancellation propagation in this simple example
    )
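For concreteness, here is a minimal stdlib-only sketch of the kind of opaque synchronous entry point described above. The `step` coroutine, the `log` list, and the explicit `loop` parameter are hypothetical, added only to make the sketch observable; a real application would have its own signature. The point is that the loop stops between the `run_until_complete()` calls and is closed whenever the application decides, with no hook for a guest that is still running.

```python
import asyncio


async def step(name, log):
    # Stand-in for one phase of the application's work.
    await asyncio.sleep(0)
    log.append(name)


def entrypoint_without_trio(loop, log):
    # Several separate run_until_complete() calls, then an unconditional
    # close(): nothing here waits for anything other than its own coroutines.
    try:
        loop.run_until_complete(step("startup", log))
        loop.run_until_complete(step("serve", log))
        loop.run_until_complete(step("shutdown", log))
    finally:
        loop.close()
    return log


loop = asyncio.new_event_loop()
log = entrypoint_without_trio(loop, [])
```

With regular guest mode, a Trio run attached to `loop` would have to finish before the `finally:` clause runs, and this entry point offers no way to arrange that.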

> You mentioned wanting to use this for trio-asyncio – what are you thinking there?

I think we should replace the current "SyncTrioEventLoop" with a wrapper like the above (but with better cancellation propagation so Ctrl+C works), and advertise it as the way to use pockets of Trio in a mostly-asyncio-based application. Maybe provide some wrappers for specific applications people like, such as fastapi. And use it for our testsuite checks.

> If we're serious that trio.run pretty much has to live as long as the host loop, then should inside-out mode be the only mode?

Regular guest mode has the opposite problem -- the host loop has to live at least as long as the Trio run. In some situations it's easy to ensure that, and in others it's difficult. Inside-out guest mode lets you trade that problem for its opposite (ensuring the Trio run lives at least as long as the host loop), which is trivial (because the host loop keeps a task alive and the Trio run doesn't end until all tasks exit). In cases where the lifetime challenges of regular guest mode are solvable, I think it's easier to reason about and easier to write the glue code for, which argues for keeping it even though it's maybe a bit awkward API-wise to support both approaches.

del fn
del deliver
del result
Member Author

The lack of this del was keeping guest_state alive (via traceback locals) past the gc_collect_harder() in some of the internal error tests, and made the corresponding errors on destruction attach themselves to unrelated tests.
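The effect can be reproduced in miniature with the standard library alone. This is only an analogy, not the code from this PR: a generator stands in for a cached worker thread parked between jobs, and `Payload`, `make_job`, and `payload_survives` are hypothetical names. It shows that locals left over from the previous job keep their objects alive until the next job overwrites them, unless they are deleted explicitly.

```python
import gc
import weakref


class Payload:
    """Stand-in for an object that should be freed promptly after a job."""


def worker(drop_refs):
    # Parks at `yield` between jobs, like an idle thread awaiting work.
    while True:
        fn = yield
        result = fn()
        if drop_refs:
            # Without this, `fn` and `result` stay in the parked frame's
            # locals until the next job replaces them.
            del fn, result


def make_job():
    payload = Payload()
    return (lambda: payload), weakref.ref(payload)


def payload_survives(drop_refs):
    w = worker(drop_refs)
    next(w)  # advance to the first `yield`
    job, ref = make_job()
    w.send(job)  # run one job; the worker then parks again
    del job
    gc.collect()
    return ref() is not None
```

Here `payload_survives(drop_refs=False)` reports that the payload is still reachable through the parked frame, while `payload_survives(drop_refs=True)` reports that it has been collected.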

@njsmith
Member

njsmith commented Dec 27, 2020

So my basic hesitation here is that the API makes it seem like temporarily switching to another host loop is something you can abstract away behind some blackbox interface. But, this is inherently not true, because "the host loop" is a piece of global state. So you could have two libraries that each individually work fine, but when you try to use them in the same app then it doesn't work. In current guest mode, of course, you have the same problem, but the API makes that obvious up front, because there's no way to even try to specify two different host loops at the same time.

I'm not 100% sure what to do with this hesitation though. Which is why this reply has been sitting in my browser half-written and unposted for literally months now :-). This came up in a much more concrete case today in chat, so that's nudging me to actually hit post: https://gitter.im/python-trio/general?at=5fe3c889acd1e516f8aedc53

@oremanj replied there:

> fwiw I think most of the benefit of inside-out guest mode is in the flexibility it offers around startup/teardown order -- which is an important part of composability, but definitely not the whole story

I can see that. Specifically inside-out guest mode lets you delay starting the host loop until after trio is already running, and to continue running trio after the host loop has exited.

For the startup part: I guess the main idea is that you can e.g. wait until someone tries to use asyncio before transitioning to guest mode with asyncio as the host? Intuitively, it seems like this is never technically necessary, but just convenient: you can always work around it by pushing the asyncio startup out to when you start trio, right? Of course, this is convenient, but my point is that this is about ergonomics, not about technical capability -- does that sound right? And on the ergonomics, the question is exactly the one at the top of this post, about whether it's best to have some slick magic that breaks down in surprising ways when you push on it, or a more explicit API that exposes the actual limitations of the system.

For the shutdown part: there's a real technical issue here, where some host loops might insist on shutting down when they feel like it, not when trio's done. To some extent, this is always going to suck: even if we make it possible for trio to survive the host loop's unexpected death, the user's trio code that's trying to interact with the host loop is probably going to explode messily, and there's not much we can do about that. So I feel like whenever possible, trio/host integration libraries should try really really hard to make sure the host doesn't exit until after trio finishes. But that said, it would be possible for regular outside-in guest mode to survive host death by basically switching back to regular trio.run. That's pretty simple. (In fact it's even technically possible right now by writing sufficiently clever host callbacks, but we could make it simpler.)

E.g., maybe start_guest_run should return a magic sync callable, which if invoked immediately iterates the rest of the trio run until exhausted, with the idea that you would call this after your host loop disappears to finish cleaning things up.

Would this give us effectively the same lifetime flexibility as inside-out guest mode, with much less internal complexity?
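A toy model of that idea, using only the standard library, might look like the following. Nothing here is Trio's API: `start_toy_guest_run` is a hypothetical stand-in in which the guest is a plain generator (each `next()` is one scheduler tick) and the host is a `deque` of callbacks. It is meant only to show the shape of a returned callable that drains the rest of the run once the host has gone away.

```python
from collections import deque


def start_toy_guest_run(guest, run_sync_soon, done_callback):
    state = {"done": False}

    def step():
        # Run one tick of the guest; report the result when it finishes.
        try:
            next(guest)
        except StopIteration as exc:
            state["done"] = True
            done_callback(exc.value)

    def tick():
        # Normal path: the host calls this, and it reschedules itself.
        if state["done"]:
            return
        step()
        if not state["done"]:
            run_sync_soon(tick)

    def finish_synchronously():
        # Fallback path: drive the remaining ticks inline, with no host.
        while not state["done"]:
            step()

    run_sync_soon(tick)
    return finish_synchronously


def guest_task(log):
    for i in range(5):
        log.append(i)
        yield
    return "finished"


host_queue = deque()
log, results = [], []
finish = start_toy_guest_run(guest_task(log), host_queue.append, results.append)

# The host processes three callbacks, then exits while the guest is mid-run.
for _ in range(3):
    host_queue.popleft()()
host_queue.clear()  # pending callbacks are dropped along with the host

finish()  # the guest completes without the host
```

In this model the guest's cleanup still runs after the host disappears, but anything in the guest that tried to talk to the host at that point would fail, which is the caveat raised above.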

@oremanj
Member Author

oremanj commented May 13, 2023

Sorry for the long delay here. I had somehow gotten the message from the most recent comment here of "BDFL says we aren't going to do this", and got discouraged and dropped it, but on another look that's not actually what you said. I'm revisiting this because I'm revisiting the "can we do trio-asyncio but better" question.

I'm not a huge fan of the TrioEventLoop approach taken by trio-asyncio currently, where the asyncio event loop is secretly a Trio task. It works, but suboptimally. It doesn't scale as well as a native aio loop to workloads with lots of file descriptors (each FD needs a Trio backing task to integrate it with the Trio IO manager, and each transition between waiting for readability vs not kills the backing task or creates a new one); you can't use more-performant aio loops like uvloop; call_soon is ~10x slower than in native asyncio and it's not clear to me how to make it faster; and the monkeypatching of the event loop policy to make everything work is brittle and confusing. We now have guest mode, which we didn't have when trio-asyncio was first being developed. I would much rather use that.

Without inside-out guest mode, I don't see how to implement the first async with aio_mode: in a program that started with trio.run(). With inside-out guest mode, it's comparatively easy: start a Trio system task that does await trio.lowlevel.become_guest_for(lambda: asyncio.run(something)), and make the cancellation handler propagate cancellation to the something.

> So my basic hesitation here is that the API makes it seem like temporarily switching to another host loop is something you can abstract away behind some blackbox interface. But, this is inherently not true, because "the host loop" is a piece of global state. So you could have two libraries that each individually work fine, but when you try to use them in the same app then it doesn't work.

I definitely agree that this is a wart with the API. I think it can be sufficiently mitigated by some combination of documentation + a really good/explanatory error message if become_guest_for is invoked when Trio is already in guest mode. I don't want to sacrifice useful cases where there is only one host loop just to mitigate a composability problem in the comparatively rare case where the user tries to designate more than one of them.

> E.g., maybe start_guest_run should return a magic sync callable, which if invoked immediately iterates the rest of the trio run until exhausted, with the idea that you would call this after your host loop disappears to finish cleaning things up.
>
> Would this give us effectively the same lifetime flexibility as inside-out guest mode, with much less internal complexity?

I think this is plausibly a good thing to do independently of what we decide on inside-out guest mode. For an application framework that can control the top-level run() call, I think the "start_guest_run() returns a magic callable" approach works just as well as inside-out guest mode for solving the shutdown order issues. trio-asyncio isn't an application framework, though; ideally it lets you use an asyncio library in just one corner of a Trio program, or vice versa. (The Trio-library-in-asyncio-program case is less well-supported, but the abstract desirability of both situations seems symmetric to me.)

@oremanj
Member Author

oremanj commented May 13, 2023

Well, asyncio guest mode turns out to be possible (https://github.com/oremanj/aioguest) so I guess it's a question of which approach is most palatable. aioguest has some fairly nasty monkeypatching, but I guess at least it's self-contained?

@CoolCat467
Member

> E.g., maybe start_guest_run should return a magic sync callable, which if invoked immediately iterates the rest of the trio run until exhausted, with the idea that you would call this after your host loop disappears to finish cleaning things up.

This sort of functionality would have been invaluable in a project I was working on recently. In my Trio-on-tkinter guest mode runner, we have to override Tk's window close event handler so that it tries to cancel the root Trio task and waits for Trio to shut down properly before we close the window.

3 participants