Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do I catch errors in tasks started from fixtures and abort tests(s) using cancel scope? #120

Closed
gc-ss opened this issue May 3, 2021 · 12 comments · Fixed by #121
Closed

Comments

@gc-ss
Copy link

gc-ss commented May 3, 2021

Given a test file test_x.py:

# -*- coding: utf-8 -*-


async def test_fails_right_away(boom):
    ...


async def test_fails_needs_some_scopes(boom, depends_on_boom):
    ...

and a conftest.py:

# -*- coding: utf-8 -*-

from pytest_trio.enable_trio_mode import *

from functools import partial
import trio

import pytest

from loguru import logger


@pytest.fixture
def booms_ready():
    _booms_ready = trio.Event()
    return _booms_ready


async def run_depends_on_boom(booms_ready, task_status=trio.TASK_STATUS_IGNORED):

    logger.error("run_depends_on_boom:start")

    await booms_ready.wait()
    logger.error("run_depends_on_boom:waited. Starting…")

    task_status.started()


@pytest.fixture
async def depends_on_boom(nursery, booms_ready):
    logger.debug("depends_on_boom:start")
    _ = await nursery.start(partial(run_depends_on_boom, booms_ready))
    yield
    logger.debug("depends_on_boom:end")


async def run_boom(booms_ready, task_status=trio.TASK_STATUS_IGNORED):

    logger.error("run_boom:start")

    task_status.started()

    logger.error("run_boom:hitting-Exception")
    raise Exception()  # This will never get thrown in test_fails_needs_some_scopes. test_fails_right_away will catch this on KeyboardInterrupt

    booms_ready.set()


@pytest.fixture
async def boom(nursery, booms_ready):
    logger.debug("boom:start")
    _ = await nursery.start(partial(run_boom, booms_ready))
    yield
    logger.debug("boom:end")

The test suite never ends because booms_ready.set() is never reached. As a result, booms_ready.wait() blocks.

One way around this could be to detect that run_boom raised an Exception and then send a cancel scope to nursery?

If that's a solution, how do I go about it?

@njsmith
Copy link
Member

njsmith commented May 3, 2021

I'm not quite sure I'm following the details here, but this looks like a bug -- if a task in a fixture raises an exception then pytest-trio should automatically cancel everything and give you a traceback.

@njsmith
Copy link
Member

njsmith commented May 3, 2021

For reference, running with pytest -vs, I get:

test_ex.py::test_fails_needs_some_scopes
2021-05-03 01:04:42.227 | DEBUG    | conftest:boom:51 - boom:start
2021-05-03 01:04:42.227 | DEBUG    | conftest:depends_on_boom:31 - depends_on_boom:start
2021-05-03 01:04:42.227 | ERROR    | conftest:run_depends_on_boom:21 - run_depends_on_boom:start
2021-05-03 01:04:42.227 | ERROR    | conftest:run_boom:39 - run_boom:start
2021-05-03 01:04:42.227 | ERROR    | conftest:run_boom:43 - run_boom:hitting-Exception

...and then it hangs.

@njsmith
Copy link
Member

njsmith commented May 3, 2021

Here's a self-contained minimal version that still demonstrates the hang:

# test_gh_120.py

import trio
import pytest


@pytest.fixture
async def hanging_fixture():
    print("hanging_fixture:start")
    await trio.Event().wait()
    yield
    print("hanging_fixture:end")


@pytest.fixture
async def exploding_fixture():
    print("exploding_fixture:start")
    raise Exception
    yield
    print("exploding_fixture:end")


@pytest.mark.trio
async def test_fails_right_away(exploding_fixture):
    ...


@pytest.mark.trio
async def test_fails_needs_some_scopes(exploding_fixture, hanging_fixture):
    ...

Output:

❯ pytest -vs test_gh_120.py 
================================== test session starts ===================================
platform linux -- Python 3.8.6, pytest-5.4.2, py-1.8.1, pluggy-0.13.1 -- /home/njs/.user-python3.8/bin/python3
cachedir: .pytest_cache
rootdir: /home/njs/trio-notes/py120
plugins: trio-0.7.0, cov-2.8.1
collected 2 items                                                                        

test_gh_120.py::test_fails_right_away exploding_fixture:start
FAILED
test_gh_120.py::test_fails_needs_some_scopes hanging_fixture:start
exploding_fixture:start
[hangs]

@njsmith
Copy link
Member

njsmith commented May 3, 2021

Oh hmm, so apparently I set things up this way on purpose, at some point:

# If a fixture crashes, whether during setup, teardown, or in a background
# task at any other point, then we mark the whole test run as "crashed". When
# a run is "crashed", two things happen: (1) if any fixtures or the test
# itself haven't started yet, then we don't start them. (2) if the test is
# running, we cancel it. That's all. In particular, if a fixture has a
# background crash, we don't propagate that to any other fixtures, we still
# follow the normal teardown sequence, and so on – but since the test is
# cancelled, the teardown sequence should start immediately.

Maybe we should revisit that?

@gc-ss
Copy link
Author

gc-ss commented May 4, 2021

The logic in the statements line 121 - 128 sound really solid to me and I can't disagree with any of it.

In this specific case, the part if the test is running, we cancel it. seems the most relevant?

Lets see if, by cancel it, we can agree that would mean applying a cancel scope. (Let's keep aside for a minute, that right now,
no such cancel scope is explicitly applied - though it might something we work towards if we agree on this)

What seems to be happening here: the cancel scope never affects the trio.Event().wait() in hanging_fixture.

As a result, hanging_fixture never tears down - ergo the test never ends.

Hence, a fix would be to apply a cancel scope, perhaps explicitly on explicit dependencies like the one built in this case - we would have to, think of a way to either cancel the trio.Event().wait() or the fixture that contains it.

If this does not make sense, let's figure out where we differ.

@njsmith
Copy link
Member

njsmith commented May 4, 2021

What that comment is saying is: we only cancel the test function itself, the function called test_whatever. The fixtures have to finish before the test function can run. And one of the fixtures is hanging, so the test function never runs, so it never gets cancelled.

I think we can tweak this slightly: if the test hasn't started yet, we don't start it, and we treat it like it's failed and start unwinding fixtures backwards down the dependency tree. I guess I just never thought about this case :-)

@gc-ss
Copy link
Author

gc-ss commented May 4, 2021

And one of the fixtures is hanging, so the test function never runs, so it never gets cancelled.

Ah! This is what I was missing. Logically it makes sense.

I think we can tweak this slightly: if the test hasn't started yet, we don't start it, and we treat it like it's failed and start unwinding fixtures backwards down the dependency tree. I guess I just never thought about this case :-)

Makes sense!

and we treat it like it's failed

Sure. I mean if you wanted to be picky, we can mark it as skipped with a reason "Fixtures failed" or similar (because we don't have information whether the test actually failed). However, this could be something I add on later after I work with you to figure out the fundamental cancel mechanics.

So, would it be fair to say the "blocker" for this issue is that trio.Event().wait() blocks by design but there's no way to cancel it once it's clear to us that will never unblock (because the Fixture that's responsible for the unblock failed)?

@njsmith
Copy link
Member

njsmith commented May 4, 2021

So, would it be fair to say the "blocker" for this issue is that trio.Event().wait() blocks by design but there's no way to cancel it once it's clear to us that will never unblock (because the Fixture that's responsible for the unblock failed)?

No, we can cancel it. We just need to wire things up in pytest-trio so that it does cancel it under this circumstance :-).

I spent a few minutes looking at this yesterday and didn't get terribly far. The concurrent fixture setup code is hard to understand. I guess another way to solve this would be to remove the concurrent setup -- so then the first fixture would fail, and the second would never start (#57).

@gc-ss
Copy link
Author

gc-ss commented May 4, 2021

I can continue discussing this on #57 but needed clarity on 2 things:

I guess another way to solve this would be to remove the concurrent setup -- so then the first fixture would fail, and the second would never start

What if we could have both. Those tests that don't need concurrent fixtures (because of interdependencies) pass in a --no-concurrent-fixtures option to pytest-trio and in that case, fixtures run in sequence?

  1. Would running fixtures run in sequence actually help us in this specific case where the setup of one fixture depends on the setup of another but they are not torn down yet - they just enter their "yield" phase at which point the actual test runs (so the fixtures in theory haven't entered their teardown yet)?

@njsmith
Copy link
Member

njsmith commented May 6, 2021

Looked at this again and it wasn't as tricky as I thought. (At least, I think it's not! We'll see if anyone pokes any holes in it :-).) Anyway: #121

@gc-ss
Copy link
Author

gc-ss commented May 7, 2021

At least, I think it's not! We'll see if anyone pokes any holes in it :-)

Neither was I. I will try harder over the next few weeks though!

@gc-ss
Copy link
Author

gc-ss commented May 7, 2021

There's however something, that I think is being discussed at #110

This is my own take on it.

Given a test file test_x.py:

# -*- coding: utf-8 -*-

import trio

async def test_passes_shorter(boom):
    await trio.sleep(boom - 2)  # will pass as the nursery is cancelled before the exception is thrown

and a conftest.py:

# -*- coding: utf-8 -*-

from pytest_trio.enable_trio_mode import *

from functools import partial
import trio
import pytest
from loguru import logger

BLOCKING_PERIOD = 5

async def run_boom(task_status=trio.TASK_STATUS_IGNORED):

    logger.error("run_boom:start")

    task_status.started(BLOCKING_PERIOD)

    await trio.sleep(BLOCKING_PERIOD)
    logger.error("run_boom:hitting-Exception")
    raise Exception()  # This gets thrown as expected IFF this is still running (which is when the tests aren't done yet/still running themselves)

@pytest.fixture
async def boom(nursery):
    logger.debug("boom:start")
    time_start = trio.current_time()
    blocking_period = await nursery.start(partial(run_boom))
    yield blocking_period
    time_end = trio.current_time()
    logger.debug(f"boom:blocking_period: {blocking_period} -> {time_end - time_start}")
    logger.debug("boom:end")

The test will always pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants