@shunping shunping commented Dec 8, 2025

This change enables a retry mechanism for pytest to improve the developer experience and the reliability of our test/release pipelines.

Recently, we have observed an increase in flaky Python tests. These failures disrupt development and release validation, often requiring manual and time-consuming test reruns. Some post-commit workflows can take hours to complete, and multiple flaky tests can significantly delay getting a green build.

To address this, this change leverages the pytest-rerunfailures plugin with the following configuration:

  • --reruns 1: Allows for a single retry of a failed test. We want to keep this a small number so that we do not risk hiding a truly flaky test.
  • --reruns-delay 5: Introduces a 5-second delay before the retry.

This approach provides a balance between mitigating transient failures (such as gRPC "deadline exceeded" errors) and ensuring that consistently flaky tests are still surfaced and addressed.
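As a rough illustration of the behavior described above, the following sketch models a single rerun with a delay in plain Python. It is not the pytest-rerunfailures implementation; `run_with_reruns`, `make_transient`, and `always_fails` are hypothetical names introduced only for this example, and the delay is stubbed out so the sketch runs instantly.

```python
import time
from typing import Callable, Tuple


def run_with_reruns(
    test: Callable[[], None],
    reruns: int = 1,
    reruns_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Run `test`; after a failure, retry up to `reruns` more times.

    Returns (passed, attempts). Simplified model of `--reruns N
    --reruns-delay S`, not the plugin's actual code.
    """
    attempts = 0
    for attempt in range(reruns + 1):
        attempts += 1
        try:
            test()
            return True, attempts
        except Exception:
            # Wait only if another attempt is still allowed.
            if attempt < reruns:
                sleep(reruns_delay)
    return False, attempts


def make_transient(failures: int) -> Callable[[], None]:
    """Build a hypothetical test that fails `failures` times, then passes."""
    state = {"left": failures}

    def test() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("simulated transient failure")

    return test


def always_fails() -> None:
    raise AssertionError("simulated consistent failure")


def no_sleep(seconds: float) -> None:
    """Stand-in for time.sleep so the example does not actually wait."""


transient_result = run_with_reruns(make_transient(1), sleep=no_sleep)
twice_result = run_with_reruns(make_transient(2), sleep=no_sleep)
broken_result = run_with_reruns(always_fails, sleep=no_sleep)
```

Under this model, a test that fails once and then passes is reported as passing on the second attempt, while a test that fails twice in a row, or always, is still reported as a failure.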

fixes #37038

@gemini-code-assist (Contributor)

Summary of Changes

Hello @shunping, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a retry mechanism for pytest tests across various Python environments. By integrating the pytest-rerunfailures plugin and configuring it within tox.ini, the system will now automatically re-run a failed test once after a short delay. This enhancement is designed to improve the reliability of CI/CD pipelines by reducing failures caused by intermittent or flaky tests.

Highlights

  • New Dependency Added: The pytest-rerunfailures plugin has been added as a new dependency in sdks/python/setup.py. This plugin enables the automatic re-execution of failed pytest tests.
  • Pytest Retry Mechanism Implemented: The tox.ini configuration has been updated across multiple test environments to use the newly added pytest-rerunfailures plugin. Pytest commands will now automatically retry a failed test once (--reruns 1), with a 5-second delay before the retry (--reruns-delay 5).
  • Improved Test Stability: This change aims to enhance the stability of the test suite by mitigating issues caused by transient or flaky test failures, reducing the need for manual re-runs of CI/CD jobs.
@github-actions github-actions bot added the python label Dec 8, 2025
@shunping shunping marked this pull request as ready for review December 8, 2025 20:35
@shunping shunping self-assigned this Dec 8, 2025
@shunping shunping changed the title from "Add retry to pytest" to "Add retry to pytest for transient failure" Dec 8, 2025
shunping commented Dec 8, 2025

cc'ed @damccorm

  commands =
    python apache_beam/examples/complete/autocomplete_test.py
-   bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+   bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

Contributor

While I understand the pragmatic nature of this fix, I'm not sure it is the right approach. Doing this has a high likelihood of masking real failures, and generally I think we should be tracking down/fixing flakes. If there are things that innately mean we're going to have this level of flakiness, we're likely passing those on to users as well.

I'm open to more discussion here, but at a minimum I think if we're going to make a change like this it should be surfaced to the dev list, and probably should come with data that describes the problem and the reason we have this kind of flakiness. We could also evaluate alternate approaches there (for example, reducing the number of tests we run to reduce flakiness, only doing this for PR runs, etc...)
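One of the alternatives mentioned above, enabling retries only for PR runs, could look roughly like the sketch below. It assumes the tests run under GitHub Actions, where the `GITHUB_EVENT_NAME` environment variable identifies the triggering event; `rerun_args` is a hypothetical helper, and the actual Beam workflows may use different triggers or wiring.

```python
from typing import List, Mapping


def rerun_args(env: Mapping[str, str]) -> List[str]:
    """Return pytest rerun flags only for pull-request-triggered runs.

    Hypothetical helper: it assumes GITHUB_EVENT_NAME is set by GitHub
    Actions and that "pull_request" is the event of interest.
    """
    if env.get("GITHUB_EVENT_NAME") == "pull_request":
        return ["--reruns", "1", "--reruns-delay", "5"]
    # Post-commit and scheduled runs get no retries, so flakes stay visible.
    return []


pr_args = rerun_args({"GITHUB_EVENT_NAME": "pull_request"})
scheduled_args = rerun_args({"GITHUB_EVENT_NAME": "schedule"})
```

In a real setup the mapping passed in would be `os.environ`, and the returned list would be appended to the pytest command line. Scoping the flags this way keeps post-commit signal unfiltered while reducing manual reruns on PRs.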

@shunping shunping commented Dec 8, 2025

My rationale for this approach was based on PR #35915, which introduced pytest retry arguments:
https://github.com/apache/beam/pull/35915/files#diff-33fb11ecf72212eda83aaf8e36f94816ad447d8d12896dc3b5e5ac3727adbbd1R114
However, my investigation found that those arguments have no actual effect.

I misinterpreted the acceptance of that PR as a general agreement to use retries for flaky tests and tried to fix the retry here.

Contributor

That makes sense - my understanding from that PR is that those were going to be used for gRPC retries somehow, but I may be wrong (I probably shouldn't have merged with the meaningless options though).

Regardless, I'd be more open to this for a specific GHA suite (especially a precommit), but I think it warrants broader discussion/data first

  commands =
    python apache_beam/examples/complete/autocomplete_test.py
-   bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+   bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

Contributor

Nit: If we're going to do this for every invocation, we should include it as part of the run_pytest.sh script itself

github-actions bot commented Dec 8, 2025

Assigning reviewers:

R: @damccorm for label python.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Development

Successfully merging this pull request may close these issues.

[Bug]: pytest retry configuration is not effective