1 change: 1 addition & 0 deletions sdks/python/setup.py
@@ -448,6 +448,7 @@ def get_portability_package_data():
'pytest>=7.1.2,<9.0',
'pytest-xdist>=2.5.0,<4',
'pytest-timeout>=2.1.0,<3',
+ 'pytest-rerunfailures>=16.1.0',
'scikit-learn>=0.20.0,<1.8.0',
'sqlalchemy>=1.3,<3.0',
'psycopg2-binary>=2.8.5,<3.0',
28 changes: 14 additions & 14 deletions sdks/python/tox.ini
@@ -77,7 +77,7 @@ deps =
numpy==1.26.4
commands =
python apache_beam/examples/complete/autocomplete_test.py
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

[testenv:py{310,311,312,313}-macos]
commands_pre =
@@ -87,12 +87,12 @@ commands_pre =
bash {toxinidir}/scripts/run_tox_cleanup.sh
commands =
python apache_beam/examples/complete/autocomplete_test.py
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'
Contributor

While I understand the pragmatic nature of this fix, I'm not sure it is the right approach. Retrying has a high likelihood of masking real failures, and in general I think we should be tracking down and fixing flakes. If something innate to the code causes this level of flakiness, we're likely passing it on to users as well.

I'm open to more discussion here, but at a minimum I think a change like this should be surfaced to the dev list, and it should probably come with data that describes the problem and explains why we have this kind of flakiness. We could also evaluate alternative approaches there (for example, running fewer tests to reduce flakiness, or enabling reruns only for PR runs).
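
As an illustration of the "only for PR runs" alternative, here is a minimal sketch, not code from this PR. The helper name `rerun_flags_for_event` is hypothetical; it assumes the suite runs under GitHub Actions, which sets `GITHUB_EVENT_NAME`, and it reuses the `--reruns` and `--reruns-delay` options from pytest-rerunfailures that appear in this diff.

```shell
# Hypothetical helper (not part of the Beam repository): print the rerun flags
# only when the triggering event is a pull request, so scheduled and
# post-merge runs still surface flaky tests as failures.
rerun_flags_for_event() {
  # $1: event name; falls back to GITHUB_EVENT_NAME, which GitHub Actions sets.
  local event="${1:-${GITHUB_EVENT_NAME:-}}"
  case "$event" in
    pull_request|pull_request_target)
      echo "--reruns 1 --reruns-delay 5"
      ;;
    *)
      # Any other event (schedule, push, or no event at all): no reruns.
      echo ""
      ;;
  esac
}
```

A caller could pass the result along as the extra pytest arguments, for example `"$(rerun_flags_for_event)"`, leaving non-PR runs unchanged.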

Collaborator Author

@shunping, Dec 8, 2025

My rationale for this approach was based on PR #35915, which introduced pytest retry arguments:
https://github.com/apache/beam/pull/35915/files#diff-33fb11ecf72212eda83aaf8e36f94816ad447d8d12896dc3b5e5ac3727adbbd1R114
After investigating, though, I found that those arguments have no actual effect.

I misinterpreted the acceptance of that PR as general agreement to use retries for flaky tests, and tried to fix the retry behavior here.

Contributor

That makes sense. My understanding from that PR was that those options were going to be used for gRPC retries somehow, but I may be wrong (I probably shouldn't have merged with the meaningless options, though).

Regardless, I'd be more open to this for a specific GHA suite (especially a precommit), but I think it warrants broader discussion and data first.


[testenv:py{310,311,312,313}-win]
commands =
python apache_beam/examples/complete/autocomplete_test.py
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'
Contributor

Nit: If we're going to do this for every invocation, we should include it as part of the run_pytest.sh script itself.
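
One possible reading of this suggestion, as a minimal sketch: the script appends the rerun flags itself, so no tox.ini entry has to repeat them. This is not the actual content of run_pytest.sh; the function name `build_pytest_args` and the variable `PYTEST_RERUN_ARGS` are hypothetical, and only the `--reruns` and `--reruns-delay` flags come from this diff.

```shell
# Hypothetical sketch, not the actual run_pytest.sh.
# Default rerun flags live in one place. Exporting PYTEST_RERUN_ARGS=""
# before calling the script would disable reruns (assumed variable name).
if [ -z "${PYTEST_RERUN_ARGS+x}" ]; then
  PYTEST_RERUN_ARGS="--reruns 1 --reruns-delay 5"
fi

build_pytest_args() {
  # $1: extra pytest arguments supplied by the tox environment (may be empty).
  local extra_args="${1:-}"
  if [ -n "$extra_args" ] && [ -n "$PYTEST_RERUN_ARGS" ]; then
    # Both present: join them with a single space.
    echo "$extra_args $PYTEST_RERUN_ARGS"
  else
    # At most one of the two is non-empty, so plain concatenation is enough.
    echo "${extra_args}${PYTEST_RERUN_ARGS}"
  fi
}
```

With something like this in place, the tox.ini commands could stay as they were before this change, and the coverage environment would keep passing only its `--cov` options.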

install_command = {envbindir}/python.exe {envbindir}/pip.exe install --retries 10 {opts} {packages}
list_dependencies_command = {envbindir}/python.exe {envbindir}/pip.exe freeze

@@ -101,7 +101,7 @@ list_dependencies_command = {envbindir}/python.exe {envbindir}/pip.exe freeze
extras = test,hadoop,gcp,interactive,dataframe,aws,azure
commands =
python apache_beam/examples/complete/autocomplete_test.py
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

[testenv:py{310,311}-ml]
# Don't set TMPDIR to avoid "AF_UNIX path too long" errors in certain tests.
@@ -114,7 +114,7 @@ extras = test,gcp,dataframe,ml_test
commands =
# Log tensorflow version for debugging
/bin/sh -c "pip freeze | grep -E tensorflow"
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

[testenv:py312-ml]
# many packages do not support py3.12
@@ -126,7 +126,7 @@ extras = test,gcp,dataframe,p312_ml_test
commands =
# Log tensorflow version for debugging
/bin/sh -c "pip freeze | grep -E tensorflow"
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

[testenv:py313-ml]
# many packages do not support py3.13, and datatables breaks after 3.12.
@@ -138,22 +138,22 @@ extras = test,gcp,dataframe,p313_ml_test
commands =
# Log tensorflow version for debugging
/bin/sh -c "pip freeze | grep -E tensorflow"
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'

[testenv:py{310,311,31,313}-dask]
extras = test,dask,dataframes
commands_pre =
pip install 'distributed>=2024.4.2' 'dask>=2024.4.2'
commands =
- bash {toxinidir}/scripts/run_pytest.sh {envname} {toxinidir}/apache_beam/runners/dask/
+ bash {toxinidir}/scripts/run_pytest.sh {envname} {toxinidir}/apache_beam/runners/dask/ '--reruns 1 --reruns-delay 5'

[testenv:py{310,311,312,313}-win-dask]
# use the tight range since the latest dask requires cloudpickle 3.0
commands_pre =
pip install 'distributed>=2024.4.2,<2024.9.0' 'dask>=2024.4.2,<2024.9.0'
commands =
python apache_beam/examples/complete/autocomplete_test.py
- bash {toxinidir}/scripts/run_pytest.sh {envname} {toxinidir}/apache_beam/runners/dask/
+ bash {toxinidir}/scripts/run_pytest.sh {envname} {toxinidir}/apache_beam/runners/dask/ '--reruns 1 --reruns-delay 5'
install_command = {envbindir}/python.exe {envbindir}/pip.exe install --retries 10 {opts} {packages}
list_dependencies_command = {envbindir}/python.exe {envbindir}/pip.exe freeze

@@ -175,7 +175,7 @@ setenv =
# NOTE: we could add ml_test to increase the collected code coverage metrics, but it would make the suite slower.
extras = test,hadoop,gcp,interactive,dataframe,aws,redis
commands =
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" "--cov-report=xml --cov=. --cov-append"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" "--cov-report=xml --cov=. --cov-append --reruns 1 --reruns-delay 5"

[testenv:lint]
# Don't set TMPDIR to avoid "AF_UNIX path too long" errors in pylint.
@@ -387,15 +387,15 @@ commands =
# Log pandas and numpy version for debugging
/bin/sh -c "pip freeze | grep -E '(pandas|numpy)'"
# Run all DataFrame API unit tests
- bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/dataframe'
+ bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/dataframe' '--reruns 1 --reruns-delay 5'

[testenv:py{310,311}-tft-{113,114}]
deps =
# Help pip resolve conflict with typing-extensions due to an old version of tensorflow https://github.com/apache/beam/issues/30852
113: pydantic<2.0
114: tensorflow_transform>=1.14.0,<1.15.0
commands =
- bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/ml/transforms apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py'
+ bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/ml/transforms apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py' '--reruns 1 --reruns-delay 5'

[testenv:py{310,311}-pytorch-{19,110,111,112,113}]
deps =
@@ -587,12 +587,12 @@ commands =
# Log aiplatform and its dependencies version for debugging
/bin/sh -c "pip freeze | grep -E tensorflow"
# Allow exit code 5 (no tests run) so that we can run this command safely on arbitrary subdirectories.
- bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/ml/transforms/embeddings'
+ bash {toxinidir}/scripts/run_pytest.sh {envname} 'apache_beam/ml/transforms/embeddings' '--reruns 1 --reruns-delay 5'

[testenv:py{310,312}-dill]
extras = test,dill
commands =
# Log dill version for debugging
/bin/sh -c "pip freeze | grep -E dill"
# Run all dill-specific tests
- bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}"
+ bash {toxinidir}/scripts/run_pytest.sh {envname} "{posargs}" '--reruns 1 --reruns-delay 5'