skip setproctitle in task_runner on Mac OS #45124

Draft
wants to merge 2 commits into
base: main
11 changes: 9 additions & 2 deletions airflow/dag_processing/manager.py
@@ -39,7 +39,6 @@
from typing import TYPE_CHECKING, Any, NamedTuple

import attrs
from setproctitle import setproctitle
from sqlalchemy import delete, select, update
from tabulate import tabulate
from uuid6 import uuid7
@@ -181,7 +180,15 @@ def _run_processor_manager(
# to iterate the child processes

set_new_process_group()
setproctitle("airflow scheduler -- DagFileProcessorManager")

# setproctitle causes issue on Mac OS: https://github.com/benoitc/gunicorn/issues/3021
os_type = sys.platform
if os_type == "darwin":
log.info("Mac OS detected, skipping setproctitle")
else:
from setproctitle import setproctitle
setproctitle("airflow scheduler -- DagFileProcessorManager")
Comment on lines +188 to +190
Contributor

@jlaneve jlaneve Dec 20, 2024

Suggested change
else:
from setproctitle import setproctitle
setproctitle("airflow scheduler -- DagFileProcessorManager")
else:
from setproctitle import setproctitle
setproctitle("airflow scheduler -- DagFileProcessorManager")

Contributor

This may actually be a GitHub bug: the "Files changed" tab shows the indentation as off, but the conversation / timeline view shows it as correct (in which case my suggested indentation change is unnecessary).

Screenshot 2024-12-20 at 6 01 53 PM

Member

I am in contact with the setproctitle maintainer as part of the "Airflow Beach Cleaning" project. I can ask him to comment.

Member

After a short discussion with @dvarrazzo: it's likely that dvarrazzo/py-setproctitle#144 is going to fix it (not yet released).

It would be great, though, to get some more details about those segfaults, @jaketf @ashb, when you see them happening again.
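
For reference, a minimal sketch of how the platform check could live in one helper shared by both call sites in this PR. `set_process_title` is a hypothetical name (not an existing Airflow function), the `platform` parameter exists only to make the check testable, and the `ImportError` fallback is an added assumption. It logs at debug level, as suggested in the task_runner.py review comment.

```python
import logging
import sys

log = logging.getLogger(__name__)


def set_process_title(title: str, platform: str = sys.platform) -> bool:
    """Set the process title unless running on macOS.

    Hypothetical helper for illustration only. Returns True if the
    title was set and False if the call was skipped.
    """
    if platform == "darwin":
        # Skip on macOS, mirroring the check in this PR.
        log.debug("Mac OS detected, skipping setproctitle")
        return False
    try:
        # Imported lazily so macOS never loads the extension module.
        from setproctitle import setproctitle
    except ImportError:
        # Assumption: treat a missing package as "skip", not as an error.
        log.debug("setproctitle is not installed, skipping")
        return False
    setproctitle(title)
    return True
```

Each call site would then reduce to a single call such as `set_process_title("airflow scheduler -- DagFileProcessorManager")`, which keeps the macOS workaround in one place until a fixed setproctitle release is available.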


reload_configuration_for_dag_processing()
processor_manager = DagFileProcessorManager(
dag_directory=dag_directory,
11 changes: 8 additions & 3 deletions task_sdk/src/airflow/sdk/execution_time/task_runner.py
@@ -214,9 +214,14 @@ def startup() -> tuple[RuntimeTaskInstance, Logger]:
msg = SUPERVISOR_COMMS.get_message()

if isinstance(msg, StartupDetails):
from setproctitle import setproctitle

setproctitle(f"airflow worker -- {msg.ti.id}")
# setproctitle causes issue on Mac OS: https://github.com/benoitc/gunicorn/issues/3021
os_type = sys.platform
if os_type == "darwin":
log.info("Mac OS detected, skipping setproctitle")
Member

Suggested change
log.info("Mac OS detected, skipping setproctitle")
log.debug("Mac OS detected, skipping setproctitle")

else:
from setproctitle import setproctitle

setproctitle(f"airflow worker -- {msg.ti.id}")

log = structlog.get_logger(logger_name="task")
# TODO: set the "magic loop" context vars for parsing
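
One thing that may be worth checking in this hunk: the macOS branch calls `log` before the context line `log = structlog.get_logger(logger_name="task")`. Assuming that assignment sits in the same function body (the indentation is not visible in this view), Python treats `log` as a local name throughout `startup()`, so the earlier call would fail. A minimal sketch of that behaviour, using the standard `logging` module and a hypothetical `startup_sketch` function in place of the real code:

```python
import logging

log = logging.getLogger("module")  # module-level logger, shadowed below


def startup_sketch(skip_title: bool) -> str:
    """Hypothetical stand-in for startup(), for illustration only."""
    if skip_title:
        # `log` is assigned later in this function, so Python compiles it
        # as a local name. Reading it here raises UnboundLocalError
        # instead of falling back to the module-level logger.
        log.debug("skipping setproctitle")
    log = logging.getLogger("task")
    return log.name
```

If that assumption holds, moving the logger assignment above the platform check, or using a differently named module-level logger, would avoid the error.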