
Add graph-based pod stop #25169

Open · mheon wants to merge 2 commits into main from graph_stop
Conversation

mheon (Member) commented Jan 30, 2025

Implement a graph-based pod stop and use it by default, ensuring that containers stop in a dependency-based order. This prevents a race where application containers were stopped after the infra container and thus lost functional networking for the last seconds before they stopped, potentially causing unexpected application errors.

As a pleasant side-effect, make removing containers within a pod parallel, which should improve performance.

Full details in commit descriptions.
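Conceptually, the new stop path is an inward (reverse-topological) walk of the container dependency graph. Below is a minimal, self-contained sketch with hypothetical types, not libpod's actual API: a container is stopped only once nothing that depends on it is still running, so the infra container, which everything else depends on, goes down last.

package main

import "fmt"

// ctr is a hypothetical graph node; dependsOn points at the
// containers this one requires (e.g. app -> infra).
type ctr struct {
	id         string
	dependsOn  []*ctr
	dependents int // containers still depending on this one
}

func stopPod(ctrs []*ctr) {
	// Count dependents so a container is only stopped after
	// everything that needs it has already stopped.
	for _, c := range ctrs {
		for _, dep := range c.dependsOn {
			dep.dependents++
		}
	}
	// Start from the leaves: containers nothing depends on.
	var ready []*ctr
	for _, c := range ctrs {
		if c.dependents == 0 {
			ready = append(ready, c)
		}
	}
	for len(ready) > 0 {
		c := ready[0]
		ready = ready[1:]
		fmt.Println("stopping", c.id) // real code would call Stop()
		for _, dep := range c.dependsOn {
			dep.dependents--
			if dep.dependents == 0 {
				ready = append(ready, dep)
			}
		}
	}
}

func main() {
	infra := &ctr{id: "infra"}
	app1 := &ctr{id: "app1", dependsOn: []*ctr{infra}}
	app2 := &ctr{id: "app2", dependsOn: []*ctr{infra}}
	stopPod([]*ctr{infra, app1, app2})
}

Running this prints app1 and app2 before infra, matching the ordering the PR guarantees.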

Does this PR introduce a user-facing change?

Containers in pods are now stopped in order based on their dependencies, with the infra container being stopped last, preventing application containers from losing networking before they are stopped due to the infra container stopping prematurely.

openshift-ci bot (Contributor) commented Jan 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2025
@mheon mheon added the No New Tests Allow PR to proceed without adding regression tests label Jan 30, 2025
mheon (Member, Author) commented Jan 30, 2025

Tagging No New Tests, as the existing pod stop/remove tests should exercise this. It would be nice to test the ordering aspect, but I'm not sure that's doable without being very racy.

Luap99 (Member) commented Jan 30, 2025

For tests, one thing we could do is check the FinishedAt time from inspect for the infra container and an application container after pod stop, then make sure the infra container's is later.
That would guarantee it was stopped last, and it should be easy to add to an existing pod stop test.
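A hedged sketch of that check, shelling out to podman container inspect. The container names mypod-infra and myapp are placeholders, and it assumes the {{json .State.FinishedAt}} template yields a quoted RFC 3339 timestamp:

package main

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// finishedAt returns a container's FinishedAt as parsed from
// `podman container inspect`.
func finishedAt(name string) (time.Time, error) {
	out, err := exec.Command("podman", "container", "inspect",
		"--format", "{{json .State.FinishedAt}}", name).Output()
	if err != nil {
		return time.Time{}, err
	}
	return time.Parse(time.RFC3339Nano,
		strings.Trim(strings.TrimSpace(string(out)), `"`))
}

func main() {
	infra, err := finishedAt("mypod-infra") // hypothetical names
	if err != nil {
		panic(err)
	}
	app, err := finishedAt("myapp")
	if err != nil {
		panic(err)
	}
	if !infra.After(app) {
		fmt.Println("FAIL: infra stopped before the app container")
		return
	}
	fmt.Println("OK: infra stopped last")
}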

Luap99 (Member) left a comment

will do a proper review tomorrow

Comment on lines 98 to 101
exists, err := c.runtime.state.HasContainer(c.ID())
if err != nil {
return err
}
if !exists {
return fmt.Errorf("container %s does not exist in database: %w", c.ID(), define.ErrNoSuchCtr)
}
Luap99 (Member) commented:

That adds extra DB query overhead; I'm not sure we need this at all.
Technically, if the pod doesn't exist you could just ignore the error, i.e. only take the locks in an `if err == nil {}` block.
The other code would then return its normal error anyway.
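A minimal sketch of that pattern, with hypothetical stand-ins for libpod's pod type and lookup (not the real API): the pod lock is taken only when the lookup succeeds, and a missing pod simply falls through to the normal error path.

package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNoSuchPod = errors.New("no such pod")

type pod struct{ lock sync.Mutex }

// lookupPod stands in for the runtime's pod lookup.
func lookupPod(id string) (*pod, error) {
	if id == "" {
		return nil, errNoSuchPod
	}
	return &pod{}, nil
}

func startContainer(podID string) error {
	// Only take the pod lock when the lookup succeeded; pod locks
	// come before container locks.
	if p, err := lookupPod(podID); err == nil {
		p.lock.Lock()
		defer p.lock.Unlock()
	}
	// If the pod lookup failed, fall through: the subsequent
	// container-state checks report the usual error anyway.
	fmt.Println("starting container")
	return nil
}

func main() {
	if err := startContainer("mypod"); err != nil {
		fmt.Println(err)
	}
}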

mheon (Member, Author) replied:

Oh, I like that. Will fix.

Comment on lines 89 to 93
// Have to lock the pod the container is a part of.
// This prevents running `podman start` at the same time a
// `podman pod stop` is running, which could lead to weird races.
// Pod locks come before container locks, so do this first.
if c.config.Pod != "" {
Luap99 (Member) commented:
We should do the same thing around stop for consistency

The intention behind this is to stop races between
`pod stop|start` and `container stop|start` being run at the same
time. This could result in containers with no working network
(they join the still-running infra container's netns, which is
then torn down as the infra container is stopped, leaving the
container in an otherwise unused, nonfunctional, orphan netns).

Locking the pod (if present) in the public container start and
stop APIs should be sufficient to stop this.

Signed-off-by: Matt Heon <[email protected]>
@mheon mheon force-pushed the graph_stop branch 2 times, most recently from 9c9d10f to bee225e on January 31, 2025 13:24
mheon (Member, Author) commented Jan 31, 2025

Oops. Forgot to wait for the parallel executors to finish, creating horrible races. Fixed now.

@mheon mheon force-pushed the graph_stop branch 11 times, most recently from ed5adc4 to 3aa18df on January 31, 2025 21:27
Ephemeral COPR build failed. @containers/packit-build please check.

Cockpit tests failed for commit 3aa18df. @martinpitt, @jelly, @mvollmer please check.

@mheon mheon force-pushed the graph_stop branch 3 times, most recently from 2fb56ab to 8f21503 on January 31, 2025 23:32
First, refactor our existing graph traversal code to improve code
sharing. There still isn't much sharing between inward traversal
(stop, remove) and outward traversal (start), but stop and remove
now share most of their code, which seems like a positive.

Second, add a new graph-traversal function to stop containers.
We already had start and remove; stop uses the newly-refactored
inward-traversal code which it shares with removal.

Third, rework the shared stop/removal inward-traversal code to
add locking. This allows parallel execution of stop and removal,
which should improve the performance of `podman pod rm` and
retain the performance of `podman pod stop` at about what it is
right now.

Fourth and finally, use the new graph-based stop when possible
to solve unordered stop problems with pods - specifically, the
infra container stopping before application containers, leaving
those containers without a working network.

Fixes https://issues.redhat.com/browse/RHEL-76827

Signed-off-by: Matt Heon <[email protected]>
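As an illustration of the third step, here is a parallel variant of the earlier sequential sketch (hypothetical types, not libpod's code): independent siblings stop concurrently, a dependency is stopped only once its last dependent finishes, and the final wg.Wait() corresponds to the "wait for the parallel executors" fix mentioned above.

package main

import (
	"fmt"
	"sync"
)

type ctr struct {
	id         string
	dependsOn  []*ctr
	mu         sync.Mutex
	dependents int
}

func stopPodParallel(ctrs []*ctr) {
	for _, c := range ctrs {
		for _, dep := range c.dependsOn {
			dep.dependents++
		}
	}
	var wg sync.WaitGroup
	var stop func(c *ctr)
	stop = func(c *ctr) {
		defer wg.Done()
		fmt.Println("stopping", c.id) // real code locks and stops the ctr
		for _, dep := range c.dependsOn {
			dep.mu.Lock()
			dep.dependents--
			last := dep.dependents == 0
			dep.mu.Unlock()
			if last { // all dependents done: safe to stop dep
				wg.Add(1)
				go stop(dep)
			}
		}
	}
	// Launch from the leaves: containers nothing depends on.
	for _, c := range ctrs {
		if c.dependents == 0 {
			wg.Add(1)
			go stop(c)
		}
	}
	wg.Wait() // wait for every executor before returning
}

func main() {
	infra := &ctr{id: "infra"}
	a := &ctr{id: "app1", dependsOn: []*ctr{infra}}
	b := &ctr{id: "app2", dependsOn: []*ctr{infra}}
	stopPodParallel([]*ctr{infra, a, b})
}

app1 and app2 stop in parallel; infra stops only after both finish.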
martinpitt (Contributor) commented:
The cockpit test failure from above is essentially this:

# podman pod rm --force --time 0 --all

Error: not all containers could be removed from pod f99a8d18b9b1cf2c8a4951fcce467057f5477ab385b9eb23d38b912ad93120eb: removing pod containers
Error: error removing container 52e324f5fe703a53d4ac0fdfa66fd4914ffb5b7dfdcbc2d3b6d88eccb12b946c from pod f99a8d18b9b1cf2c8a4951fcce467057f5477ab385b9eb23d38b912ad93120eb: container 52e324f5fe703a53d4ac0fdfa66fd4914ffb5b7dfdcbc2d3b6d88eccb12b946c has dependent containers which must be removed before it: df56d19ae6e61a5dc779e1e6e1994734e2e490ed0e8769efa2fb48a29e13ce6f: container already exists

This feels related to this change? The latest push passed, but commit 3aa18df also passed on F41 and only failed once on Rawhide -- so this feels like a race condition, or possibly file system ordering, i.e. something not reliably reproducible. Does that ring a bell?

Luap99 (Member) commented Feb 3, 2025

@martinpitt Most of our tests fail as well, so yes, this patch is broken and cannot be merged as-is.

Labels: approved · No New Tests · release-note