freeze_processes: implement kludges for cgroup v1
The cgroup v1 freezer has always been unreliable: it sometimes fails
to freeze a cgroup.

In runc, we have implemented a few kludges to increase the chance of
succeeding, but those are used when runc freezes a cgroup for its own
purposes (for "runc pause" and to modify device properties for cgroup
v1).

When criu is used, it fails to freeze a cgroup from time to time
(see [1], [2]). Let's try adding kludges similar to the ones in runc.

Alas, I have absolutely no way to test this, so please review carefully.

[1]: opencontainers/runc#4273
[2]: opencontainers/runc#4457

Signed-off-by: Kir Kolyshkin <[email protected]>
kolyshkin committed Dec 16, 2024
1 parent a678a3b commit 2e5b4b5
Showing 1 changed file with 30 additions and 0 deletions.
30 changes: 30 additions & 0 deletions criu/seize.c
@@ -542,6 +542,7 @@ static int freeze_processes(void)
	enum freezer_state state = THAWED;

	static const unsigned long step_ms = 100;
	/* Since opts.timeout is in seconds, multiply it by 1000 to convert to milliseconds. */
	unsigned long nr_attempts = (opts.timeout * 1000) / step_ms;
	unsigned long i = 0;

@@ -599,6 +600,35 @@ static int freeze_processes(void)
			goto err;
		}
		nanosleep(&req, NULL);

		if (cgroup_v2)
			continue;

		/*
		 * As per older kernel docs (freezer-subsystem.txt before
		 * the kernel commit ef9fe980c6fcc1821), if FREEZING is seen,
		 * userspace should either retry or thaw. While current
		 * kernel cgroup v1 docs no longer mention a need to retry,
		 * even recent kernels can't reliably freeze a cgroup v1.
		 *
		 * Let's keep asking the kernel to freeze from time to time.
		 * In addition, do occasional thaw/sleep/freeze.
		 *
		 * This is still a game of chances (the real fix belongs to the kernel)
		 * but these kludges might improve the probability of success.
		 *
		 * Cgroup v2 does not have this problem.
		 */
		switch (i % 32) {
		case 9:
		case 20:
			freezer_write_state(fd, FROZEN);
			break;
		case 31:
			freezer_write_state(fd, THAWED);
			nanosleep(&req, NULL);
			freezer_write_state(fd, FROZEN);
			break;
		}
	}

	if (i > nr_attempts) {
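
Reviewer's note: to get a feel for how often the cgroup v1 kludges fire, the schedule can be modelled on its own. The sketch below is not part of the commit. It copies only the `(timeout * 1000) / step_ms` arithmetic and the `i % 32` switch from the diff. The names `attempts_for_timeout`, `action_for_iteration`, `count_actions`, and the `retry_action` enum are hypothetical, and the loop bounds (`i` running from 0 through `nr_attempts`, with the cgroup never reaching FROZEN) are assumptions inferred from the `i > nr_attempts` check.

```c
#include <assert.h>

/* Hypothetical action labels; the commit itself calls freezer_write_state(). */
enum retry_action {
	ACTION_NONE,          /* just keep polling */
	ACTION_REFREEZE,      /* write FROZEN again */
	ACTION_THAW_REFREEZE, /* write THAWED, sleep one step, write FROZEN */
};

static const unsigned long step_ms = 100;

/* Same arithmetic as the diff: the timeout is in seconds, the step in ms. */
static unsigned long attempts_for_timeout(unsigned long timeout_sec)
{
	return (timeout_sec * 1000) / step_ms;
}

/* Mirrors the switch (i % 32) added for cgroup v1. */
static enum retry_action action_for_iteration(unsigned long i)
{
	switch (i % 32) {
	case 9:
	case 20:
		return ACTION_REFREEZE;
	case 31:
		return ACTION_THAW_REFREEZE;
	default:
		return ACTION_NONE;
	}
}

/*
 * Count how many times each kludge would run in the worst case, assuming
 * the loop visits every i in [0, nr_attempts] without ever seeing FROZEN.
 */
static void count_actions(unsigned long nr_attempts,
			  unsigned long *refreezes,
			  unsigned long *thaw_cycles)
{
	unsigned long i;

	assert(refreezes && thaw_cycles);
	*refreezes = 0;
	*thaw_cycles = 0;

	for (i = 0; i <= nr_attempts; i++) {
		switch (action_for_iteration(i)) {
		case ACTION_REFREEZE:
			(*refreezes)++;
			break;
		case ACTION_THAW_REFREEZE:
			(*thaw_cycles)++;
			break;
		default:
			break;
		}
	}
}
```

With an example timeout of 10 seconds this gives 100 attempts. Under the assumptions above, FROZEN is rewritten at iterations 9, 20, 41, 52, 73 and 84, and a thaw/sleep/freeze cycle runs at iterations 31, 63 and 95. Each thaw cycle adds one extra sleep, so total wall-clock time can slightly exceed `nr_attempts` steps unless the alarm cuts the loop short first.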
