criu: Initialize util before service worker starts
When restoring dumps in new mount and PID namespaces where multiple dumps
share the same network namespace, CRIU may fail due to conflicting
Unix socket names. This happens because the service worker creates
sockets using a pattern that includes criu_run_id, but util_init(),
which initializes criu_run_id, is only called after cr_service_work()
has started.

The socket naming pattern "crtools-fd-%d-%d" uses the restore PID
and criu_run_id. However, criu_run_id is always 0 until it is
initialized, which leads to conflicts when multiple restores run
simultaneously, either in the same CRIU process or in multiple CRIU
processes doing the same operation in different PID namespaces.
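
The collision described above can be illustrated with a minimal sketch. Only the pattern string comes from this commit message; the helpers format_sock_name() and names_collide() are hypothetical names introduced for illustration and are not CRIU functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of CRIU): format a socket name with
 * the "crtools-fd-%d-%d" pattern quoted in the commit message.
 */
static int format_sock_name(char *buf, size_t len, int pid, int run_id)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "crtools-fd-%d-%d", pid, run_id);
}

/* Return 1 when two (pid, run_id) pairs produce the same socket name. */
static int names_collide(int pid_a, int run_a, int pid_b, int run_b)
{
	char a[64], b[64];

	format_sock_name(a, sizeof(a), pid_a, run_a);
	format_sock_name(b, sizeof(b), pid_b, run_b);
	return strcmp(a, b) == 0;
}
```

Under these assumptions, two restores that both see PID 1 in their own PID namespace and both still have a run ID of 0 get the same name, so names_collide(1, 0, 1, 0) returns 1. As soon as the run IDs differ, the names differ as well.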

Fix this by:

- Moving util_init() before cr_service_work() starts
- Adding a second util_init() call in the service worker fork
to ensure unique IDs across multiple worker runs
- Making sure that dump and restore operations have util_init() called
early to generate unique socket names

With this fix, socket names always include the namespace ID, preventing
conflicts when multiple processes with the same PID share a network
namespace.
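
One way an ID can stay unique when PIDs repeat across PID namespaces is to combine the PID with an identifier of the namespace itself. The sketch below is an assumption for illustration only, not code from this commit: make_run_id() and current_run_id() are hypothetical names, and using the inode of /proc/self/ns/pid as the namespace identifier is a choice made for this example.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: pack a PID namespace inode into the high 32 bits
 * and a PID into the low 32 bits of a single 64-bit ID.
 */
static uint64_t make_run_id(uint32_t pidns_ino, uint32_t pid)
{
	return ((uint64_t)pidns_ino << 32) | pid;
}

/*
 * Hypothetical helper: build an ID for the calling process. If
 * /proc/self/ns/pid cannot be stat'ed, the namespace part stays 0
 * and the ID degrades to the plain PID.
 */
static uint64_t current_run_id(void)
{
	struct stat st;
	uint32_t ino = 0;

	if (stat("/proc/self/ns/pid", &st) == 0)
		ino = (uint32_t)st.st_ino;
	return make_run_id(ino, (uint32_t)getpid());
}
```

With this layout, two processes that both have PID 1 but live in different PID namespaces get different IDs as long as the namespace inodes differ. An ID like this only helps if it is computed before the first socket name is formatted, which is the ordering this commit enforces for util_init().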

Fixes checkpoint-restore#2500

Signed-off-by: Lorenzo Fontana <[email protected]>
fntlnz committed Oct 22, 2024
1 parent dfb56ee commit 3c4e1cb
Showing 2 changed files with 18 additions and 3 deletions.
9 changes: 9 additions & 0 deletions criu/cr-service.c
@@ -1310,6 +1310,8 @@ int cr_service_work(int sk)
 	int ret = -1;
 	CriuReq *msg = 0;

+	util_init();
+
 more:
 	opts.mode = CR_SWRK;

@@ -1528,6 +1530,13 @@ int cr_service(bool daemon_mode)

 			close(server_fd);
 			init_opts();
+			/*
+			 * We want to have an unique criu_run_id
+			 * here so that each service worker fork here
+			 * can create its own sockets file descriptors
+			 * despite being in the same network namespace.
+			 */
+			util_init();
 			ret = cr_service_work(sk);
 			close(sk);
 			exit(ret != 0);
12 changes: 9 additions & 3 deletions criu/crtools.c
@@ -169,7 +169,15 @@ int main(int argc, char *argv[], char *envp[])
 		pr_err("unknown command: %s\n", argv[optind]);
 		goto usage;
 	}
-
+	/*
+	 * During dump, restore and parasite it's important for us
+	 * to initialize criu_run_id and compel_run_id so that
+	 * sockets and file descriptors are generated with an unique
+	 * name identifying the specific process even in cases
+	 * where multiple processes with the same pid in different
+	 * pid namespaces are sharing the same network namespace.
+	 */
+	util_init();
 	if (opts.mode == CR_SWRK) {
 		if (argc != optind + 2) {
 			fprintf(stderr, "Usage: criu swrk <fd>\n");
@@ -254,8 +262,6 @@ int main(int argc, char *argv[], char *envp[])
 		return 1;
 	}

-	util_init();
-
 	if (log_init(opts.output))
 		return 1;
