Networking issues #296

cfergeau · 2023-11-13T15:39:24Z

Sometimes, after running for a while, podman machine networking or crc networking stops working.
There is no clear reproducer, but this was hit by people working on podman-desktop, by some crc users, ...
Latest such issue is:
containers/podman#20639
The common symptom is that ssh access to the VM does not work.
In #20639, running modprobe -r virtio-net && modprobe virtio-net gets the network back up.
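As an illustration only, the workaround could be wrapped in a small check run as root inside the VM. This is a sketch, not something used in #20639: the function names are hypothetical, the gateway address has to be passed in by the caller, and the ping flags assume the Linux iputils ping. Reloading the driver only hides the symptom and will not help find the root cause.

```shell
#!/bin/sh
# Hypothetical helpers, meant to be run as root inside the VM.

# Return 0 if the given address answers a single ping within 2 seconds.
gateway_reachable() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Reload the virtio-net driver (the workaround reported in #20639).
reload_virtio_net() {
    modprobe -r virtio-net && modprobe virtio-net
}

# Check once; reload the driver only if the gateway is unreachable.
# $1 is the gateway address, which the caller must supply (assumption).
check_and_recover() {
    gateway=$1
    if gateway_reachable "$gateway"; then
        echo "network ok"
    else
        echo "gateway $gateway unreachable, reloading virtio-net"
        reload_virtio_net
    fi
}
```

Calling check_and_recover with the VM's gateway address from a timer would apply the workaround automatically, at the cost of making the bug harder to observe.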

I'm currently working with Florent, who filed #20639 and can reproduce it several times per week, to get some traces through dlv and see whether they give a hint as to what's going on. This could be a gvproxy bug as much as a kernel or qemu bug.

Regarding the other similar bugs which have been filed or mentioned in the past, they may or may not have the same root cause.
They happened on Windows + Hyper-V, on macOS + vfkit, and I think even on Linux + libvirt/qemu.
#20639 was macOS + qemu. This means it happens both with gvproxy and with crc daemon + vm process running in the VM.

There were hints of a crc daemon crash/restart in the Linux + qemu case, but not in #20639, which is why I think there could be several different issues.


cfergeau · 2023-11-13T15:43:53Z

Regarding #20639, I asked Florent to:

- upgrade gvproxy to the latest released version, as the one shipped by podman 4.7.2 is old (0.5.0 vs 0.7.1)
- extract the binary for his platform, as delve does not support universal macOS binaries: lipo -extract arm64 -output gvproxy-darwin-arm64 ./gvproxy-darwin
- replace the gvproxy binary used by podman with this gvproxy-darwin-arm64 binary
- install delve: brew install delve
- get some traces when the issue occurs:

$ dlv attach $(pgrep gvproxy)

(dlv) trace /github.com\/containers\/gvisor-tap-vsock\/*/

Once the tracing is done, detach dlv from the process by pressing Ctrl+C and answering 'no' when delve asks whether the process should be killed.
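One practical detail with the attach command above: dlv attach takes a single PID, while pgrep prints every matching process. A minimal sketch of a guard for that, assuming a POSIX shell with pgrep available (the gvproxy_pid name is hypothetical and not part of podman, gvproxy or delve):

```shell
# Print the PID of the single running gvproxy process.
# Fails if none or more than one is found, since dlv attach expects one PID.
gvproxy_pid() {
    pids=$(pgrep gvproxy)
    # Count the non-empty lines returned by pgrep.
    count=$(printf '%s\n' "$pids" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one gvproxy process, found $count" >&2
        return 1
    fi
    printf '%s\n' "$pids"
}
```

It can then be used as dlv attach "$(gvproxy_pid)", which makes the attach step fail early with a readable message when several machines are running or when gvproxy has already exited.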

cfergeau · 2023-11-13T15:45:23Z

Regarding containers/podman#20639, one suggestion from @n1hility was to try using vm/gvforwarder in the VM, sending the network traffic over vsock rather than directly over virtio-net, to see if the bug can still be reproduced.
