Contributor

@ricardobranco777 ricardobranco777 commented Aug 10, 2025

Fix network test helpers used by the pasta tests.

Right now the code doesn't work for us in openQA where we use TAP networking for IPv4, and IPv6 is fully functional only on the s390x worker. The current code incorrectly detects IPv6 as routable on hosts without IPv6 connectivity.

This PR:

  • Fixes ipv4_get_addr_global & ipv6_get_addr_global by skipping link-local IPv4/IPv6 addresses as they are not globally routable as defined by RFC 3927 (IPv4 169.254.0.0/16) and RFC 4291 (IPv6 fe80::/10).
  • Uses a trick from W. Richard Stevens's "Unix Network Programming, Vol. 1, 3rd Edition" to get the default global IPv4 & IPv6 address; the trick does not need the target IP host to be reachable or connected to the Internet at all. No more guessing this from the output of ip addr & ip route.
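
The PR's code is not reproduced in this thread. As a minimal sketch, assuming the trick meant here is the connected-UDP-socket technique (connect() a datagram socket, then read the local address back with getsockname()), it could look like the following. The target addresses and the function name are illustrative assumptions; no packet is sent to the targets.

```python
import ipaddress
import socket

# Illustrative targets taken from documentation ranges (RFC 5737, RFC 3849).
# They are assumptions for this sketch, not necessarily what the PR uses.
TARGETS = {
    socket.AF_INET: "192.0.2.1",
    socket.AF_INET6: "2001:db8::1",
}


def default_source_address(family):
    """Return the source address the kernel would pick, or None."""
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError:
        return None  # address family not supported on this host
    with sock:
        try:
            # connect() on a UDP socket only triggers a route lookup and
            # fixes the local address; nothing goes on the wire.
            sock.connect((TARGETS[family], 9))
        except OSError:
            return None  # no usable route for this family
        host = sock.getsockname()[0]
    # Drop a possible "%zone" suffix before parsing.
    addr = ipaddress.ip_address(host.split("%", 1)[0])
    # Link-local, loopback and unspecified results do not count.
    if addr.is_link_local or addr.is_loopback or addr.is_unspecified:
        return None
    return str(addr)
```

Calling default_source_address(socket.AF_INET6) on a host without an IPv6 route returns None instead of guessing from interface listings.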

Notes:

  • The above trick works on most POSIX systems: Linux, FreeBSD, NetBSD, OpenBSD, DragonFly BSD, MidnightBSD, macOS, Illumos & Solaris.
  • I chose Python for this so I can backport this fix to podman 5.4.2

This PR was tested on:

  • openSUSE Tumbleweed with podman 5.5.2 (both crun & runc on aarch64 & x86_64)
  • SLES 16.0 RC with podman 5.4.2 with runc on aarch64, ppc64le, s390x & x86_64

Verification runs available at os-autoinst/os-autoinst-distri-opensuse#22933

Does this PR introduce a user-facing change?

None

Fixes ipv4_get_addr_global & ipv6_get_addr_global by skipping
link-local IPv4/IPv6 addresses as they are not globally routable.

Uses a trick from W. Richard Stevens's "Unix Network Programming" book
to get the default global IPv4 & IPv6 address that doesn't need a target
IP host to be connected at all to the Internet.

Signed-off-by: Ricardo Branco <[email protected]>

Member

@Luap99 Luap99 left a comment

While that is an option it is not really correct. It doesn't matter to be the most precise or correct here. What matters is to match what pasta will do and AFAIK they are not doing this; pasta actually lists routes and addresses to get the info.

Also pasta has since learned a "local mode" when no suitable ip is found so I think most test assumptions/skips here are simply incorrect now and need to be reviewed again.

cc @sbrivio-rh @dgibson

@ricardobranco777
Contributor Author

> While that is an option it is not really correct. It doesn't matter to be the most precise or correct here. What matters is to match what pasta will do and AFAIK they are not doing this; pasta actually lists routes and addresses to get the info.
>
> Also pasta has since learned a "local mode" when no suitable ip is found so I think most test assumptions/skips here are simply incorrect now and need to be reviewed again.

It seems pasta has issues with TAP interfaces:
https://bugs.passt.top/show_bug.cgi?id=49

@sbrivio-rh
Collaborator

> > While that is an option it is not really correct. It doesn't matter to be the most precise or correct here. What matters is to match what pasta will do and AFAIK they are not doing this; pasta actually lists routes and addresses to get the info.
> > Also pasta has since learned a "local mode" when no suitable ip is found so I think most test assumptions/skips here are simply incorrect now and need to be reviewed again.
>
> It seems pasta has issues with TAP interfaces: https://bugs.passt.top/show_bug.cgi?id=49

"tap-style" (not tap, nor TAP) in that context means interfaces without an address, which is not your case. I still have to go through the changes and the rest of the comments.

@sbrivio-rh sbrivio-rh added the pasta pasta(1) bugs or features label Aug 11, 2025
@sbrivio-rh
Collaborator

@ricardobranco777 first off, thanks for your contribution! Having these tests running on openSUSE will certainly save us some headaches and tickets later on, and at the same time most of the contributions to this project come from Red Hat folks so I'm really glad to see other names, especially in the testing area.

I have a concern about the general approach, though, which is an aspect of what @Luap99 mentioned: pasta(1) enables IPv6 (by default) if there's a default IPv6 route (similar to those checks we currently have in tests), or if there's no route at all (for any protocol family) using a so-called "local mode".
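
A simplified reading of the rule in the paragraph above can be sketched as follows. This is not pasta's implementation: the function names are hypothetical, the inputs are assumed to be parsed `ip -j -4 route show` and `ip -j -6 route show` output, and "no route at all" is read literally as both routing tables being empty, which may be narrower than what pasta actually checks.

```python
def has_default_route(routes):
    """True if any parsed `ip -j route show` entry is a default route."""
    return any(route.get("dst") == "default" for route in routes)


def ipv6_expected(v4_routes, v6_routes):
    """Would IPv6 be enabled by default, per the description above?"""
    if has_default_route(v6_routes):
        return True
    # "Local mode" (literal reading): no route for any protocol family.
    return not v4_routes and not v6_routes
```

Under this reading, a host with only an IPv4 default route and a link-local IPv6 route yields False, while an isolated host with empty tables yields True.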

And this is something we should actually test with Podman, even in isolated environments. It doesn't (or shouldn't!) matter so much if we have real IPv6 connectivity or not.

But I guess you're coming here with tests that are otherwise failing. What is failing exactly without these changes? I'd rather fix those to work in isolated IPv6 environments. Otherwise we risk not testing IPv6 at all on openSUSE (except for s390x, but we would be doing tests on big-endian only).

@ricardobranco777
Contributor Author

> @ricardobranco777 first off, thanks for your contribution! Having these tests running on openSUSE will certainly save us some headaches and tickets later on, and at the same time most of the contributions to this project come from Red Hat folks so I'm really glad to see other names, especially in the testing area.

You're welcome. For a general overview of the tests we're running:

https://github.com/os-autoinst/os-autoinst-distri-opensuse/tree/master/tests/containers/bats#openqa-jobs

> I have a concern about the general approach, though, which is an aspect of what @Luap99 mentioned: pasta(1) enables IPv6 (by default) if there's a default IPv6 route (similar to those checks we currently have in tests), or if there's no route at all (for any protocol family) using a so-called "local mode".
>
> And this is something we should actually test with Podman, even in isolated environments. It doesn't (or shouldn't!) matter so much if we have real IPv6 connectivity or not.
>
> But I guess you're coming here with tests that are otherwise failing. What is failing exactly without these changes?

Until yesterday we were ignoring the pasta tests because of the failures, which can be seen in the openQA jobs for openSUSE Tumbleweed Build20250808 and earlier dates like: https://openqa.opensuse.org/tests/5230632/file/podman-bats-user-local.tap

The output of ip -4 -j addr, ip -6 -j route & others can be seen in this job from today:
https://openqa.opensuse.org/tests/5234566#downloads

The most recent will always be:

> I'd rather fix those to work in isolated IPv6 environments. Otherwise we risk not testing IPv6 at all on openSUSE (except for s390x, but we would be doing tests on big-endian only).

I raised the topic about IPv6 with our QA team. Feel free to ping me if you want to try another approach, as we can easily try PRs with openQA.

@ricardobranco777
Contributor Author

The issue wasn't only IPv6, as we can see here:

#not ok 569 [505] No IPv4 in 695ms
# tags: ci:parallel
# (from function `bail-now' in file test/system/helpers.bash, line 187,
#  from function `assert' in file test/system/helpers.bash, line 1062,
#  in test file test/system/505-networking-pasta.bats, line 300)
#   `assert "${container_address}" = "null" \' failed
#
# [17:42:52.708387896] $ /usr/bin/podman run --rm --net=pasta:-6 quay.io/libpod/testimage:20241011 ip -j -4 address show
# [17:42:53.045748620] [{"ifindex":1,"ifname":"lo","flags":["LOOPBACK","UP","LOWER_UP"],"mtu":65536,"qdisc":"noqueue","operstate":"UNKNOWN","group":"default","txqlen":1000,"addr_info":[{"family":"inet","local":"127.0.0.1","prefixlen":8,"scope":"host","label":"lo","valid_life_time":4294967295,"preferred_life_time":4294967295}]},{"ifindex":2,"ifname":"tap0","flags":["BROADCAST","MULTICAST","UP","LOWER_UP"],"mtu":65520,"qdisc":"pfifo_fast","operstate":"UNKNOWN","group":"default","txqlen":1000,"addr_info":[{"family":"inet","local":"169.254.2.1","prefixlen":16,"scope":"global","label":"tap0","valid_life_time":4294967295,"preferred_life_time":4294967295}]}]
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #|     FAIL: Container has IPv4 global address with IPv4 disabled
# #| expected: = null
# #|   actual:   169.254.2.1
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # [teardown]
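
In the log above, tap0 carries 169.254.2.1 and the kernel reports it with scope "global", so filtering on the scope field alone would not exclude it. A minimal sketch of classifying by address range instead (RFC 3927 and RFC 4291 link-local ranges, via Python's ipaddress module) follows; the function name is hypothetical and the sample is trimmed from the log.

```python
import ipaddress
import json

# Trimmed from the `ip -j -4 address show` output in the log above.
SAMPLE = """[
 {"ifname": "lo", "addr_info": [
   {"family": "inet", "local": "127.0.0.1", "scope": "host"}]},
 {"ifname": "tap0", "addr_info": [
   {"family": "inet", "local": "169.254.2.1", "scope": "global"}]}
]"""


def routable_addresses(ip_json):
    """Return addresses that are neither loopback nor link-local."""
    result = []
    for iface in json.loads(ip_json):
        for info in iface.get("addr_info", []):
            addr = ipaddress.ip_address(info["local"])
            # Classify by range, not by the kernel-reported scope.
            if addr.is_loopback or addr.is_link_local:
                continue
            result.append(str(addr))
    return result


print(routable_addresses(SAMPLE))  # prints []
```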

@sbrivio-rh
Collaborator

Thanks for sharing the old results. From a quick look, I'd say most of these failures are due to the fact that we failed to update the tests when we introduced local mode; things just happened to work in the CIs of Podman and passt itself, so we didn't touch them. I need a bit more time to go through the list of previous failures and comment on them.

@Luap99
Member

Luap99 commented Aug 11, 2025

I think trying to reimplement how pasta decided what address to pick seems pointless? I understand why we did it originally when the logic was nice and simple but now I am not sure if they add any value.

I don't think podman must guess and verify what address pasta picked. That seems something you can verify in the pasta tests if you really want, for podman we care that options work like -4/-6 and that connection/forwarding works and that our default options we depend on such as --dns-forward/--map-guest-addr work and are correctly used by podman to populate resolv.conf/hosts file.

So if we do --net=pasta:-6 then all we need to check is that there is no v4 address on the interface, same way the other way around.
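
The check described above can be sketched in Python under a few assumptions: the input is the JSON printed by `ip -j address show` inside the container, loopback interfaces are ignored, and the function name is hypothetical. The actual tests are written in bats, so this only illustrates the logic.

```python
import json


def addresses_of_family(ip_json, family):
    """List (ifname, address) pairs of `family` ("inet" or "inet6"),
    skipping loopback interfaces."""
    found = []
    for iface in json.loads(ip_json):
        if "LOOPBACK" in iface.get("flags", []):
            continue
        for info in iface.get("addr_info", []):
            if info.get("family") == family:
                found.append((iface["ifname"], info["local"]))
    return found
```

With --net=pasta:-6 the expectation would be that addresses_of_family(output, "inet") is an empty list, and with --net=pasta:-4 the same for "inet6".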

@ricardobranco777
Contributor Author

ricardobranco777 commented Aug 11, 2025

ip_get_addr_global can be simplified on Linux like this:

ip_get_addr_global() {
    # Google DNS servers
    if [ "$1" = "-6" ]; then
        address="2001:4860:4860::8888"
    else
        address="8.8.8.8"
    fi
    # "// empty" makes jq print nothing (instead of "null") if prefsrc is missing
    result=$(ip -j route get "$address" 2>/dev/null | jq -Mr '.[0].prefsrc // empty')
    if [ -z "$result" ]; then
        return 1
    fi
    echo "$result"
}

@sbrivio-rh
Collaborator

sbrivio-rh commented Aug 12, 2025

> I think trying to reimplement how pasta decided what address to pick seems pointless? I understand why we did it originally when the logic was nice and simple but now I am not sure if they add any value.

True, we should probably drop some checks. When I added them my urge was to check that nothing unexpected would happen because of a flaw in the Podman integration, but that came at a lower cost.

Note, though, that the logic is more complicated mostly because of specific requests from Podman users, which might indicate that we care about not breaking some of those subtleties particularly when started by Podman and in typical (test) environments used by Podman tests.

> I don't think podman must guess and verify what address pasta picked.

That is, I agree that there's no need, but:

> That seems something you can verify in the pasta tests if you really want

...it's quite convenient (for the moment, as long as we're not done with a test framework that can be automated more easily) to have those running as part of Podman's CI as it runs more frequently on more diverse environments.

On top of that, we're not investing effort in passt's current test suite right now as we plan to replace the framework eventually, so those checks would logically belong there, they're not Podman's CI job, but I'd keep what can be reasonably kept if it's not too much effort to maintain.

> for podman we care that options work like -4/-6 and that connection/forwarding works and that our default options we depend on such as --dns-forward/--map-guest-addr work and are correctly used by podman to populate resolv.conf/hosts file.
>
> So if we do --net=pasta:-6 then all we need to check is that there is no v4 address on the interface, same way the other way around.

Right, definitely (I broke this by the way, and this only came up in the openQA environment, so that's the kind of test I'd keep).

> ip_get_addr_global can be simplified on Linux like this:
>
> ip_get_addr_global() {
>     # Google DNS servers
>     if [ "$1" = "-6" ]; then
>         address="2001:4860:4860::8888"
>     else
>         address="8.8.8.8"
>     fi
> [...]

@ricardobranco777 I'm generally trying to avoid hardcoding references to whatever assigned address or name (see also #23336 (comment), even though the specific problem I mentioned there doesn't apply here as you wouldn't actually use the resolvers). But my main question at the moment is whether we need to revisit this part at all. List of failing tests with comments (from the results you shared):

#not ok 569 [505] No IPv4 in 695ms
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #|     FAIL: Container has IPv4 global address with IPv4 disabled
# #| expected: = null
# #|   actual:   169.254.2.1
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Ouch, I guess I broke this in https://passt.top/passt/commit/?id=14b84a7f077ecb734bb0e724f70bafeaa6d35a61, added to my to-do list. It's good that this broke, no changes needed for this one.

#not ok 570 [505] IPv6 default address assignment in 709ms
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #|     FAIL: Container address not matching host
# #| expected: = fec0::ad2c:7328:7b81:2af2
# #|   actual:   null
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I would simply drop this test going in the direction @Luap99 suggested. There can be way too many combinations leading to many different outcomes in pasta for this one. We'll eventually have to write more detailed tests in pasta's own suite.

#not ok 576 [505] IPv6 default route in 693ms
# [17:42:58.386507030] $ /usr/bin/podman run --rm --net=pasta quay.io/libpod/testimage:20241011 ip -j -6 route show
# [17:42:58.725007193] [{"dst":"fe80::/64","dev":"ens4","protocol":"kernel","metric":256,"flags":[],"pref":"medium"}]
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #|     FAIL: Container route not matching host
# #| expected: = fe80::2
# #|   actual:   null
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

But wait, there's a default route on the host according to https://openqa.opensuse.org/tests/5234566/logfile?filename=podman-ip6-route.txt (am I reading it wrong?). Even if IPv6 is not globally routable on the host, it should work at least between container and host (see #22959 (comment), the ticket is still open but at least that part should be fixed). If we fail to copy the default route, that will break things.

And we do copy one address in fe80::/64, so copying the route shouldn't fail because of this. This is something else to investigate (please feel free, or it's going to my to-do list as well), but not a good reason to consider IPv6 disabled for the purposes of this test.

If it's due to timing, though (IPv6 not configured yet when tests are started), that's not fixed yet, and we need to find a workaround for the time being, if that's the case.

#not ok 597 [505] Single TCP port forwarding, IPv6, tap in 567ms
# [17:43:20.611812191] $ /usr/bin/podman run -d --name=c-socat-t597-2wi8alqf --net=pasta -p [fec0::ad2c:7328:7b81:2af2]:5651:5651/tcp quay.io/libpod/testimage:20241011 sh -c for port in $(seq 5651 5651); do     
                         socat -u TCP6-LISTEN:${port} STDOUT &                          done; wait
# [17:43:20.699303259] Error: pasta failed with exit code 1:
# Failed to bind port 5651 ((null)) for option '-t fec0::ad2c:7328:7b81:2af2/5651-5651:5651-5651'
# [17:43:20.703217342] [ rc=126 (** EXPECTED 0 **) ]
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #| FAIL: exit code is 126; expected 0
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# # [teardown]

...and all the forwarding / transfer tests: oops, why are we failing to bind to that? We don't need to add the zone / interface specifier (say, %ens4), at least not for pasta (maybe for socat), this should work as it is. Unless that address is not available anymore for whatever timing reason. This is another failure to investigate in my opinion. I'll look into it unless I'm missing an obvious way this test is broken.

@ricardobranco777 some other changes from your commit, such as preferring permanent addresses (thanks, I had no idea how to do that with jq) make sense to me in any case. But in general I would say you have enough IPv6 in your environment for these tests to run.
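
The jq expression from the commit is not shown in this thread. As a sketch of what "preferring permanent addresses" could mean, assuming a permanent address is one whose valid_life_time is 4294967295 (the value seen in the `ip -j` output earlier in the thread), a stable sort that puts such entries first looks like this. The function name is hypothetical.

```python
FOREVER = 0xFFFFFFFF  # 4294967295, an infinite lifetime in `ip -j` output


def prefer_permanent(addr_info):
    """Stable sort: entries with an infinite valid lifetime come first."""
    # The key is False (sorts first) for permanent entries, True otherwise.
    return sorted(
        addr_info,
        key=lambda info: info.get("valid_life_time") != FOREVER,
    )
```

Taking the first element of the result then yields a permanent address whenever one exists, and otherwise falls back to the first dynamic one.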

@ricardobranco777
Contributor Author

> @ricardobranco777 I'm generally trying to avoid hardcoding references to whatever assigned address or name (see also #23336 (comment), even though the specific problem I mentioned there doesn't apply here as you wouldn't actually use the resolvers). But my main question at the moment is whether we need to revisit this part at all.

We should at least revisit the comments and function names to see if the code matches the intended behaviour. It seems to me that we're trying to guess what the kernel routing code would do instead of using available information. We can use ip route get "$address". If we can't use hardcoded references, what do you suggest to use in this case?

I'm not yet familiar with pasta local mode so I don't know how much more I can help here other than helping to run tests and making suggestions.

> But wait, there's a default route on the host according to https://openqa.opensuse.org/tests/5234566/logfile?filename=podman-ip6-route.txt (am I reading it wrong?). Even if IPv6 is not globally routable on the host, it should work at least between container and host (see #22959 (comment), the ticket is still open but at least that part should be fixed). If we fail to copy the default route, that will break things.

Maybe it's fixed in latest code? We're only testing released versions:

  • passt-20250611.0293c6f-3.2.x86_64
  • podman-5.5.2-1.2.x86_64

The full package list: https://openqa.opensuse.org/tests/5234566/logfile?filename=podman-rpm-qa.txt

> @ricardobranco777 some other changes from your commit, such as preferring permanent addresses (thanks, I had no idea how to do that with jq) make sense to me in any case. But in general I would say you have enough IPv6 in your environment for these tests to run.

I'll try in another branch. I want to test with the current setup we have and then with the new setup when IPv6 is fixed on all openQA workers.

@sbrivio-rh
Collaborator

> > @ricardobranco777 I'm generally trying to avoid hardcoding references to whatever assigned address or name (see also #23336 (comment), even though the specific problem I mentioned there doesn't apply here as you wouldn't actually use the resolvers). But my main question at the moment is whether we need to revisit this part at all.
>
> We should at least revisit the comments and function names to see if the code matches the intended behaviour. It seems to me that we're trying to guess what the kernel routing code would do instead of using available information.

Well, that wasn't really the purpose. The idea behind those implementations of ipv4_get_addr_global() and ipv6_get_addr_global() is to match how pasta used to source addresses.

Now pasta does something more complicated than that, in that it actually copies multiple addresses and routes (if present) to the container, by default. I think it's the function names that are misleading, yes.

> We can use ip route get "$address". If we can't use hardcoded references, what do you suggest to use in this case?

That doesn't match pasta's logic, though. In any case, should we need something like that, I would consider an address in TEST-NET-1 (say, 192.0.2.1), which should be something nobody will ever want to add specific routes to.
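
Should a route lookup like that be needed, a sketch with the TEST-NET-1 address suggested above could look as follows. It wraps `ip -j route get` and reads the prefsrc field, as in the shell snippet earlier in the thread. The function names are hypothetical, and, as noted above, this does not reproduce pasta's address selection.

```python
import json
import subprocess

TEST_NET_1_ADDR = "192.0.2.1"  # documentation range, RFC 5737


def parse_prefsrc(route_json):
    """Extract prefsrc from `ip -j route get` output; None if absent."""
    try:
        routes = json.loads(route_json)
    except json.JSONDecodeError:
        return None  # empty output, e.g. when the lookup failed
    if not routes:
        return None
    return routes[0].get("prefsrc")


def source_for(address=TEST_NET_1_ADDR):
    """Run `ip -j route get` and return the preferred source, or None."""
    try:
        proc = subprocess.run(
            ["ip", "-j", "route", "get", address],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None  # no `ip` binary on this system
    return parse_prefsrc(proc.stdout)
```

Keeping the parser separate from the subprocess call makes the JSON handling testable on hosts without iproute2.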

> I'm not yet familiar with pasta local mode so I don't know how much more I can help here other than helping to run tests and making suggestions.
>
> > But wait, there's a default route on the host according to https://openqa.opensuse.org/tests/5234566/logfile?filename=podman-ip6-route.txt (am I reading it wrong?). Even if IPv6 is not globally routable on the host, it should work at least between container and host (see #22959 (comment), the ticket is still open but at least that part should be fixed). If we fail to copy the default route, that will break things.
>
> Maybe it's fixed in latest code? We're only testing released versions:
>
>   • passt-20250611.0293c6f-3.2.x86_64
>   • podman-5.5.2-1.2.x86_64

No, no relevant fixes since then. :(
