identify/: Don't fail on unknown multiaddr protocol #3244

mxinden · 2022-12-14T11:51:47Z

Summary

libp2p-identify currently discards the entire identify payload of a remote peer if that payload contains a multiaddr it cannot parse.

Expected behaviour

When parsing an identify payload of a remote peer that contains an unknown multiaddr protocol (e.g. quic-v1/ or webrtc/), log the parsing error, but don't discard the entire payload.

Actual behaviour

libp2p-identify's parsing logic returns an error for the entire payload if it cannot parse even a single multiaddr.

rust-libp2p/protocols/identify/src/protocol.rs

Lines 220 to 226 in be3ec6c

    
    let listen_addrs = {
        let mut addrs = Vec::new();
        for addr in msg.listen_addrs.into_iter() {
            addrs.push(parse_multiaddr(addr)?);
        }
        addrs
    };

This is especially relevant when rolling out a new protocol to a live network. Say that most nodes of a network run implementation version v1, whose multiaddr implementation is not aware of the webrtc/ protocol. Say that a new version (v2) is rolled out to the network with support for the webrtc/ protocol, listening via webrtc/ by default. In that case all v1 nodes would discard all identify payloads of v2 nodes, given that the v2 identify payloads would contain the webrtc/ protocol in their listen_addrs addresses.

//CC @dignifiedquire for the rollout of quic-v1/ and webtransport/ on IPFS
//CC @melekes for the rollout of webrtc/ and quic-v1/ on Polkadot
//CC @divagant-martian and @AgeManning in case you experiment with quic-v1/ on lighthouse

Possible Solution

Release patch versions of libp2p-identify which log the parsing failure, but don't discard the entire identify payload.
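
The skip-and-log behaviour proposed here can be sketched as follows. This is a minimal illustration, not the code from the eventual patch: `parse_multiaddr` below is a hypothetical string-based stand-in for the real parser (it assumes every known protocol takes exactly one argument), and `parse_listen_addrs` is a made-up helper name. The point is only that a failing address is reported and dropped instead of aborting the whole loop through `?`.

```rust
/// Hypothetical stand-in for the real multiaddr parser. It knows four
/// protocol names and assumes each of them takes exactly one argument.
fn parse_multiaddr(raw: &str) -> Result<String, String> {
    const KNOWN: [&str; 4] = ["ip4", "ip6", "tcp", "udp"];
    let rest = raw
        .strip_prefix('/')
        .ok_or_else(|| format!("missing leading '/' in {}", raw))?;
    let mut parts = rest.split('/');
    while let Some(name) = parts.next() {
        if !KNOWN.contains(&name) {
            return Err(format!("unknown protocol '{}' in {}", name, raw));
        }
        if parts.next().is_none() {
            return Err(format!("missing argument for '{}' in {}", name, raw));
        }
    }
    Ok(raw.to_owned())
}

/// Keep every address that parses; report and skip the ones that do not.
fn parse_listen_addrs(raw_addrs: Vec<&str>) -> Vec<String> {
    raw_addrs
        .into_iter()
        .filter_map(|raw| match parse_multiaddr(raw) {
            Ok(addr) => Some(addr),
            Err(e) => {
                // A real implementation would go through its logging facade.
                eprintln!("skipping unparsable multiaddr: {}", e);
                None
            }
        })
        .collect()
}
```

With this shape, a payload containing `/ip4/127.0.0.1/udp/4001/quic-v1` next to two TCP addresses still yields the two TCP addresses.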

Version

libp2p <=v0.50.0

Misc

This was discovered earlier by @elenaf9 in punchr#64 (rust-client: handle address parsing error gracefully). I missed that this had a larger impact beyond punchr.
Ideally our Testground tests would catch issues like this in the future. Given that our current Testground tests run the libp2p-ping protocol only, they have not caught it.

Would you like to work on fixing this bug?

Yes, unless someone else has capacity to do so in the next couple of days.

rkuhn · 2022-12-14T12:49:16Z

In addition to this fix — which I fully agree with — I usually add a FutureCompat variant to all enums representing protocol elements that may be extended in the future (which is basically every enum that gets (de)serialised, unless Moses literally brought the protocol with him from Mt Sinaï). In this particular case, that variant could even contain the string value naming the locally unknown protocol.

The current approach in multiaddr does not foresee such a feature. Do you think it makes sense to add a parsing facility that accepts syntactically valid but semantically unknown strings?

EDIT: adding a use-case: forwarding multiaddrs should be fine as long as they’re syntactically valid, even when we don’t (yet) understand them.
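
A minimal sketch of the FutureCompat idea, assuming a toy enum rather than the actual multiaddr `Protocol` type: the names `Transport`, `from_wire` and `to_wire` are hypothetical. Unknown values are kept as strings instead of being rejected, so they can be forwarded unchanged.

```rust
/// Toy protocol element with a catch-all variant for values that this
/// version of the code does not know yet. All names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Transport {
    Tcp,
    Quic,
    /// Carries the name of the locally unknown value.
    FutureCompat(String),
}

impl Transport {
    /// Decoding never fails on an unknown name; it is preserved instead.
    fn from_wire(name: &str) -> Self {
        match name {
            "tcp" => Transport::Tcp,
            "quic" => Transport::Quic,
            other => Transport::FutureCompat(other.to_owned()),
        }
    }

    /// Encoding writes the preserved name back out, so forwarding an
    /// unknown value round-trips without loss.
    fn to_wire(&self) -> &str {
        match self {
            Transport::Tcp => "tcp",
            Transport::Quic => "quic",
            Transport::FutureCompat(name) => name.as_str(),
        }
    }
}
```

This only works directly when the unknown element is self-delimiting, which is exactly the open question for multiaddr components that carry arguments.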

With this commit `libp2p-identify` no longer discards the whole identify payload in case a listen addr of the remote node is invalid, but instead logs the failure, skips the invalid multiaddr and parses the remaining identify payload.

> This is especially relevant when rolling out a new protocol to a live network. Say that most nodes
> of a network run on an implementation version v1. Say that the `multiaddr` implementation is not
> aware of the `webrtc/` protocol. Say that a new version (v2) is rolled out to the network with
> support for the `webrtc/` protocol, listening via `webrtc/` by default. In such case all v1 nodes
> would discard all identify payloads of v2 nodes, given that the v2 identify payloads would contain
> the `webrtc/` protocol in their `listen_addr` addresses.

See libp2p#3244 for details.

thomaseizinger · 2022-12-14T20:33:43Z

> In addition to this fix, which I fully agree with, I usually add a FutureCompat variant to all enums representing protocol elements that may be extended in the future (which is basically every enum that gets (de)serialised, unless Moses literally brought the protocol with him from Mt Sinaï). In this particular case, that variant could even contain the string value naming the locally unknown protocol.
>
> The current approach in multiaddr does not foresee such a feature, do you think it makes sense to add a parsing facility that accepts syntactically valid but semantically unknown strings?

I would have to check, but I think we don't have a generic TLV encoding here, so we don't know how long the unknown protocol is.

For example, assume tcp is unknown. TCP looks like this: /tcp/1234. Would this parse as two unknown protocols?

In any case, I think this warrants opening an issue over at multiaddr.
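
The ambiguity can be made concrete with a small sketch of the string form. It assumes a hypothetical arity table (`takes_argument`) in which `tcp` is deliberately left out, mirroring the example above; none of these names come from the multiaddr crate. Once an unknown name is hit, the parser cannot tell whether the next segment is that protocol's argument or another protocol, so it stops and hands back the undecidable remainder.

```rust
#[derive(Debug, PartialEq)]
enum Parsed {
    /// Every component was understood: (name, optional argument).
    Complete(Vec<(String, Option<String>)>),
    /// Parsing stopped at an unknown name; the segments after it cannot be
    /// grouped into components without knowing the protocol's arity.
    Ambiguous {
        understood: Vec<(String, Option<String>)>,
        unknown: String,
        rest: Vec<String>,
    },
}

/// Hypothetical arity table. `Some(true)`: one argument, `Some(false)`:
/// no argument, `None`: unknown protocol. `tcp` is omitted on purpose.
fn takes_argument(name: &str) -> Option<bool> {
    match name {
        "ip4" | "udp" => Some(true),
        "quic" => Some(false),
        _ => None,
    }
}

fn parse(addr: &str) -> Parsed {
    let mut segments = addr.split('/').filter(|s| !s.is_empty());
    let mut understood = Vec::new();
    while let Some(name) = segments.next() {
        match takes_argument(name) {
            // A missing argument is recorded as `None` to keep the sketch short.
            Some(true) => {
                let arg = segments.next().map(str::to_owned);
                understood.push((name.to_owned(), arg));
            }
            Some(false) => understood.push((name.to_owned(), None)),
            None => {
                return Parsed::Ambiguous {
                    understood,
                    unknown: name.to_owned(),
                    rest: segments.map(str::to_owned).collect(),
                }
            }
        }
    }
    Parsed::Complete(understood)
}
```

For `/ip4/1.2.3.4/tcp/1234` the result is `Ambiguous` with `unknown: "tcp"` and `rest: ["1234"]`: whether `1234` is an argument or a second unknown protocol is not decidable from the text alone.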

altonen · 2022-12-15T08:57:17Z

Is there any workaround for this, like a possibility to retroactively attempt to upgrade the connection with these possibly unsupported protocols without causing the connection to be closed?

rkuhn · 2022-12-16T19:42:20Z

> In any case, I think this warrants opening an issue over at multiaddr.

done: multiformats/rust-multiaddr#74

mxinden · 2022-12-17T20:36:07Z

#3246 is merged and libp2p-identify v0.41.1 is published and thus the libp2p v0.50 family is patched.

In case folks would like me to backport this patch to the libp2p v0.49 family, please comment here.

Also note that we face the same issue in at least one other protocol, namely libp2p-kad:

rust-libp2p/protocols/kad/src/protocol.rs

Lines 106 to 110 in 67c741e

    
    let mut addrs = Vec::with_capacity(peer.addrs.len());
    for addr in peer.addrs.into_iter() {
        let as_ma = Multiaddr::try_from(addr).map_err(invalid_data)?;
        addrs.push(as_ma);
    }

mxinden · 2022-12-19T12:31:39Z

> Is there any workaround for this, like a possibility to retroactively attempt to upgrade the connection with these possibly unsupported protocols without causing the connection to be closed?

I don't understand your comment @altonen. Would you mind rephrasing it?

altonen · 2022-12-19T13:17:52Z

The problem for us is that parachains can be several months behind the latest client release. If we wanted to roll out QUIC and WebRTC soon, a large portion of the network wouldn't be able to identify the new nodes, and that would remain a problem until the network has updated to a libp2p version that includes this fix. We're working on a workaround that temporarily splits Identify into two phases: the two new protocols are advertised only in a second Identify push. That allows all nodes to identify each other while also allowing nodes that support the new protocols to exchange the new listening endpoints.

thomaseizinger · 2022-12-20T01:46:07Z

> We're working on a workaround which would temporarily split the Identify into two phases and in which these two new protocols are advertised in a second Identify push, allowing all nodes to Identify each other while also allowing nodes supporting these new protocols to exchange these new listening endpoints.

That sounds like a clever way of doing it. Unfortunately, we didn't spot this problem earlier, so I can't think of a way of having older deployments not fail here. To support your workaround, we'd probably need a configuration flag on identify::Config to allow filtering of the sent listen addresses.
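
As a rough sketch of what such filtering could look like: the type `AddrFilterConfig`, its field and its method below are purely hypothetical and are not part of the libp2p-identify API. Addresses are plain strings here to keep the example self-contained. The idea is to split the listen addresses into those safe to send in the first Identify message and those deferred to a later push.

```rust
/// Hypothetical configuration; NOT an existing libp2p-identify type.
struct AddrFilterConfig {
    /// Protocol names that must not appear in addresses sent in the
    /// first Identify message.
    withheld_protocols: Vec<String>,
}

impl AddrFilterConfig {
    /// Split addresses into (sent in the first message, deferred to a
    /// later push), based on the protocol names each address contains.
    fn partition<'a>(&self, addrs: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
        addrs.iter().copied().partition(|addr| {
            !addr
                .split('/')
                .any(|segment| self.withheld_protocols.iter().any(|p| p.as_str() == segment))
        })
    }
}
```

With `withheld_protocols` set to `quic-v1` and `webrtc`, older peers would only ever see addresses they can parse, while the deferred set stays available for peers that understand it.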

mxinden · 2022-12-21T09:02:55Z

> that parachains can be several months behind the latest client release

In case rolling out patch releases is an option, we can backport the patch to any previous libp2p release. The patch is very self-contained and the code hasn't changed since 2018. Thus this is not a large effort.

altonen · 2022-12-21T10:34:18Z

@thomaseizinger @mxinden

Thanks for your replies. If you could backport it to some older version of libp2p, that would be great; then we wouldn't have to modify Identify temporarily for our needs. The plan was to keep the workaround inside Substrate anyway, but having the fix in an older version of libp2p would be perfect.

thomaseizinger · 2022-12-21T11:14:12Z

Which version(s) of libp2p do you need the backport for?

altonen · 2022-12-21T16:28:50Z

Could you port it all the way back to v0.43.0?

thomaseizinger · 2022-12-21T21:43:58Z

Do you mean all versions until then or just 0.43.0?

altonen · 2022-12-22T05:38:52Z

All versions, sorry for the confusion.

Edit: We don't need the backport all the way to 0.43.0, 0.45.1 works for us just fine.

jxs · 2022-12-22T11:57:34Z

> #3246 is merged and libp2p-identify v0.41.1 is published and thus the libp2p v0.50 family is patched.
>
> In case folks would like me to backport this patch to the libp2p v0.49 family, please comment here.
>
> Also note that we face the same issue in at least one other protocol, namely libp2p-kad:
>
> rust-libp2p/protocols/kad/src/protocol.rs
>
> Lines 106 to 110 in 67c741e
>
>     let mut addrs = Vec::with_capacity(peer.addrs.len());
>     for addr in peer.addrs.into_iter() {
>         let as_ma = Multiaddr::try_from(addr).map_err(invalid_data)?;
>         addrs.push(as_ma);
>     }

should we open an issue for this?

With this commit `libp2p-kad` no longer discards the whole peer payload in case an addr is invalid, but instead logs the failure, skips the invalid multiaddr and parses the remaining payload. See libp2p#3244 for details.

With this commit `libp2p-dcutr` no longer discards the whole remote payload in case an addr is unparsable, but instead logs the failure and skips the unparsable multiaddr. See libp2p#3244 for details.

mxinden · 2023-01-06T15:07:46Z

> Edit: We don't need the backport all the way to 0.43.0, 0.45.1 works for us just fine.

Note that these are 4 crates (identify, kad, dcutr, autonat) across 6 versions (0.45 - 0.50), thus 24 releases in total. Do you really need all those backports? In other words, are these outdated parachains going to upgrade version by version? Note that in case they do upgrade version by version, and do so across all nodes in a parachain, only the version before rolling out a new transport needs this set of patches.

In case you still need all those backports, would you mind creating appropriate backport pull requests like #3246 targeting the release branches?

With this commit `libp2p-autonat` no longer discards the whole remote payload in case an addr is unparsable, but instead logs the failure and skips the unparsable multiaddr. See libp2p#3244 for details.

jxs · 2023-01-30T11:59:10Z

this one is done for both identify and autonat right?

altonen · 2023-02-09T11:26:27Z

Hi @mxinden, sorry for the late reply.

We had a discussion internally and decided that it is easiest for us to update libp2p, release a node, wait for a few releases and only then release QUIC/WebRTC. In other words, we don't need backports. Sorry for the trouble.

mxinden · 2023-02-09T11:49:35Z

> this one is done for both identify and autonat right?

Right. This is now done for identify, autonat, kad and dcutr. Closing here. Thanks for the reminder @jxs.

mxinden · 2023-02-09T11:50:45Z

> wait for a few releases and only then release QUIC/WebRTC

Note @altonen that one release is enough, e.g. updating to v0.51.0 and then rolling out QUIC and WebRTC works.

altonen · 2023-02-09T12:17:23Z

Yeah, that is our current plan: updating to v0.51.0 and waiting for a couple of releases.

mxinden added the labels bug, priority:important (the changes needed are critical for libp2p, or are blocking another project) and difficulty:easy on Dec 14, 2022

mxinden mentioned this issue Dec 14, 2022: fix(identify): Skip invalid multiaddr in listen_addrs #3246 (Merged)

p-shahi mentioned this issue Dec 14, 2022: Add identify protocol to test-plans libp2p/test-plans#91 (Open)

rkuhn mentioned this issue Dec 16, 2022: avoid breaking network compatibility multiformats/rust-multiaddr#74 (Open)

mxinden mentioned this issue Dec 23, 2022: fix(identify): Don't fail on unknown multiaddr protocol #3279 (Merged)

mxinden mentioned this issue Dec 23, 2022: fix(kad): Skip invalid multiaddr #3280 (Merged)

mxinden mentioned this issue Dec 26, 2022: fix(kad): Skip invalid multiaddr #3284 (Merged)

mxinden mentioned this issue Jan 2, 2023: fix(dcutr): Skip unparsable multiaddr (#3280) #3300 (Merged)

mxinden mentioned this issue Jan 11, 2023: fix(dcutr): Skip unparsable multiaddr #3320 (Merged)

mxinden mentioned this issue Jan 19, 2023: fix(autonat): Skip unparsable multiaddr #3351 (Merged)

jxs mentioned this issue Jan 20, 2023: fix(autonat): Skip unparsable multiaddr #3363 (Merged)

mxinden closed this as completed Feb 19, 2023

altonen mentioned this issue Mar 14, 2023: feat: add WebRTC transport paritytech/substrate#12529 (Open)
