Subscription & Monitoring breaks if Ethernet is unplugged, monitoring layer issue? #1668

Open
chengguizi opened this issue Jul 20, 2024 · 4 comments

@chengguizi

Problem Description

We have been using eCAL on an embedded system. A typical operating procedure is:

  • Host laptop connects to the embedded device using an Ethernet cable
  • Host starts an eCAL pub node and a sub node, both running on the device
  • Host disconnects the Ethernet cable, leaving the device to run headless

However, this does not work as expected. I have identified the key contributing factors.

  1. The stream to the sub node breaks after N seconds, where N is specified by:

[monitoring]
timeout                   = 5000

That is to say, if I increase the timeout to 50000, the sub node breaks after about 50 seconds instead of 5.

  2. To investigate the monitoring layer further, I tried changing the following setting:

shm_monitoring_enabled      = true

With this setting, the issue seems to go away!

What is happening here? Is the monitoring layer failing to fall back to multicast on lo at runtime?
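The dependence on the timeout value is what one would expect if entries are dropped once no registration refresh has arrived within the timeout window. The following is only a toy model of that idea, not eCAL's implementation; the class `RegistrationTable`, its methods, and the injectable clock are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, List


class RegistrationTable:
    """Toy model of timeout-based expiry (hypothetical, not eCAL code)."""

    def __init__(self, timeout_ms: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_ms / 1000.0
        self.clock = clock  # injectable so the behaviour can be simulated
        self.last_seen: Dict[str, float] = {}

    def on_registration(self, topic: str) -> None:
        # Record the time of the latest registration refresh for this topic.
        self.last_seen[topic] = self.clock()

    def alive_topics(self) -> List[str]:
        # A topic counts as alive only if it was refreshed within the timeout.
        now = self.clock()
        return sorted(t for t, seen in self.last_seen.items()
                      if now - seen <= self.timeout_s)


# Simulate "cable unplugged": refreshes stop arriving at t = 0.
fake_now = [0.0]
table = RegistrationTable(5000, clock=lambda: fake_now[0])
table.on_registration("person")

fake_now[0] = 4.0
print(table.alive_topics())  # still inside the 5000 ms window

fake_now[0] = 6.0
print(table.alive_topics())  # window exceeded, the entry has expired
```

In this model, raising the timeout from 5000 to 50000 only delays the expiry; it does not restore the missing refreshes, which matches the observation above.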

Routing table on the device. It is set up so that multicast should fall back to lo if br0 (the Ethernet interface) is not present:

default via 10.42.0.1 dev br0 proto dhcp src 10.42.0.64 metric 1024
...
239.0.0.0/24 dev br0 proto static scope link 
239.0.0.0/24 dev lo proto static scope link metric 1000 
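The intended fallback can be sketched as a small route-selection model: longest prefix first, then lowest metric, skipping interfaces considered unavailable. This is an assumption-laden illustration, not the kernel's algorithm; `parse_route`, `select_dev`, and the `down_devs` parameter are hypothetical helpers. Whether the kernel actually skips the br0 route when the cable is unplugged, and whether an already-open socket re-evaluates its route, is not established in this report.

```python
import ipaddress
from typing import Iterable, NamedTuple, Optional


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    dev: str
    metric: int


def parse_route(line: str) -> Optional[Route]:
    """Parse one `ip route` style line; return None if it is not a route."""
    tokens = line.split()
    if not tokens or "dev" not in tokens:
        return None
    dest = "0.0.0.0/0" if tokens[0] == "default" else tokens[0]
    try:
        network = ipaddress.ip_network(dest, strict=False)
    except ValueError:
        return None
    dev = tokens[tokens.index("dev") + 1]
    # Routes without an explicit metric are treated as metric 0.
    metric = int(tokens[tokens.index("metric") + 1]) if "metric" in tokens else 0
    return Route(network, dev, metric)


def select_dev(lines: Iterable[str], group: str,
               down_devs: Iterable[str] = ()) -> Optional[str]:
    """Pick the device for `group`: longest prefix, then lowest metric."""
    addr = ipaddress.ip_address(group)
    skipped = set(down_devs)
    candidates = [r for r in map(parse_route, lines)
                  if r and addr in r.network and r.dev not in skipped]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.network.prefixlen, r.metric)).dev


ROUTES = [
    "default via 10.42.0.1 dev br0 proto dhcp src 10.42.0.64 metric 1024",
    "...",
    "239.0.0.0/24 dev br0 proto static scope link",
    "239.0.0.0/24 dev lo proto static scope link metric 1000",
]

print(select_dev(ROUTES, "239.0.0.1"))                     # br0 wins on metric
print(select_dev(ROUTES, "239.0.0.1", down_devs=["br0"]))  # lo is the fallback
```

The model shows what the table is meant to achieve. Comparing it against the output of `ip route get 239.0.0.1` on the device, before and after unplugging, would show whether the kernel really makes the same choice.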

How to reproduce

On the embedded device, over ssh through the Ethernet connection:

screen -R test
$ ecal_sample_person_snd

Ctrl+A D (to detach)

In the console (UART) connection:

ecal_mon_tui

## click / Enter on the topic to inspect the actual incoming stream of data

Observe that everything works correctly. Now unplug the Ethernet cable. After 5 seconds or so, ecal_mon_tui goes blank. Then replug the Ethernet cable, and things come back.

Another strange thing: if we run ecal_sample_person_rec, the issue does not seem to occur.

How did you get eCAL?

Custom Build / Built from source

Environment

Debian 12, arm64

eCAL System Information

$ ecal_config
------------------------- SYSTEM ---------------------------------
Version                  : v5.11.8 (2024-02-07 16:34:55 +0100)
Platform                 : linux

------------------------- CONFIGURATION --------------------------
Default INI              : /etc/ecal/ecal.ini

------------------------- NETWORK --------------------------------
Host name                : huimin-Vostro-5320
Network mode             : cloud
Network ttl              : 2
Network sndbuf           : 5 MByte
Network rcvbuf           : 5 MByte
Multicast group          : 239.0.0.1
Multicast mask           : 0.0.0.15
Multicast ports          : 14000 - 14010
Multicast join all IFs   : off
Bandwidth limit (udp)    : not limited

------------------------- TIME -----------------------------------
Synchronization realtime : "ecaltime-localtime"
Synchronization replay   : 
State                    :  synchronized 
Master / Slave           :  Master 
Status (Code)            : "everything is fine." (0)

------------------------- PUBLISHER LAYER DEFAULTS ---------------
Layer Mode INPROC        : auto
Layer Mode SHM (ZEROCPY) : auto
Layer Mode TCP           : off
Layer Mode UDP MC        : auto

------------------------- SUBSCRIPTION LAYER DEFAULTS ------------
Layer Mode INPROC        : on
Layer Mode SHM           : on
Layer Mode TCP           : on
Layer Mode UDP MC        : on
Npcap UDP Reciever       : off
@chengguizi
Author

It is also reproducible using ecal_mon_cli -l, when the command is started with Ethernet connected.

@KerstinKeller
Contributor

Hi @chengguizi we'll look into this problem.
In general, there is one socket for sending and one for receiving data for udp monitoring/registration traffic.
I am unsure how that socket behaves when the cable is unplugged, and how we handle this when network mode is enabled.

If you're working in local mode, there are no problems (at least on Windows), but as said, we need to investigate for Linux devices.

If you're enabling shm monitoring, you're sending monitoring info on both shm and udp. Shm is unaffected by network settings, and continues to function. However, with this mode you see only processes on the same host.

@chengguizi
Author

Hi @KerstinKeller, yes, I agree there are no issues on Linux in local mode. However, our use case requires cloud mode, and it is unwanted behaviour that such breaks happen when the Ethernet cable is unplugged.

@chengguizi
Author

@KerstinKeller Any updates on this? It should be reproducible on any embedded SBC, e.g. a Raspberry Pi with Ethernet.
