🐛 BUG: Extremely delayed traffic #1425

@Cyberes

What version of nebula are you using? (nebula -version)

1.9.5

What operating system are you using?

Linux

Describe the Bug

One of my hosts is extremely slow to respond over Nebula. Restarting Nebula does not fix the issue. I have not tried rebooting the host.

Ping from lighthouse:

:~$ ping h.dump.nb
PING h.dump.nb (172.0.4.2) 56(84) bytes of data.
64 bytes from backups.nb (172.0.4.2): icmp_seq=1 ttl=64 time=10829 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=2 ttl=64 time=13397 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=3 ttl=64 time=15862 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=4 ttl=64 time=18300 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=5 ttl=64 time=21158 ms
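
The round-trip times above are not only high, they also grow with every reply. A minimal Python sketch to quantify that from the pasted output, assuming the standard Linux `ping` line format shown here; the helper names `parse_rtts` and `rtt_deltas` are made up for illustration and are not part of Nebula:

```python
import re

# Ping replies copied from the lighthouse session above.
PING_OUTPUT = """\
64 bytes from backups.nb (172.0.4.2): icmp_seq=1 ttl=64 time=10829 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=2 ttl=64 time=13397 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=3 ttl=64 time=15862 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=4 ttl=64 time=18300 ms
64 bytes from backups.nb (172.0.4.2): icmp_seq=5 ttl=64 time=21158 ms
"""


def parse_rtts(text):
    """Return the round-trip times in milliseconds, in the order they appear."""
    return [float(value) for value in re.findall(r"time=([\d.]+) ms", text)]


def rtt_deltas(rtts):
    """Return the increase in RTT between consecutive replies."""
    return [later - earlier for earlier, later in zip(rtts, rtts[1:])]


rtts = parse_rtts(PING_OUTPUT)
deltas = rtt_deltas(rtts)
mean_delta = sum(deltas) / len(deltas)

print("RTTs (ms):", rtts)
print("Increase per reply (ms):", deltas)
print("Mean increase (ms):", mean_delta)
```

With `ping` sending one probe per second by default, an RTT that climbs by roughly 2.6 seconds per probe would be consistent with packets backing up in a queue somewhere on the path. The numbers alone do not say where.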

Pinging its LAN IP, by contrast, is fine:

:~$ ping 10.0.0.192
PING 10.0.0.192 (10.0.0.192) 56(84) bytes of data.
64 bytes from 10.0.0.192: icmp_seq=1 ttl=63 time=49.5 ms
64 bytes from 10.0.0.192: icmp_seq=2 ttl=63 time=44.8 ms
64 bytes from 10.0.0.192: icmp_seq=3 ttl=63 time=62.8 ms
64 bytes from 10.0.0.192: icmp_seq=4 ttl=63 time=46.2 ms
64 bytes from 10.0.0.192: icmp_seq=5 ttl=63 time=36.9 ms

Pinging from the host to its own Nebula IP is also fine:

root@dump:~# ping 172.0.4.2
PING 172.0.4.2 (172.0.4.2) 56(84) bytes of data.
64 bytes from 172.0.4.2: icmp_seq=1 ttl=64 time=0.131 ms
64 bytes from 172.0.4.2: icmp_seq=2 ttl=64 time=0.116 ms
64 bytes from 172.0.4.2: icmp_seq=3 ttl=64 time=0.088 ms
64 bytes from 172.0.4.2: icmp_seq=4 ttl=64 time=0.140 ms
64 bytes from 172.0.4.2: icmp_seq=5 ttl=64 time=0.134 ms
64 bytes from 172.0.4.2: icmp_seq=6 ttl=64 time=0.127 ms

This host is under heavy load (it's a Proxmox Backup Server):

[Image: screenshot of the host's load, not reproduced here]

I took a pprof: https://files.catbox.moe/2r59zo.pprof

I can only test with ping from the lighthouse, because tunnels between this host and other hosts seem to fail to open.

Logs from affected hosts

dump (Proxmox Backup Server), the affected host: dump.txt

My desktop while trying to ping the host from the same LAN it's on: desktop.txt

Config files from affected hosts

From dump, the affected host:

firewall:
  conntrack:
    default_timeout: 10m
    max_connections: 100000
    tcp_timeout: 12m
    udp_timeout: 3m
  inbound:
  - host: any
    port: any
    proto: icmp
  - groups: XXX
    port: any
    proto: any
  outbound:
  - host: any
    port: any
    proto: any
lighthouse:
  am_lighthouse: false
  hosts:
  - 172.0.0.2
  - 172.0.0.3
  interval: 60
listen:
  batch: 128
  host: 0.0.0.0
  port: 0
  read_buffer: 10485760
  write_buffer: 10485760
logging:
  format: text
  level: info
pki:
  ca: |
    -----BEGIN NEBULA CERTIFICATE-----
    xxx
    -----END NEBULA CERTIFICATE-----
  cert: |
    -----BEGIN NEBULA CERTIFICATE-----
    xxx
    -----END NEBULA CERTIFICATE-----
  key: |
    -----BEGIN NEBULA X25519 PRIVATE KEY-----
    xxx
    -----END NEBULA X25519 PRIVATE KEY-----
punchy:
  delay: 1s
  punch: true
  punch_back: true
  respond: true
relay:
  am_relay: false
  relays:
  - 172.0.0.2
  - 172.0.0.3
  use_relays: true
static_host_map:
  172.0.0.2:
  - XXX.XXX.XXX.XXX:4242
  172.0.0.3:
  - XXX.XXX.XXX.XXX:4242
tun:
  dev: nebula1
  disabled: false
  drop_local_broadcast: false
  drop_multicast: false
  routes: null
  tx_queue: 5000
  unsafe_routes: null

sshd:
  enabled:          true
  listen:           0.0.0.0:2222
  host_key:         /etc/nebula/ssh_host_ed25519_key
  authorized_users:
    - user:         XXX
      keys:
        - "XXX"

From my desktop:

firewall:
  conntrack:
    default_timeout: 10m
    max_connections: 100000
    tcp_timeout: 12m
    udp_timeout: 3m
  inbound:
  - host: any
    port: any
    proto: icmp
  - groups: XXX
    port: any
    proto: any
  outbound:
  - host: any
    port: any
    proto: any
lighthouse:
  am_lighthouse: false
  hosts:
  - 172.0.0.2
  - 172.0.0.3
  interval: 60
listen:
  batch: 128
  host: 0.0.0.0
  port: 0
  read_buffer: 10485760
  write_buffer: 10485760
logging:
  format: text
  level: info
pki:
  ca: |
    -----BEGIN NEBULA CERTIFICATE-----
    XXX
    -----END NEBULA CERTIFICATE-----
  cert: |
    -----BEGIN NEBULA CERTIFICATE-----
    XXX
    -----END NEBULA CERTIFICATE-----
  key: |
    -----BEGIN NEBULA X25519 PRIVATE KEY-----
    XXX
    -----END NEBULA X25519 PRIVATE KEY-----
punchy:
  delay: 1s
  punch: true
  punch_back: true
  respond: true
relay:
  am_relay: false
  relays:
  - 172.0.0.2
  - 172.0.0.3
  use_relays: true
static_host_map:
  172.0.0.2:
  - XXX.XXX.XXX.XXX:4242
  172.0.0.3:
  - XXX.XXX.XXX.XXX:4242
tun:
  dev: nebula1
  disabled: false
  drop_local_broadcast: false
  drop_multicast: false
  routes: null
  tx_queue: 5000
  unsafe_routes: null
