HAProxy crashing on start, thread 2 is about to kill the process. #224

ahuston-0 · 2023-12-11T16:20:36Z

Not sure if I should file this here or with the upstream, but my haproxy setup was working fine for months until a few days ago. I'm not sure what changed, but now it crashes with "Thread 2 is about to kill the process" and then no traffic goes through.

Environment Details
OS: NixOS 23.11
Docker 24.0.5
Storage is OverlayFS2 on ZFS 2.2.2

HAProxy config

global
  # log stdout format raw local0 info
  log stdout format raw local0
  crt-base /etc/ssl/certs/

defaults
  log global
  mode http
  timeout client 2000m
  timeout connect 200s
  timeout server 2000m
  timeout http-request 2000m

#Application Setup
frontend ContentSwitching
  bind *:80
  # bind *:443 ssl crt /etc/ssl/certs/cloudflare.pem
  bind *:443 ssl crt /etc/ssl/certs/origin_ca_ecc_root_new.pem
  mode http
  option httplog

  # Front-end access control list
  acl host_glances hdr(host) -i monit.mydomain.xyz
  acl host_glances hdr(host) -i glances.mydomain.xyz
  # Backend-forwarding
  use_backend glances_nodes if host_glances

backend glances_nodes
  mode http
  server server glances:61208

Docker Compose file

services:
  haproxy:
    privileged: false
    restart: always
    image:latest
    volumes:
      - ./haproxy:/usr/local/etc/haproxy:ro
      - ./certs:/etc/ssl/certs:ro
    ports:
      - 9080:80
      - 9443:443
      - 8404:8404
      - 25565:25565
    environment:
      - PUID=997
      - PGID=100
    networks:
      - web
      - default
networks:
  web:
    name: haproxy-net

Crash logs

haproxy-1 | [NOTICE] (1) : New worker (8) forked
haproxy-1 | [NOTICE] (1) : Loading success.
haproxy-1 | Thread 2 is about to kill the process.
haproxy-1 | Thread 1 : id=0x7fc626c57100 act=1 glob=0 wq=0 rq=1 tl=0 tlsz=0 rqsz=1
haproxy-1 | 1/1 stuck=0 prof=0 harmless=0 isolated=0
haproxy-1 | cpu_ns: poll=0 now=6035627479 diff=6035627479
haproxy-1 | curr_task=0
haproxy-1 | *>Thread 2 : id=0x7fc626c4b700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
haproxy-1 | 1/2 stuck=1 prof=0 harmless=0 isolated=0
haproxy-1 | cpu_ns: poll=97660 now=2001614429 diff=2001516769
haproxy-1 | curr_task=0
haproxy-1 | call trace(10):
haproxy-1 | | 0x564a2f23d661 [eb ba 66 66 2e 0f 1f 84]: ha_thread_dump+0x91/0x93
haproxy-1 | | 0x564a2f23d8a1 [64 48 8b 53 10 64 48 8b]: ha_panic+0x111/0x3f4
haproxy-1 | | 0x7fc626f86140 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13140
haproxy-1 | | 0x564a2f284282 [48 83 7a 28 00 74 36 48]: fd_reregister_all+0x32/0xb1
haproxy-1 | | 0x564a2f0b5709 [b8 01 00 00 00 5b 5d 41]: main+0x4ec9
haproxy-1 | | 0x564a2f21345e [85 c0 0f 84 5a 03 00 00]: main+0x162c1e
haproxy-1 | | 0x7fc626f7aea7 [64 48 89 04 25 30 06 00]: libpthread:+0x7ea7
haproxy-1 | | 0x7fc626e9aa2f [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

Darlelet · 2023-12-14T16:01:17Z

If you still encounter the issue, please file it with HAProxy directly (https://github.com/haproxy/haproxy/issues).

"Thread X is about to kill the process" means the HAProxy watchdog noticed that a thread had become unresponsive and aborted the process to prevent further issues.

Since you're deploying with the :latest tag, it's very likely that you are hitting a bug or limitation that only occurs on HAProxy 2.9 (which was just released, see 81e9df2) and didn't show up with the previous version (2.8).

ahuston-0 · 2023-12-14T17:58:22Z

For reference, I've managed to get a copy of my haproxy config running with the NixOS haproxy service, so the issue is isolated to Docker.

ahuston-0 · 2023-12-14T18:09:15Z

> If you still encounter the issue, please file it with HAProxy directly (https://github.com/haproxy/haproxy/issues).
>
> "Thread X is about to kill the process" means the HAProxy watchdog noticed that a thread had become unresponsive and aborted the process to prevent further issues.
>
> Since you're deploying with the :latest tag, it's very likely that you are hitting a bug or limitation that only occurs on HAProxy 2.9 (which was just released, see 81e9df2) and didn't show up with the previous version (2.8).

I think I've seen the same on the HAProxy LTS release, but let me file a bug with upstream. Thanks.

Darlelet · 2024-01-13T14:17:03Z

Any news about this issue? Do you still encounter the crash with the :latest or :2.9.2 tag, where some bugs related to high CPU usage were addressed?

Thanks

ahuston-0 · 2024-01-14T02:34:39Z

Hi, I've upgraded to the latest version and will get back in a few hours. I did discover, though, that the original issue causing these logs is some memory/FD bug. Without a nofile limit on the container, it was consuming around 200GB of memory and then crashing (this is on a server with ~256GB of memory available).

ahuston-0 · 2024-01-14T02:35:07Z

This is what I added

    ulimits:
      nofile:
        soft: 1024
        hard: 4096

Darlelet · 2024-01-15T08:16:11Z

Does it instantly ramp up to 200GB of used memory, or is it slowly getting there, which would suggest a leak somewhere?

In the first case, maybe this is not related to haproxy itself but to a Docker engine update (which removed or increased an existing limit), see haproxy/haproxy#2043. If that's the case, you could also mitigate it using the maxconn or fd-hard-limit global parameters in the haproxy config file.
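
As a rough sketch of the fd-hard-limit variant, the global section from the config at the top of this issue could be extended as below. The value 50000 is an arbitrary placeholder chosen for illustration, not a recommendation; it should be sized to the number of connections you actually expect.

```
global
  log stdout format raw local0
  crt-base /etc/ssl/certs/
  # Illustrative value only: upper bound on the file descriptors HAProxy
  # will use, regardless of the nofile limit inherited from the container.
  fd-hard-limit 50000
```

Unlike a ulimits entry in the Compose file, this keeps the cap inside haproxy.cfg, so it travels with the config if the container runtime's defaults change again.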

ahuston-0 · 2024-01-15T18:21:27Z

> Does it instantly ramp up to 200GB of used memory, or is it slowly getting there, which would suggest a leak somewhere?
>
> In the first case, maybe this is not related to haproxy itself but to a Docker engine update (which removed or increased an existing limit), see haproxy/haproxy#2043. If that's the case, you could also mitigate it using the maxconn or fd-hard-limit global parameters in the haproxy config file.

It was happening over the span of like a minute or two. I can probably time it and get back. I'll check out those two settings and see if they help.

Regarding the CPU utilization issue, I did the upgrade last night and 12 hours in we're at 0.6% CPU so I think that one might have worked.

ahuston-0 · 2024-01-16T00:38:44Z

> Does it instantly ramp up to 200GB of used memory, or is it slowly getting there, which would suggest a leak somewhere?
>
> In the first case, maybe this is not related to haproxy itself but to a Docker engine update (which removed or increased an existing limit), see haproxy/haproxy#2043. If that's the case, you could also mitigate it using the maxconn or fd-hard-limit global parameters in the haproxy config file.

It looks like this ended up working. I removed the ulimits settings in docker-compose.yml and replaced them with maxconn 60000 in haproxy.cfg, and now there's no more high memory utilization or crashing. As for the other issue, upgrading to 2.9.2 seems to have fixed the high CPU utilization.
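
A minimal sketch of that change, assuming maxconn sits in the global section as suggested earlier in the thread; the other lines are carried over from the config at the top of the issue. When maxconn is not set, HAProxy calculates it from the file-descriptor limit it inherits, which is why an explicit value keeps memory use bounded even if the container's nofile limit is very large.

```
global
  log stdout format raw local0
  crt-base /etc/ssl/certs/
  # Explicit cap on concurrent connections, so HAProxy no longer sizes
  # itself from the container's nofile limit.
  maxconn 60000
```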

Darlelet · 2024-01-16T07:39:10Z

Great news, thanks for sharing your positive results with us.

LaurentGoderre · 2024-04-23T17:03:20Z

Can we close this as solved?
