docker compose setup to reproduce Kong/kong#8531
We create a dummy plugin in each available PDK language:
Rust PDK is incomplete:
Each plugin injects a specific header, e.g. x-goplugin
and forwards the request to a simple reflection service, which the loadtest tests against to check if the plugin was intercepting the request.
We use Kong version 2.7.1
.
docker compose up > kong.log
(cd loadtest && cargo run --release -- --host http://localhost --no-reset-metrics --no-task-metrics -u 8 -t 20) > metrics.log
The loadtest accepts LOADTEST_TASKS
environment var to select specific plugin tests:
LOADTEST_TASKS=lua,python,js,go cargo run --release -- --host http://localhost
docker compose exec kong kong reload
Kong serves under reasonable load - see goose loadtest.
In every case we were able to DOS kong plugins by reload.
All 8 users hatched.
22:18:33 [WARN] "/js": error sending request for url (http://localhost/js): connection closed before message completed
22:18:33 [WARN] "/js": error sending request for url (http://localhost/js): connection closed before message completed
22:18:33 [WARN] "/go": error sending request for url (http://localhost/go): connection closed before message completed
22:18:33 [WARN] "/go": error sending request for url (http://localhost/go): connection closed before message completed
22:18:33 [WARN] "/go": error sending request for url (http://localhost/go): connection closed before message completed
22:18:54 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:54 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:55 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:55 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:55 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:55 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:56 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
22:18:56 [WARN] "/js": error sending request for url (http://localhost/js): operation timed out
=== PER REQUEST METRICS ===
------------------------------------------------------------------------------
Name | # reqs | # fails | req/s | fail/s
------------------------------------------------------------------------------
GET /go | 14,312 | 282 (2.0%) | 715.60 | 14.10
GET /js | 14,320 | 618 (4.3%) | 716.00 | 30.90
GET /python | 14,320 | 2 (0%) | 716.00 | 0.10
-------------------------+---------------+----------------+----------+--------
Aggregated | 42,952 | 902 (2.1%) | 2148 | 45.10
------------------------------------------------------------------------------
Name | Avg (ms) | Min | Max | Median
------------------------------------------------------------------------------
GET /go | 0.93 | 1 | 202 | 1
GET /js | 13.70 | 1 | 20,001 | 2
GET /python | 1.93 | 1 | 2,061 | 1
-------------------------+-------------+------------+-------------+-----------
Aggregated | 5.52 | 1 | 20,001 | 1
------------------------------------------------------------------------------
Slowest page load within specified percentile of requests (in ms):
------------------------------------------------------------------------------
Name | 50% | 75% | 98% | 99% | 99.9% | 99.99%
------------------------------------------------------------------------------
GET /go | 1 | 1 | 3 | 4 | 11 | 200
GET /js | 2 | 3 | 6 | 8 | 54 | 20,000
GET /python | 1 | 2 | 4 | 5 | 19 | 430
-------------------------+--------+--------+--------+--------+--------+-------
Aggregated | 1 | 2 | 5 | 6 | 25 | 20,000
=== ERRORS ===
------------------------------------------------------------------------------
Count | Error
------------------------------------------------------------------------------
608 GET /js: plugin header missing
279 GET /go: plugin header missing
8 GET /js: error sending request /js: operation timed out
3 GET /go: error sending request /go: connection closed before message completed
2 GET /python: plugin header missing
2 GET /js: error sending request /js: connection closed before message completed
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Users: 8
Target host: http://localhost/
Starting: 2022-04-27 00:18:19 - 2022-04-27 00:18:26 (duration: 00:00:07)
Running: 2022-04-27 00:18:26 - 2022-04-27 00:18:46 (duration: 00:00:20)
Stopping: 2022-04-27 00:18:46 - 2022-04-27 00:18:56 (duration: 00:00:10)
goose v0.15.2
------------------------------------------------------------------------------
error log contains further details.
- is not graceful due to closed connections
- plugin is nearly stable and seems to "repair" its corruption.
- is not graceful - there are closed connections
- plugin is skipped/omitted, but reflection service reached (missing header)
- reaches a state, where a certain rate of all consecutive requests is corrupt.
- is not graceful - closed connections and also timeouts
- plugin is skipped/omitted, but reflection service reached (missing header)
- reaches a state, where all consecutive requests are corrupt.
- is not graceful due to close connections
- we were unable to trigger any corruption
Sometimes a nginx worker turns crazy by consuming 100% CPU - strace
shows:
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
epoll_pwait(50, [], 512, 0, NULL, 8) = 0
- All plugin types resp. languages suffer from closed connections when reloading kong.
- Lua and Python plugins seem to stay sane.
- Go and JavaScript plugins end up in a currupted state, where the only way to fix this is to reload/restart without load (yes also cold restarts under load have their pitfalls).
- JavaScript plugins seem to fail in 100 % of cases after curruption takes place.
Since Python and JavaScript use mp_rpc.lua
and their behavior is dramatically
different, we conclude, that this issue is driven by the specific
implementation of the plugin server.