From b57ac01e5671054cbf7c34c8601485ec4ae66645 Mon Sep 17 00:00:00 2001
From: mateuszpierzchala-splunk <106949453+mateuszpierzchala-splunk@users.noreply.github.com>
Date: Wed, 7 Sep 2022 13:33:00 +0200
Subject: [PATCH] docs: add tcp vs udp part (#1801)

* docs: add tcp vs udp part

* Update architecture.md

* fix: hardcoding poetry version (#1802)

* chore: adding step name (#1803)

Co-authored-by: Lukasz Loboda <76950960+uoboda-splunk@users.noreply.github.com>
---
 .github/workflows/ci-main.yaml |  3 ++-
 docs/architecture.md           | 22 +++++++++++++++++++---
 package/Dockerfile             |  4 ++--
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/.github/workflows/ci-main.yaml b/.github/workflows/ci-main.yaml
index b6ad4610a7..6b540058d9 100644
--- a/.github/workflows/ci-main.yaml
+++ b/.github/workflows/ci-main.yaml
@@ -182,7 +182,8 @@ jobs:
         with:
           submodules: false
           persist-credentials: false
-      - run: |
+      - name: Run tests
+        run: |
           pip3 install poetry
           poetry install
           mkdir -p test-results || true
diff --git a/docs/architecture.md b/docs/architecture.md
index caaeaf982e..86c1043687 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -41,6 +41,22 @@ _will_ be data loss (think CD-quality (lossless) vs. MP3). Syslog data collecti
 
 ## UDP vs. TCP
 
-Paradoxically, UDP for syslog actually ends up being a better choice for resiliency for syslog. For an excellent discussion on this topic
-(as well as the "myth" of load balancers for HA),
-see [Performant AND Reliable Syslog: UDP is best](https://www.rfaircloth.com/2020/05/21/performant-and-reliable-syslog-udp-is-best/).
+For running syslog UDP is recommended over TCP.
+
+The syslogd daemon was originally configured to use UDP for log forwarding to reduce overhead.
+While UDP is an unreliable protocol, it's streaming method does not require the overhead of establishing a network session.
+This protocol also reduces network load as the network stream with no required receipt verification or window adjustment.
+While TCP could seem a better choice because it uses ACKS and there should not be data loss, there are some cases when it's possible:
+* The TCP session is closed events published while the system is creating a new session will be lost. (Closed Window Case)
+* The remote side is busy and can not ack fast enough events are lost due to local buffer full
+* A single ack is lost by the network and the client closes the connection. (local and remote buffer lost)
+* The remote server restarts for any reason (local buffer lost)
+* The remote server restarts without closing the connection (local buffer plus timeout time lost)
+* The client side restarts without closing the connection
+
+Additionally as stated before it causes more overhead on the network.
+TCP should be used in case of the syslog event is larger than the maximum size of the UDP packet on your network typically limited to Web Proxy, DLP and IDs type sources.
+To decrease drawbacks of TCP you can use TLS over TCP:
+* The TLS can continue a session over a broken TCP reducing buffer loss conditions
+* The TLS will fill packets for more efficient use of wire
+* The TLS will compress in most cases
diff --git a/package/Dockerfile b/package/Dockerfile
index 9349207a2e..7c171c1726 100644
--- a/package/Dockerfile
+++ b/package/Dockerfile
@@ -54,7 +54,7 @@ COPY package/etc/goss.yaml /etc/syslog-ng/goss.yaml
 
 COPY pyproject.toml /
 COPY poetry.lock /
-RUN pip3 install poetry
+RUN pip3 install poetry==1.1.15
 RUN poetry export --format requirements.txt | pip3 install --user -r /dev/stdin
 
 COPY package/etc/syslog-ng.conf /etc/syslog-ng/syslog-ng.conf
@@ -71,4 +71,4 @@ ENV SC4S_CONTAINER_OPTS=--no-caps
 ARG VERSION=unknown
 RUN echo $VERSION>/etc/syslog-ng/VERSION
 
-ENTRYPOINT ["/entrypoint.sh"]
\ No newline at end of file
+ENTRYPOINT ["/entrypoint.sh"]