Dynamically Return 204 in Nginx Ingress When Backend Returns 500, and Resume Normal Behavior When Backend Recovers #12004

Open
umlumpa opened this issue Sep 21, 2024 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@umlumpa

umlumpa commented Sep 21, 2024

Problem
We are facing an issue where our backend service, behind an Nginx Ingress Controller, occasionally starts returning HTTP 500 errors. When this happens, we notice that our Nginx Ingress Controller (configured with hostNetwork: true) consumes an excessive amount of CPU and memory, often reaching 100%.

To mitigate the load during these failure scenarios, we want Nginx Ingress Controller to automatically return an HTTP 204 status code whenever the backend starts returning 500 errors. The goal is to avoid sending traffic to the backend when it's in a failing state. Once the backend recovers and starts returning HTTP 200, we want Nginx to stop returning 204 and resume forwarding traffic to the backend normally.

Proposed Solution
We are looking for a way to implement dynamic behavior in the Nginx Ingress Controller:

When the backend starts returning 500: Nginx should immediately start responding with 204 for all incoming requests to a specific path (e.g., /bid) without forwarding these requests to the backend.
When the backend recovers and returns 200: Nginx should remove the 204 rule and resume forwarding traffic to the backend normally.
We looked for a solution among the existing Nginx Ingress annotations but could not find a dynamic mechanism to implement this. One possible approach would be Lua scripting in Nginx to track backend responses and adjust the behavior accordingly, but that cannot be done with the existing Ingress annotations. Is there an existing solution?
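
The two-state behavior described above can be sketched as a small gate that sits in front of the backend. This is only an illustration of the requested logic in Python, not an ingress-nginx feature or API: the names `FailoverGate`, `handle`, and `probe_every` are hypothetical. It also assumes that any 5xx counts as failing, and that one request in every `probe_every` is still forwarded while the gate is tripped, because if nothing reaches the backend there is no way to observe that it has recovered.

```python
from dataclasses import dataclass


@dataclass
class FailoverGate:
    """Hypothetical gate: answer 204 locally while the backend is failing."""

    probe_every: int = 10   # assumption: while tripped, forward 1 request in N
    tripped: bool = False   # True once the backend has returned a 5xx
    skipped: int = 0        # requests answered locally since the last probe

    def should_forward(self) -> bool:
        """Decide whether this request may reach the backend."""
        if not self.tripped:
            return True
        self.skipped += 1
        if self.skipped >= self.probe_every:
            self.skipped = 0
            return True     # let one request through as a recovery probe
        return False

    def record(self, backend_status: int) -> None:
        """Update the gate from the status the backend just returned."""
        if backend_status >= 500:      # assumption: any 5xx counts as failing
            self.tripped = True
            self.skipped = 0
        elif backend_status == 200:    # recovery signal named in the issue
            self.tripped = False
            self.skipped = 0


def handle(gate: FailoverGate, call_backend) -> int:
    """Return the status the client would see for one request."""
    if not gate.should_forward():
        return 204                     # short-circuit without touching the backend
    status = call_backend()
    gate.record(status)
    return 204 if status >= 500 else status
```

In a real deployment this state would have to be shared across Nginx worker processes and controller replicas, which is the part the existing annotations do not cover.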

@umlumpa umlumpa added the kind/bug Categorizes issue or PR as related to a bug. label Sep 21, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Sep 21, 2024
@chengjoey
Contributor

If a liveness probe is configured for the backend service, then when the backend returns 500 the pod will become unhealthy and traffic will not be forwarded to it. Does this solve your problem?
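
As a rough illustration of the probe-based approach suggested here, the sketch below models how a pod could flip between ready and not ready from consecutive probe results. It is a simplified Python model, not Kubernetes code: the class `ReadinessTracker` is hypothetical, and the parameters only mirror the `failureThreshold` and `successThreshold` probe fields. Note that in Kubernetes it is a failing readiness probe that removes a pod from the Service endpoints, while a failing liveness probe restarts the container.

```python
class ReadinessTracker:
    """Hypothetical model of probe thresholds deciding pod readiness."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 1):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.ready = True      # assumption: the pod starts out ready
        self._failures = 0     # consecutive failed probes
        self._successes = 0    # consecutive successful probes

    def observe(self, probe_status: int) -> bool:
        """Record one HTTP probe result and return the current readiness."""
        # HTTP probes treat any status from 200 up to, but not including, 400 as success.
        if 200 <= probe_status < 400:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.success_threshold:
                self.ready = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.failure_threshold:
                self.ready = False
        return self.ready
```

With these values, three consecutive 500 responses from the probe endpoint mark the pod not ready, and a single 200 marks it ready again. This only helps if the probe endpoint fails whenever the real request path does.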

@longwuyuan
Contributor

/remove-kind bug

@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. and removed kind/bug Categorizes issue or PR as related to a bug. labels Sep 25, 2024
@longwuyuan
Contributor

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Sep 25, 2024

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will get to your issue as soon as possible. If you have any questions or want to request prioritization, please reach out in #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Oct 25, 2024

4 participants