Description
What happened:
submariner-lighthouse-coredns has the ready plugin enabled. We added a readiness probe to the submariner-lighthouse-coredns deployment that targets the endpoint exposed by the ready plugin, so that submariner-lighthouse-coredns rollouts do not cause a loss of DNS availability while pods are recycled.
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /ready
    port: 8181
    scheme: HTTP
  initialDelaySeconds: 60
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
However, submariner-lighthouse-coredns pods are marked ready before they have processed all the EndpointSlices in the cluster.
As a result, a submariner-lighthouse-coredns pod returns NXDOMAIN if it has not yet processed the EndpointSlice that contains the IPs needed to serve the DNS request.
What you expected to happen:
Ideally, a submariner-lighthouse-coredns pod should become ready only once it can serve all DNS records in the clusterset.
The CoreDNS docs say that a plugin needs to implement the ready.Readiness interface to bubble up its health signal, but I don't see that being done in the lighthouse code. I assume this is why the pod is marked ready prematurely?
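For illustration, here is a minimal sketch of what reporting readiness could look like. It is not taken from the lighthouse code: the Readiness interface is declared locally to mirror the single-method interface described in the CoreDNS ready plugin docs, and SliceTracker and MarkSynced are hypothetical names standing in for whatever component knows that the initial EndpointSlice sync has finished.

```go
package main

import "sync/atomic"

// Readiness is declared locally so the sketch is self-contained. It mirrors
// the interface described in the CoreDNS ready plugin docs: a plugin reports
// true once it is able to serve queries.
type Readiness interface {
	Ready() bool
}

// SliceTracker is a hypothetical stand-in for the component that processes
// EndpointSlices. It starts out not ready.
type SliceTracker struct {
	synced atomic.Bool
}

// MarkSynced is a hypothetical hook, assumed to be called once the initial
// set of EndpointSlices has been processed.
func (t *SliceTracker) MarkSynced() {
	t.synced.Store(true)
}

// Ready reports true only after MarkSynced has been called, so the /ready
// endpoint would keep failing until the initial sync is done.
func (t *SliceTracker) Ready() bool {
	return t.synced.Load()
}

// Compile-time check that SliceTracker satisfies Readiness.
var _ Readiness = (*SliceTracker)(nil)
```

With something along these lines, the readinessProbe above would keep the pod out of the Service endpoints until the tracker reports true. How lighthouse would actually detect that the sync is complete is left open here.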
How to reproduce it (as minimally and precisely as possible):
# Scale submariner-operator down to zero so it does not revert the changes
kubectl scale deployment submariner-operator --replicas 0
# Edit the submariner-lighthouse-coredns deployment and add a readinessProbe against the /ready endpoint on port 8181
kubectl edit deployment submariner-lighthouse-coredns
# Restart the deployment to simulate a new rollout
kubectl rollout restart deployment submariner-lighthouse-coredns
# Observe that the new pods become ready immediately, before the submariner-lighthouse-coredns pod has processed all EndpointSlices
Anything else we need to know?:
Environment:
Submariner version 0.17.3