Resource requests should be set for ephemeral storage #95

lindhe · 2023-12-05T08:52:10Z

At least the web pod and the worker beat pod use emptyDir volumes, which consume ephemeral storage on the node that the pod is scheduled on. Since we have not specified any ephemeral-storage requests or limits for the containers, we risk that the pod gets evicted, crashes, or exhausts the storage on the node.

Currently, my pods get evicted and I get a warning when the pod gets scheduled on a node with too little ephemeral storage available:

$ kubectl get events --field-selector involvedObject.name=worker-beat-7898d974fc-sb9xz
LAST SEEN   TYPE      REASON                   OBJECT                             MESSAGE
46m         Warning   FailedScheduling         pod/worker-beat-7898d974fc-sb9xz   0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 No preemption victims found for incoming pod..
46m         Warning   FailedScheduling         pod/worker-beat-7898d974fc-sb9xz   0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 No preemption victims found for incoming pod..
45m         Normal    Scheduled                pod/worker-beat-7898d974fc-sb9xz   Successfully assigned invenio-dev/worker-beat-7898d974fc-sb9xz to kth-prod-1-worker-7a7516d2-v8vbc
45m         Normal    SuccessfulAttachVolume   pod/worker-beat-7898d974fc-sb9xz   AttachVolume.Attach succeeded for volume "pvc-801c874c-37a9-4520-a0e8-c59606c9d09a"
45m         Normal    Pulling                  pod/worker-beat-7898d974fc-sb9xz   Pulling image "ghcr.io/inveniosoftware/demo-inveniordm/demo-inveniordm@sha256:2193abc2caec9bc599061d6a5874fd2d7d201f55d1673a545af0a0406690e8a4"
44m         Warning   Evicted                  pod/worker-beat-7898d974fc-sb9xz   The node was low on resource: ephemeral-storage. Threshold quantity: 994154920, available: 759960Ki.
44m         Normal    Pulled                   pod/worker-beat-7898d974fc-sb9xz   Successfully pulled image "ghcr.io/inveniosoftware/demo-inveniordm/demo-inveniordm@sha256:2193abc2caec9bc599061d6a5874fd2d7d201f55d1673a545af0a0406690e8a4" in 1m2.20910036s (1m2.209116986s including waiting)
44m         Normal    Created                  pod/worker-beat-7898d974fc-sb9xz   Created container worker-beat
44m         Normal    Started                  pod/worker-beat-7898d974fc-sb9xz   Started container worker-beat
44m         Normal    Killing                  pod/worker-beat-7898d974fc-sb9xz   Stopping container worker-beat
44m         Warning   ExceededGracePeriod      pod/worker-beat-7898d974fc-sb9xz   Container runtime did not kill the pod within specified grace period.

I suggest we add resource limits and requests for ephemeral-storage on all containers that use emptyDir. I can whip up a PR for it, but I need your help to identify a reasonable size to set as request and limit.
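
As a rough sketch of what such a change could look like in a pod template: the container name is taken from the events above, while the volume name, mount path, and all sizes are hypothetical placeholders, not recommendations. The right values still need to be determined.

```yaml
# Sketch only: fragment of a pod spec, not the chart's actual template.
# All sizes below are placeholder assumptions.
containers:
  - name: worker-beat
    resources:
      requests:
        ephemeral-storage: "1Gi"   # placeholder: what the scheduler reserves on the node
      limits:
        ephemeral-storage: "2Gi"   # placeholder: exceeding this gets the pod evicted
    volumeMounts:
      - name: scratch              # hypothetical volume name
        mountPath: /tmp/scratch    # hypothetical mount path
volumes:
  - name: scratch
    emptyDir:
      sizeLimit: "1Gi"             # placeholder: optional cap on the emptyDir itself
```

With a request set, the scheduler only places the pod on a node that has that much allocatable ephemeral storage, which should avoid the eviction shown above. The limit bounds how much a single pod can consume before the kubelet evicts it.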

Instances of `emptyDir` in deployments

lindhe · 2023-12-05T12:16:28Z

Here's another example of what it can look like when pods are evicted because they use more resources than are available:

J4bbi mentioned this issue Aug 21, 2024

Fix/documentation CottageLabs/helm-invenio#1

Draft
