Rook (ceph) fails to start correctly after upgrading to runc 1.2.0 #4483
Comments
Could you provide the error logs produced by runc?
@ErickStaal yes, please, at least the output of kubectl logs and kubectl describe for the failed pods. It would be great if you could also find a simple repro, ideally one that doesn't need a Ceph cluster.
We are seeing a potentially related issue: the AWS CSI driver (our setup is close to the official daemonset definition) fails under cri-o 1.29.10 and 1.30.7 with runc 1.2.0 on Ubuntu noble. I verified that runc 1.1.14 works. The issue boils down to
I assume the issue is related, as rook/ceph does some device setup early on and crashes if that fails. (That's off the top of my head; it's been 3 years since I used rook.) What would be a good way to debug this further? What is a good way to determine the cause of the EPERM?
You can use retsnoop to figure out exactly where the |
Description
Rook (Ceph) fails to start correctly after upgrading to runc v1.2.0. Rolling back to runc v1.1.15 fixes all errors.
Steps to reproduce the issue
rook-ceph rook-ceph-mds-k8sfs-a-65588bd59d-d9ccf 1/2 CrashLoopBackOff 215 (53s ago) 19h
rook-ceph rook-ceph-mds-k8sfs-b-686bdc8d8d-kk498 1/2 CrashLoopBackOff 67 (50s ago) 5h56m
rook-ceph rook-ceph-mgr-b-58f9d6576b-4df8v 2/3 CrashLoopBackOff 333 (51s ago) 19h
I checked the output of kubectl describe nodes. There was no memory or storage pressure on the nodes.
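To gather the `kubectl logs` and `kubectl describe` output requested in the comments, the crash-looping pods can be picked out of a listing like the one above. The following is a minimal sketch, not part of the original report: it assumes the listing uses the `kubectl get pods -A` column order (NAMESPACE, NAME, READY, STATUS, ...), and the helper names `crashlooping_pods` and `debug_commands` are hypothetical. It only builds the command strings and does not run them.

```python
import shlex

# Sample listing copied from this report (assumed columns:
# NAMESPACE NAME READY STATUS RESTARTS AGE).
SAMPLE = """\
rook-ceph rook-ceph-mds-k8sfs-a-65588bd59d-d9ccf 1/2 CrashLoopBackOff 215 (53s ago) 19h
rook-ceph rook-ceph-mds-k8sfs-b-686bdc8d8d-kk498 1/2 CrashLoopBackOff 67 (50s ago) 5h56m
rook-ceph rook-ceph-mgr-b-58f9d6576b-4df8v 2/3 CrashLoopBackOff 333 (51s ago) 19h
"""


def crashlooping_pods(listing):
    """Return (namespace, pod) pairs whose STATUS column is CrashLoopBackOff."""
    pods = []
    for line in listing.splitlines():
        fields = line.split()
        # STATUS is the fourth whitespace-separated column in this layout.
        if len(fields) >= 4 and fields[3] == "CrashLoopBackOff":
            pods.append((fields[0], fields[1]))
    return pods


def debug_commands(namespace, pod):
    """Build the describe/logs commands for one pod (strings only, not executed)."""
    ns, name = shlex.quote(namespace), shlex.quote(pod)
    return [
        f"kubectl -n {ns} describe pod {name}",
        # --previous shows the log of the last terminated container instance.
        f"kubectl -n {ns} logs {name} --all-containers --previous",
    ]


for namespace, pod in crashlooping_pods(SAMPLE):
    for command in debug_commands(namespace, pod):
        print(command)
```

The printed commands can be pasted into a shell on a machine with cluster access. Using `--previous` is a choice made here because a crash-looping container's current instance has often produced little or no log output yet.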
Describe the results you received and expected
I expected Rook to start just as it does under runc v1.1.15.
What version of runc are you using?
v1.1.15 (I rolled back from v1.2.0 and everything works again).
Host OS information
PRETTY_NAME="Ubuntu 24.04.1 LTS"
Host kernel information
Linux 6.8.0-47-generic #47-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 21:40:26 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
(on all Kubernetes nodes).