Missing kubernetes logs makes diagnostic hard #73

Open
MonsieurNicolas opened this issue Oct 12, 2022 · 0 comments
Assignees
Labels
bug Something isn't working core-team Issue can be worked on by the core team

Comments

@MonsieurNicolas
Contributor

Kubernetes logs are not propagated in real time, which makes it difficult to tell when and why things go wrong.

For example, we recently saw this in the supercluster logs (it stays stuck like this until it hits the 8-hour timeout):

[11:52:00 INF] Waiting for replicas on stellar-supercluster/ssc-0416z-9db5e7-sts-bd
[11:52:00 INF] Saw event for statefulset ssc-0416z-9db5e7-sts-bd: Added
[11:52:00 INF] StatefulSet stellar-supercluster/ssc-0416z-9db5e7-sts-bd: 0/3 replicas ready
[11:52:59 INF] Waiting for replicas on stellar-supercluster/ssc-0416z-9db5e7-sts-bd
[11:52:59 INF] Saw event for statefulset ssc-0416z-9db5e7-sts-bd: Added
[11:52:59 INF] StatefulSet stellar-supercluster/ssc-0416z-9db5e7-sts-bd: 0/3 replicas ready
[11:53:58 INF] Waiting for replicas on stellar-supercluster/ssc-0416z-9db5e7-sts-bd
[11:53:58 INF] Saw event for statefulset ssc-0416z-9db5e7-sts-bd: Added
[11:53:58 INF] StatefulSet stellar-supercluster/ssc-0416z-9db5e7-sts-bd: 0/3 replicas ready
[11:54:58 INF] Waiting for replicas on stellar-supercluster/ssc-0416z-9db5e7-sts-bd
[11:54:58 INF] Saw event for statefulset ssc-0416z-9db5e7-sts-bd: Added
[11:54:58 INF] StatefulSet stellar-supercluster/ssc-0416z-9db5e7-sts-bd: 0/3 replicas ready
[11:55:57 INF] Waiting for replicas on stellar-supercluster/ssc-0416z-9db5e7-sts-bd
[11:55:57 INF] Saw event for statefulset ssc-0416z-9db5e7-sts-bd: Added
[11:55:57 INF] StatefulSet stellar-supercluster/ssc-0416z-9db5e7-sts-bd: 0/3 replicas ready
[11:56:15 INF] There are 7 pods in total

In this particular case, a bad image name was passed to Kubernetes, and running a couple of diagnostic commands quickly reveals the problem:

+ kubectl get pods -n stellar-supercluster
NAME                         READY   STATUS             RESTARTS   AGE
ssc-1451z-331471-sts-bd-0    3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-cq-0    3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-kb-0    3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-lo-0    3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-sdf-0   3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-sp-0    3/4     InvalidImageName   0          24m
ssc-1451z-331471-sts-wx-0    3/4     InvalidImageName   0          24m
 Warning  InspectFailed     25m (x4 over 25m)    kubelet            Failed to apply default image tag "docker-registry.services.stellar-ops.com/dev/stellar-core:19.4.1-1101.da3754bb5.focal~perftests": couldn't parse image reference "docker-registry.services.stellar-ops.com/dev/stellar-core:19.4.1-1101.da3754bb5.focal~perftests": invalid reference format
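
As a rough illustration of the kind of check that could surface this earlier, here is a minimal Python sketch that scans the parsed output of `kubectl get pods -n stellar-supercluster -o json` for containers stuck in a waiting state. It is not how supercluster works today. The function name, the list of "fatal" reasons, and the sample data (including the container name `stellar-core` and the shortened message) are assumptions made for the example. Only the `status.containerStatuses[].state.waiting` layout comes from the Kubernetes pod status schema.

```python
# Waiting reasons that usually do not resolve without intervention.
# This set is an illustrative assumption, not an exhaustive list.
FATAL_WAITING_REASONS = {
    "InvalidImageName",
    "ErrImagePull",
    "ImagePullBackOff",
    "CrashLoopBackOff",
}


def find_stuck_containers(pod_list):
    """Return (pod, container, reason, message) tuples for containers
    that are waiting on one of FATAL_WAITING_REASONS.

    pod_list is the dict obtained by parsing the JSON printed by
    `kubectl get pods -n <namespace> -o json`.
    """
    problems = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        status = pod.get("status", {})
        # Either list may be absent while the pod is still being set up.
        container_statuses = (status.get("initContainerStatuses") or []) + (
            status.get("containerStatuses") or []
        )
        for cs in container_statuses:
            waiting = cs.get("state", {}).get("waiting")
            if waiting and waiting.get("reason") in FATAL_WAITING_REASONS:
                problems.append(
                    (
                        pod_name,
                        cs.get("name"),
                        waiting["reason"],
                        waiting.get("message", ""),
                    )
                )
    return problems


# Hand-written sample input shaped like the kubectl JSON output.
# The container names and the message text are hypothetical.
SAMPLE_POD_LIST = {
    "items": [
        {
            "metadata": {"name": "ssc-1451z-331471-sts-bd-0"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "stellar-core",
                        "state": {
                            "waiting": {
                                "reason": "InvalidImageName",
                                "message": "invalid reference format",
                            }
                        },
                    },
                    {"name": "sidecar", "state": {"running": {}}},
                ]
            },
        }
    ]
}

if __name__ == "__main__":
    for pod, container, reason, message in find_stuck_containers(SAMPLE_POD_LIST):
        print(f"{pod}/{container}: {reason}: {message}")
```

Logging lines like these next to the existing "0/3 replicas ready" messages would point at the bad image name within the first polling interval instead of after a manual `kubectl` session.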
@MonsieurNicolas MonsieurNicolas added the bug Something isn't working label Oct 12, 2022
@mbsdf mbsdf self-assigned this Sep 13, 2023
@sisuresh sisuresh added the core-team Issue can be worked on by the core team label Oct 7, 2024