Cluster status error since 0.33.9 with eks cluster #6391

Open
brian-bk opened this issue Jun 6, 2024 · 3 comments
Labels
bug Something isn't working needs repro case

Comments

@brian-bk

brian-bk commented Jun 6, 2024

Expected Behavior

Tilt should be able to connect to the cluster on tilt up etc.

Current Behavior

Tilt is unable to connect to the cluster directly. Tilt still manages our local_resources, and the kubectl commands our Tiltfile runs manually via local or local_resource still work, but the managed k8s resources behind a helm_resource do not. In addition, after Tiltfile processing finishes, a failure is reported on the (Tiltfile) resource:

Successfully loaded Tiltfile (1m14.658932792s)
Cluster status error: Tilt encountered an error connecting to your Kubernetes cluster:
	Get "https://<redacted>.gr7.us-east-1.eks.amazonaws.com/version?timeout=32s": context deadline exceeded
You will need to restart Tilt after resolving the issue.

We tested 0.33.8 and it works without this issue. I also tested 0.33.15, and the issue, which first appeared in 0.33.9, still persists.

Steps to Reproduce

  1. Configure an eks cluster and authenticate against it
  2. Run tilt up
  3. Wait for resources to load; Tilt then cannot connect to the cluster, even though kubectl commands run from a local or local_resource still work

Context

tilt doctor Output

$ tilt doctor
Tilt: v0.33.15, built 2024-05-31
System: darwin-arm64
---
Docker
- Host: unix:///Users/<me>/.docker/run/docker.sock
- Server Version: 26.1.1
- API Version: 1.45
- Builder: 2
- Compose Version: v2.27.0-desktop.2
---
Kubernetes
- Env: eks
- Context: kubernetes-eks-dev
- Cluster Name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
- Namespace: default
- Container Runtime: containerd
- Version: v1.27.13-eks-3af4770
- Cluster Local Registry: none
---
Thanks for seeing the Tilt Doctor!
Please send the info above when filing bug reports. 💗

The info below helps us understand how you're using Tilt so we can improve,
but is not required to ask for help.
---
Analytics Settings
--> (These results reflect your personal opt in/out status and may be overridden by an `analytics_settings` call in your Tiltfile)
- User Mode: opt-in
- Machine: b8542883618c2effbdb7c7ceed78623b
- Repo: dqZ55OF3HaxcqT2x/Y9LwQ==

# relevant .kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <redacted>
    server: https://<redacted>.gr7.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
contexts:
- context:
    cluster: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
    user: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
  name: kubernetes-eks-dev
current-context: kubernetes-eks-dev
kind: Config
preferences: {}
users:
- name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - kubernetes-eks-dev
      - --output
      - json
      command: aws
      env:
      - name: AWS_PROFILE
        value: <my-profile-name>
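
The `exec` stanza above means every client has to run `aws eks get-token` before it can authenticate. The thread does not establish that this plugin is involved in the failure, but timing it in isolation is one way to rule it in or out. The following is a minimal Python sketch, not anything Tilt or kubectl ships: the helper names `exec_argv`, `exec_env`, and `time_exec` are hypothetical, `EXAMPLE` is the stanza above copied into a dict by hand, and the 32 second default only mirrors the `timeout=32s` in the error message.

```python
import os
import subprocess
import time


def exec_argv(exec_cfg):
    """Build the argv list for a kubeconfig `exec` credential stanza."""
    return [exec_cfg["command"]] + list(exec_cfg.get("args") or [])


def exec_env(exec_cfg, base_env=None):
    """Merge the stanza's name/value env entries over a base environment."""
    env = dict(os.environ if base_env is None else base_env)
    for item in exec_cfg.get("env") or []:
        env[item["name"]] = item["value"]
    return env


def time_exec(exec_cfg, timeout_s=32.0):
    """Run the plugin once and return (exit_code, elapsed_seconds).

    Output is discarded so the token is never printed. Raises
    subprocess.TimeoutExpired if the plugin runs longer than timeout_s.
    """
    start = time.monotonic()
    proc = subprocess.run(
        exec_argv(exec_cfg),
        env=exec_env(exec_cfg),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout_s,
    )
    return proc.returncode, time.monotonic() - start


# The exec stanza from the kubeconfig above, transcribed by hand.
EXAMPLE = {
    "command": "aws",
    "args": ["--region", "us-east-1", "eks", "get-token",
             "--cluster-name", "kubernetes-eks-dev", "--output", "json"],
    "env": [{"name": "AWS_PROFILE", "value": "<my-profile-name>"}],
}
```

Calling `time_exec(EXAMPLE)` from the same shell that launches `tilt up` would show whether token generation alone is slow or failing in that environment.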

About Your Use Case

This has been happening since 0.33.9, but I forgot to report it right away, and it still happens on 0.33.15. For now we've added a check in our Tiltfile to keep people on <=0.33.8 until this is resolved. It may be specific to Amazon EKS's authentication, but I'm not sure.
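
The Tiltfile check itself is not shown in this report. As a rough illustration of the comparison such a guard needs, here is a plain Python sketch; the names `FIRST_BAD`, `parse_version`, and `is_allowed` are hypothetical, and how the running Tilt version is obtained inside a Tiltfile is left out.

```python
import re

# First release in which the failure was observed, per this report.
FIRST_BAD = (0, 33, 9)


def parse_version(text):
    """Turn 'v0.33.15' or '0.33.15' into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_allowed(text):
    """True when the version is older than the first affected release."""
    return parse_version(text) < FIRST_BAD
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.33.15" sorts before "0.33.9", which would wrongly let 0.33.15 through.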

@brian-bk brian-bk added the bug Something isn't working label Jun 6, 2024
@nicks
Member

nicks commented Jun 7, 2024

Hmmm...I tried this with my own EKS cluster, and was not able to repro.

I went through all the changes between 0.33.8 and 0.33.9 and didn't see any changes that would affect how tilt computes cluster status.

@nicks
Member

nicks commented Jun 7, 2024

Can you post the output of the following?

kubectl get -v=6 --raw /version

@brian-bk
Author

brian-bk commented Jun 7, 2024

Sure thing

$ kubectl get -v=6 --raw /version
I0607 11:02:35.245274   24199 loader.go:374] Config loaded from file:  /Users/briankleszyk/.kube/config
I0607 11:02:35.953061   24199 round_trippers.go:553] GET https://<redacted>.gr7.us-east-1.eks.amazonaws.com/version 200 OK in 706 milliseconds
{
  "major": "1",
  "minor": "27+",
  "gitVersion": "v1.27.13-eks-3af4770",
  "gitCommit": "4873544ec1ec7d3713084677caa6cf51f3b1ca6f",
  "gitTreeState": "clean",
  "buildDate": "2024-04-30T03:31:44Z",
  "goVersion": "go1.21.9",
  "compiler": "gc",
  "platform": "linux/amd64"
}

🤷 I don't know if it's relevant, but I (and most of our engineers) are on arm64 machines, with an amd64 cluster.
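
For comparison with the Tilt error, the `round_trippers.go` line in the kubectl output above reports the same `/version` request finishing in 706 milliseconds, while the Tilt request carries `timeout=32s`. A small Python sketch for pulling those numbers out of `-v=6` output follows; the regular expression is written against the single log line in this thread, and `parse_request_line` and `TILT_TIMEOUT_MS` are hypothetical names.

```python
import re

# Matches a verbose request line such as:
# "... GET https://host/version 200 OK in 706 milliseconds"
REQUEST_LINE = re.compile(
    r"\b(?P<method>[A-Z]+) (?P<url>\S+) (?P<status>\d{3}) .*? "
    r"in (?P<ms>\d+) milliseconds"
)

# Taken from "timeout=32s" in the Tilt error message.
TILT_TIMEOUT_MS = 32_000


def parse_request_line(line):
    """Return (method, status_code, elapsed_ms), or None if no match."""
    match = REQUEST_LINE.search(line)
    if match is None:
        return None
    return match.group("method"), int(match.group("status")), int(match.group("ms"))


# The request line from the kubectl output above.
LINE = (
    "I0607 11:02:35.953061   24199 round_trippers.go:553] "
    "GET https://<redacted>.gr7.us-east-1.eks.amazonaws.com/version "
    "200 OK in 706 milliseconds"
)
```

Here `parse_request_line(LINE)` gives `("GET", 200, 706)`, well under the 32 second budget, which suggests the endpoint is reachable from this machine and the difference lies in how the two clients make the request.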
