kubeadm re-reads kubeconfig in init phase bootstrap-token #3129

Closed
kokes opened this issue Nov 27, 2024 · 11 comments · Fixed by kubernetes/kubernetes#129006
Labels
area/UX kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence.
Milestone
v1.33

Comments

@kokes

kokes commented Nov 27, 2024

What happened?

I ran kubeadm with an in-memory kubeconfig literal (reproduced here with a file on disk piped in; the principle is the same), and while many commands worked just fine, this init phase ended in a panic.

$ cat admin.conf | ./kubeadm --kubeconfig /dev/stdin init phase bootstrap-token
W1127 10:36:04.466512   80447 version.go:121] could not obtain client version; using remote version: v1.31.3
[bootstrap-token] Using token: vbzyo5.w4wxyalp15a9m0xc
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x10 pc=0x102534c08]

goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo.CreateBootstrapConfigMapIfNotExists({0x103107478, 0x14000439a40}, {0x16f8ead0f, 0xa})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo/clusterinfo.go:56 +0x138
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runBootstrapToken({0x102f67ac0?, 0x1400013d220})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/init/bootstraptoken.go:109 +0x1f4
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x1400022a000)
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:261 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(...)
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:450
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x1400049ed80, {0x1400066a200, 0x0, 0x2})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234 +0x2ac
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).BindToCommand.func1.1(0x140006ad000?, {0x1400066a200, 0x0, 0x2})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:376 +0xe0
github.com/spf13/cobra.(*Command).execute(0x140006b0008, {0x1400066a1e0, 0x2, 0x2})
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:985 +0x834
github.com/spf13/cobra.(*Command).ExecuteC(0x1400043c608)
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:1117 +0x344
github.com/spf13/cobra.(*Command).Execute(0x10309f078?)
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:1041 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run()
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/kubeadm.go:47 +0xac
main.main()
	/Users/okokes/git/kubernetes/cmd/kubeadm/kubeadm.go:25 +0x1c

What did you expect to happen?

I expected the command to run just fine.

How can we reproduce it (as minimally and precisely as possible)?

With a kubeconfig in admin.conf, run this at commit 9d62330bfa31a4fce28093d052f65ff0e88ac3a0:

go build -o kubeadm ./cmd/kubeadm
cat admin.conf | ./kubeadm --kubeconfig /dev/stdin init phase bootstrap-token

Anything else we need to know?

I added a debug line to LoadFromFile and indeed it gets called twice, so if the source cannot be read repeatedly (e.g. a pipe), the second read will fail because the resulting kubeconfig will be empty.
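Something along these lines (a sketch; the exact instrumentation may differ):

	// temporary, local-only change in client-go's clientcmd loader
	func LoadFromFile(filename string) (*clientcmdapi.Config, error) {
		log.Printf("DBG: loading from file: %v", filename) // debug line
		// ... original body unchanged ...
	}

With that in place, the second load shows up right before the panic: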

$ cat admin.conf | ./kubeadm --kubeconfig /dev/stdin init phase bootstrap-token
W1127 10:12:46.763090   75623 version.go:121] could not obtain client version; using remote version: v1.31.3
2024/11/27 10:12:46 DBG: loading from file: /dev/stdin
[bootstrap-token] Using token: (...)
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
2024/11/27 10:12:50 DBG: loading from file: /dev/stdin
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x10 pc=0x102bbcb88]

goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo.CreateBootstrapConfigMapIfNotExists({0x10378f478, 0x1400062e8c0}, {0x16f266d0f, 0xa})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo/clusterinfo.go:56 +0x138
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runBootstrapToken({0x1035efac0?, 0x14000431c20})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/init/bootstraptoken.go:109 +0x1f4
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x1400016efc0)
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:261 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(...)
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:450
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x140003a4d80, {0x140004ab860, 0x0, 0x2})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234 +0x2ac
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).BindToCommand.func1.1(0x140006cb000?, {0x140004ab860, 0x0, 0x2})
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:376 +0xe0
github.com/spf13/cobra.(*Command).execute(0x140006da308, {0x140004ab840, 0x2, 0x2})
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:985 +0x834
github.com/spf13/cobra.(*Command).ExecuteC(0x14000404908)
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:1117 +0x344
github.com/spf13/cobra.(*Command).Execute(0x103727078?)
	/Users/okokes/git/kubernetes/vendor/github.com/spf13/cobra/command.go:1041 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run()
	/Users/okokes/git/kubernetes/cmd/kubeadm/app/kubeadm.go:47 +0xac
main.main()
	/Users/okokes/git/kubernetes/cmd/kubeadm/kubeadm.go:25 +0x1c

In the case of this kubeadm phase, it gets read first in client, err := data.Client(), but then the absolute path is retrieved and passed to clusterinfophase.CreateBootstrapConfigMapIfNotExists(client, data.KubeConfigPath()), where it gets read again.
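Condensed from cmd/kubeadm/app/cmd/phases/init/bootstraptoken.go, the flow looks roughly like this (a sketch, not the verbatim source):

	func runBootstrapToken(c workflow.RunData) error {
		data, ok := c.(InitData)
		if !ok {
			return errors.New("bootstrap-token phase invoked with an invalid data struct")
		}

		// first read: data.Client() builds a clientset from the kubeconfig
		client, err := data.Client()
		if err != nil {
			return err
		}

		// ... token creation and RBAC setup ...

		// second read: the path is loaded from disk again; for a pipe such as
		// /dev/stdin the stream is already drained, so this load comes back empty
		return clusterinfophase.CreateBootstrapConfigMapIfNotExists(client, data.KubeConfigPath())
	}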

Kubernetes version

9d62330bfa31a4fce28093d052f65ff0e88ac3a0

Cloud provider

OS version

Darwin foo-bar 23.6.0 Darwin Kernel Version 23.6.0: Wed Jul 31 20:53:05 PDT 2024; root:xnu-10063.141.1.700.5~1/RELEASE_ARM64_T8112 arm64

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@kokes kokes added the kind/bug Categorizes issue or PR as related to a bug. label Nov 27, 2024
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 27, 2024
@k8s-ci-robot
Contributor

There are no sig labels on this issue. Please add an appropriate label by using one of the following commands:

  • /sig <group-name>
  • /wg <group-name>
  • /committee <group-name>

Please see the group list for a listing of the SIGs, working groups, and committees available.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 27, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kokes
Author

kokes commented Nov 27, 2024

Not sure if this is SIG CLI or cluster-lifecycle; both seem suitable. I'd probably go with the former, since this is mostly about passing data between subcommands, not about the lifecycle itself.

@neolit123
Member

/transfer kubeadm

@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/kubernetes Nov 27, 2024
@neolit123 neolit123 added area/UX and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 27, 2024
@neolit123
Member

neolit123 commented Nov 27, 2024

Not sure if this is SIG CLI or cluster-lifecycle; both seem suitable. I'd probably go with the former, since this is mostly about passing data between subcommands, not about the lifecycle itself.

different components are owned by different SIGs; in this case it's SIG Cluster Lifecycle for kubeadm

@neolit123
Member

neolit123 commented Nov 27, 2024

cat admin.conf | ./kubeadm --kubeconfig /dev/stdin init phase bootstrap-token

that is a peculiar use case and i don't think i have seen it before.
could you elaborate on why you need to pass it like that?

In the case of this kubeadm phase, it gets read first in client, err := data.Client(), but then the absolute path is retrieved and passed to clusterinfophase.CreateBootstrapConfigMapIfNotExists(client, data.KubeConfigPath()), where it gets read again.

one problem is that data.Client() might not be the same as a client constructed from data.KubeConfigPath(). in dry-run a fake client is used at master.
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/cmd/init.go#L518-L545

i agree that it seems awkward to construct a new client later from the KubeConfigPath(). it seems that this should be done only once during the init "data" initialization, and a clientcmdapi.Config pointer should be passed around.

potential fix:
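a minimal sketch of the idea (hypothetical field and accessor names, not necessarily the fix that was merged): parse the kubeconfig once when the init data is constructed, and let phases consume the parsed object instead of re-reading the path.

	type initData struct {
		// ... existing fields ...
		adminKubeConfig *clientcmdapi.Config // parsed once during initialization
	}

	// KubeConfig is a hypothetical accessor alongside KubeConfigPath().
	func (d *initData) KubeConfig() *clientcmdapi.Config {
		return d.adminKubeConfig
	}

	// CreateBootstrapConfigMapIfNotExists would then take the parsed config
	// instead of a file path, so nothing re-reads /dev/stdin.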

would you like to try sending a PR?

@neolit123 neolit123 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Nov 27, 2024
@neolit123
Member

that is a peculiar use case and i don't think i have seen it before.

given the user can just write the file on disk in a safe location and invoke kubeadm with a super user, i don't think we should backport a fix to older releases.

@neolit123
Member

k8s.io/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo.CreateBootstrapConfigMapIfNotExists({0x103107478, 0x14000439a40}, {0x16f8ead0f, 0xa})
/Users/okokes/git/kubernetes/cmd/kubeadm/app/phases/bootstraptoken/clusterinfo/clusterinfo.go:56 +0x138

one surprise here is that it reaches line 56 and panics while

	klog.V(1).Infoln("[bootstrap-token] loading admin kubeconfig")
	adminConfig, err := clientcmd.LoadFromFile(file)
	if err != nil {
		return errors.Wrap(err, "failed to load admin kubeconfig")
	}
	if err = clientcmdapi.FlattenConfig(adminConfig); err != nil {
		return err
	}

doesn't return any errors.
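the likely mechanism: clientcmd treats zero-length input as a valid, empty config rather than an error, so the failure only surfaces later as a nil pointer. a standalone sketch of that behavior (assuming client-go's clientcmd package):

	package main

	import (
		"fmt"

		"k8s.io/client-go/tools/clientcmd"
	)

	func main() {
		// mimics the second read of /dev/stdin: the pipe is drained, so
		// LoadFromFile hands zero bytes to Load, which returns an empty
		// config and a nil error instead of failing.
		cfg, err := clientcmd.Load([]byte{})
		fmt.Printf("err=%v current-context=%q contexts=%d\n", err, cfg.CurrentContext, len(cfg.Contexts))
		// prints: err=<nil> current-context="" contexts=0

		// clusterinfo.go then looks up the current context; the map lookup
		// returns a nil *Context, and dereferencing it is what panics.
		fmt.Println(cfg.Contexts[cfg.CurrentContext] == nil) // true
	}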

@neolit123 neolit123 added this to the v1.33 milestone Nov 27, 2024
@kokes
Author

kokes commented Nov 27, 2024

that is a peculiar use case and i don't think i have seen it before.
could you elaborate why you need to pass it like that?

We retrieve these kubeconfigs from a remote store and use them within kubeadm; this was an effort to minimise saving sensitive data on disk (in case we failed to clean them up, and to avoid leaking them to other processes).

potential fix:

Thanks for the pointers. I was thinking along similar lines, but I thought that this interface assertion - https://github.com/kubernetes/kubernetes/blob/9d62330bfa31a4fce28093d052f65ff0e88ac3a0/cmd/kubeadm/app/cmd/phases/upgrade/apply/bootstraptoken.go#L48 - would require all implementations to implement this new kubeconfig method... but now I see that there's only one implementation of this interface, so we should be good.
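(A generic illustration of the concern, with hypothetical names: adding a method to an interface used in a runtime type assertion means every implementation must grow the new method, or the assertion starts failing.)

	// trimmed-down, hypothetical version of the interface and assertion
	type InitData interface {
		KubeConfigPath() string
		KubeConfig() *clientcmdapi.Config // new method every implementation must provide
	}

	func runPhase(c interface{}) error {
		data, ok := c.(InitData) // fails at runtime for types lacking KubeConfig()
		if !ok {
			return errors.New("phase invoked with an invalid data struct")
		}
		_ = data
		return nil
	}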

I'll also try to look up the first loading of said kubeconfig; I'm struggling a bit to navigate the Cobra abstractions.


I also don't know if there are other commands affected, but we can try this one first and then we'll see.

@kokes
Author

kokes commented Nov 27, 2024

OK, got it to work. I'll see if I can polish the outstanding issues there (the dry run, reuse of a method), add some docs, and I'll submit a PR at some point today or tomorrow.

The diff doesn't look super complex: https://gist.github.com/kokes/9f1601f04cfc8e4a2ee832513b4a3613

@neolit123
Member

i can comment more on the PR. thanks!
