Allow for a way to determine reliably if a resource is cluster scoped or namespaced #133

Open
reegnz opened this issue Oct 29, 2024 · 9 comments
Labels
community-feedback Asking the community what the best solution might be

Comments

@reegnz

reegnz commented Oct 29, 2024

What did you do?

I wanted to split cluster scoped and namespaced resources into separate folders, as described in https://github.com/patrickdappollonio/kubectl-slice/blob/main/docs/why.md

What did you expect to see?

.
├── cluster/
│   ├── aggregate-metacontroller-edit-clusterrole.yaml
│   ├── aggregate-metacontroller-view-clusterrole.yaml
│   ├── compositecontrollers.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── controllerrevisions.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── decoratorcontrollers.metacontroller.k8s.io-customresourcedefinition.yaml
│   ├── metacontroller-clusterrolebinding.yaml
│   ├── metacontroller-namespace.yaml
│   └── metacontroller-serviceaccount.yaml
└── namespaces/
    └── metacontroller/
        ├── metacontroller-serviceaccount.yaml
        └── metacontroller-statefulset.yaml

What did you see instead?

By default, kubectl-slice doesn't follow the format explained in the why documentation. Instead, we get a worse format:

.
└── metacontroller/
    ├── clusterrole-aggregate-metacontroller-edit.yaml
    ├── clusterrole-aggregate-metacontroller-view.yaml
    ├── clusterrole-metacontroller.yaml
    ├── clusterrolebinding-metacontroller.yaml
    ├── customresourcedefinition-compositecontrollers.metacontroller.k8s.io.yaml
    ├── customresourcedefinition-controllerrevisions.metacontroller.k8s.io.yaml
    ├── customresourcedefinition-decoratorcontrollers.metacontroller.k8s.io.yaml
    ├── namespace-metacontroller.yaml
    ├── serviceaccount-metacontroller.yaml
    └── statefulset-metacontroller.yaml

I can't seem to find a way to determine whether a manifest is namespaced or not. Since .metadata.namespace is not mandatory even on namespaced manifests, I can't simply rely on that field.
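
To illustrate the limitation, here is a minimal Python sketch. The function name `scope_from_metadata` is hypothetical and is not part of kubectl-slice. It shows that the manifest alone can at best suggest a resource is namespaced; a missing namespace tells us nothing:

```python
def scope_from_metadata(manifest: dict) -> str:
    """Guess a resource's scope using only the manifest itself."""
    namespace = (manifest.get("metadata") or {}).get("namespace")
    if namespace:
        # An explicit namespace suggests a namespaced resource.
        return "namespaced"
    # No namespace: the resource may be cluster scoped, or it may be
    # namespaced and rely on a namespace supplied at apply time.
    return "unknown"


service_account = {"kind": "ServiceAccount", "metadata": {"name": "metacontroller"}}
cluster_role = {"kind": "ClusterRole", "metadata": {"name": "metacontroller"}}

# Both manifests omit the namespace, so they are indistinguishable here.
print(scope_from_metadata(service_account))
print(scope_from_metadata(cluster_role))
```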

At my company I wrote a very similar tool in Python before discovering this one. Our internal tool accepts an override config that declares which resources are namespaced and which are cluster scoped, so the script can reliably file the resources into the right folder.

@reegnz
Author

reegnz commented Oct 29, 2024

My solution does it like this, which is closer to the why reasoning:

.
├── cluster
│   ├── ClusterRole.external-dns-external-dns.yaml
│   └── ClusterRoleBinding.external-dns-external-dns.yaml
└── namespaces
    └── external-dns
        ├── Deployment.external-dns.yaml
        ├── Service.external-dns.yaml
        └── ServiceAccount.external-dns.yaml

I think I could contribute a behaviour to the tool that is aware of k8s builtin resource scoping (namespaced vs cluster scoped), and that lets users configure kubectl-slice to reliably declare which APIs are namespace scoped and which are cluster scoped.

@patrickdappollonio
Owner

Hey @reegnz!

Detecting reliably if a resource is cluster-scoped or namespaced is actually quite, quite hard. Even Helm hasn't managed to get this right, but you're more than welcome to try!

Long story short: resources change depending on the cluster where they're installed, and there's no concept of exclusivity in the Kubernetes world (no hardcoded way to say that foo.internal resources MUST be namespaced, for example). A bunch of companies could each create a CRD called Ingress under the group corporate.internal, and in some of those companies the resource might be namespaced while in others it wouldn't be.
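
A small Python sketch of why the kind name alone cannot identify a resource. The helper `group_of` is a hypothetical name; the two manifests reuse the Ingress example from this comment. Note that even the (group, kind) pair only identifies the resource type: it still says nothing about the scope a given company chose for its CRD.

```python
def group_of(api_version: str) -> str:
    """Return the API group from an apiVersion such as 'apps/v1' or 'v1'."""
    group, sep, _version = api_version.rpartition("/")
    # Core resources use a bare version ("v1"), meaning the empty group.
    return group if sep else ""


builtin = {"apiVersion": "networking.k8s.io/v1", "kind": "Ingress"}
custom = {"apiVersion": "corporate.internal/v1", "kind": "Ingress"}

# Same kind, different groups: two distinct resource types.
keys = {(group_of(m["apiVersion"]), m["kind"]) for m in (builtin, custom)}
print(sorted(keys))
```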

And yes, you might be thinking that somewhat standard Kubernetes resources could be "whitelisted" in a way so the detection happens correctly for those, but even then, it's not a 100% guarantee because cluster operators can change how these resources behave in their own environments.

At the end of the day, the only somewhat accurate option is to use the Discovery API of Kubernetes to ask for that cluster's resources and use that information to drive the decision-making. And while that's perfectly possible, that's where two worlds kind of overlap and I'm not sure we can take that choice deliberately.
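
For context, discovery documents carry a `namespaced` boolean for every resource. The sketch below does not contact a cluster: it parses a trimmed, illustrative `APIResourceList` (the shape served under paths like `/apis/storage.k8s.io/v1`) that is assumed to have been saved beforehand. The function name `scopes_from_discovery` is hypothetical.

```python
import json

# Trimmed, illustrative APIResourceList; real documents carry more fields.
discovery_doc = json.loads("""
{
  "kind": "APIResourceList",
  "groupVersion": "storage.k8s.io/v1",
  "resources": [
    {"name": "storageclasses", "kind": "StorageClass", "namespaced": false},
    {"name": "csistoragecapacities", "kind": "CSIStorageCapacity", "namespaced": true},
    {"name": "volumeattachments/status", "kind": "VolumeAttachment", "namespaced": false}
  ]
}
""")


def scopes_from_discovery(doc: dict) -> dict:
    """Map (group, kind) to 'cluster' or 'namespace' for one group/version."""
    group, sep, _version = doc["groupVersion"].rpartition("/")
    group = group if sep else ""
    scopes = {}
    for resource in doc["resources"]:
        if "/" in resource["name"]:
            continue  # skip subresources such as volumeattachments/status
        scope = "namespace" if resource["namespaced"] else "cluster"
        scopes[(group, resource["kind"])] = scope
    return scopes


print(scopes_from_discovery(discovery_doc))
```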

Let me put it in an example: if you look at the current usage of kubectl-slice across the GitHub world, you'll notice the app is part of pipelines, and more often than not, CI/CD pipelines.

In those pipelines, "sorting manifests" comes way before we apply them. Even more so, some of those pipelines use a different tool to apply the manifests to the cluster (think GitOps, a la ArgoCD or Flux).

If we were to introduce a way to detect "namespace-scoped resources" we would need the kubeconfig of the target cluster whose data you want to extract (say, a dev environment). Now we're asking everyone who uses this in a pipeline to also do some lifecycle management: provide secrets, perhaps an RBAC service account or what-have-you in your cloud provider, and on clusters that are behind a private network, a way to access or query it... That is quite the ask!

Then think about the fact that you'd target one cluster. Most companies I've worked with, and the stuff I can see on public GitHub, often use multiple clusters that might or might not look the same. Perhaps they would differ slightly, as in the example I gave you before, where one cluster might have the resource Ingress.corporate.internal namespaced while the other cluster might have it cluster-wide, which leaves us back at square one!

The other option I've explored is providing kubectl-slice with a body of knowledge you organize: think of an additional configuration file that maps these things to namespaced or cluster-wide resources. But that seems like a lot of code organization and somewhat time-consuming. In the prior example, where the resource is namespaced in one cluster and cluster-wide in the other, you would just use a different body of knowledge depending on the target cluster...
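
As a rough sketch of what "a different body of knowledge depending on the target cluster" could look like, the Python below layers per-cluster overrides on top of a shared base table. Everything here is an assumption for illustration: the cluster names `dev` and `prod`, the table names, and the function `scopes_for` do not exist in kubectl-slice.

```python
# Shared entries that are assumed to hold for every cluster.
BASE_SCOPES = {
    ("", "Namespace"): "cluster",
    ("apps", "Deployment"): "namespace",
}

# Per-cluster differences, using the Ingress.corporate.internal example.
CLUSTER_OVERRIDES = {
    "dev": {("corporate.internal", "Ingress"): "namespace"},
    "prod": {("corporate.internal", "Ingress"): "cluster"},
}


def scopes_for(cluster: str) -> dict:
    """Return the base table with the given cluster's overrides applied."""
    return {**BASE_SCOPES, **CLUSTER_OVERRIDES.get(cluster, {})}


key = ("corporate.internal", "Ingress")
print(scopes_for("dev")[key])
print(scopes_for("prod")[key])
```

The cost described above is visible even in this toy version: someone has to maintain one table per cluster and keep it in step with what is actually installed.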

All in all, I feel like at that point the feature would become largely unused unfortunately.

The example I provided in the WHY.md file is just to explain the rationale behind this project. Several companies have used the current features to slice manifests in separate steps (say, exporting CRDs first, then exporting the rest of the resources) thanks to things like --include-kind or --include-name. For example, using your layout above, you could make two or more calls:

kubectl-slice --include-kind clusterrole,clusterrolebinding -o cluster/
kubectl-slice --exclude-kind clusterrole,clusterrolebinding -o resources/
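
If the list of cluster-scoped kinds grows, the include and exclude lists have to stay in sync. A minimal Python sketch that derives both command lines from a single list follows; it only builds the argument lists and does not run anything. The helper `slice_commands` is hypothetical, it uses only the flags shown in the two calls above, and how the input manifests are passed is left out, exactly as in those calls.

```python
import shlex

CLUSTER_KINDS = ["clusterrole", "clusterrolebinding"]


def slice_commands(cluster_kinds, cluster_dir="cluster/", rest_dir="resources/"):
    """Build one call for cluster-scoped kinds and one for everything else."""
    kinds = ",".join(cluster_kinds)
    return [
        ["kubectl-slice", "--include-kind", kinds, "-o", cluster_dir],
        ["kubectl-slice", "--exclude-kind", kinds, "-o", rest_dir],
    ]


for argv in slice_commands(CLUSTER_KINDS):
    print(shlex.join(argv))
```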

All this to say: it sounds like an easy problem but, believe me, we've bounced the idea between a handful of users (Corporate and friends) and we never seem to agree on a solution.

You do seem like you have an idea though, and I would be more than happy to review it! My main concern is that a change that would require a kubeconfig, for example, would be a major, major change that would threaten the stability of the project, and we don't have data on how much that feature would be used. Based on what I can see in public and private (thanks to corporate users that reach out privately), I have yet to find a major userbase that would need a feature like this, but hey, that might be you! 😄

Let me know how you want to proceed!

@patrickdappollonio patrickdappollonio added the community-feedback Asking the community what the best solution might be label Nov 1, 2024
@patrickdappollonio
Owner

As a side note: these mismatches of "what's true" are also the main reason Kubernetes' "server-side apply" exists. Their KEP has some good insight about it but it's a bit of a lengthy read.

Also, these threads might shed some light on the nuances of "namespaced resources" (links in no particular order):

@patrickdappollonio
Owner

patrickdappollonio commented Nov 1, 2024

One last note, from a friend who works for a Fortune 500 company that actively uses this project: the inability of kubectl-slice to interact with a Kubernetes cluster isn't a bug, but a feature. If we were to enable that option and people started to use it, major companies would need several extra steps to get the compliance approvals needed to continue upgrading this tool.

Today they can get away with it because kubectl-slice is just a templating tool. That's something worth keeping in mind.

I'm still not against adding the feature, but it would have to be gated enough that the pitch to DevSecOps folks or plain security teams is an easy sell as well.

@reegnz
Author

reegnz commented Nov 11, 2024

Whoa there, I never said that kubectl-slice should read from the cluster; you're reading too much into this.

You should still be able to provide information about what APIs you are targeting. The tool doesn't need to be SMART about this, it just needs to allow the user to provide additional information.
I'm not asking for detection. I'm asking that the tool accept the missing information in the form of additional config.
This ticket is not asking for the tool to use any information that is not already provided locally.

I'm presenting this as an example because I've written a very similar tool in Python that we use in our production pipeline. In that tool we feed in the resource scope information as a config like this, so we can route the files into the appropriate folders (namespaced resources into the right namespace folder, cluster scoped resources into the cluster folder):

---
scopes:
- group: ''
  kind: Namespace
  scope: cluster
- group: apps
  kind: Deployment
  scope: namespace
- group: apps
  kind: DaemonSet
  scope: namespace
- group: apps
  kind: StatefulSet
  scope: namespace
- group: batch
  kind: Job
  scope: namespace
- group: batch
  kind: CronJob
  scope: namespace
- group: ''
  kind: ConfigMap
  scope: namespace
- group: ''
  kind: Service
  scope: namespace
- group: ''
  kind: ServiceAccount
  scope: namespace
- group: storage.k8s.io
  kind: StorageClass
  scope: cluster
- group: networking.k8s.io
  kind: Ingress
  scope: namespace
- group: rbac.authorization.k8s.io
  kind: ClusterRole
  scope: cluster
- group: rbac.authorization.k8s.io
  kind: ClusterRoleBinding
  scope: cluster
- group: storage.k8s.io
  kind: CSIDriver
  scope: cluster
 etc.

I really think that if the tool allows the user to bring this information to template better paths, it keeps complexity low, maintains backward compatibility, and allows users to achieve what is promised right over here in the why document.
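
To make the routing idea concrete, here is a minimal Python sketch of how a scope table like the one above could drive output paths. It is not the internal tool mentioned in this comment and not kubectl-slice code: the table is written as a Python dict instead of being loaded from YAML, and the names `SCOPES`, `target_path`, and the `default_namespace` fallback are assumptions. File names follow the `Kind.name.yaml` pattern from the earlier example layout.

```python
from pathlib import PurePosixPath

# A few entries from the scope config, keyed by (group, kind).
SCOPES = {
    ("", "Namespace"): "cluster",
    ("", "ServiceAccount"): "namespace",
    ("apps", "Deployment"): "namespace",
    ("rbac.authorization.k8s.io", "ClusterRole"): "cluster",
}


def target_path(manifest: dict, default_namespace: str = "default") -> PurePosixPath:
    """Return the relative output path for one manifest."""
    group, sep, _version = manifest["apiVersion"].rpartition("/")
    group = group if sep else ""  # "v1" means the core (empty) group
    kind = manifest["kind"]
    meta = manifest["metadata"]
    filename = f"{kind}.{meta['name']}.yaml"

    scope = SCOPES.get((group, kind))
    if scope == "cluster":
        return PurePosixPath("cluster") / filename
    if scope == "namespace":
        # Assumption: fall back to a default when the namespace is omitted.
        namespace = meta.get("namespace") or default_namespace
        return PurePosixPath("namespaces") / namespace / filename
    # Assumption: fail loudly on unknown kinds instead of guessing.
    raise KeyError(f"no scope configured for group={group!r} kind={kind!r}")


deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "external-dns", "namespace": "external-dns"},
}
print(target_path(deployment))
```

Failing on unknown kinds is one possible design choice; a real implementation might prefer a configurable fallback folder.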

I'm still good with our homegrown tool to split manifests (actually contemplating a rewrite in Go and open-sourcing it now). I just thought it was interesting that the promise made in your documentation isn't lived up to by the tool, so I opened the ticket to point that out and respectfully ask that the promise actually be supported by the tool. Otherwise it's pretty misleading, and I'd ask you to drop that example from the docs, since it is presented as if the tool were able to render that folder structure. It clearly isn't that capable.

Especially this part, that really made me anticipate adopting the tool just to fall flat on my face:

Where resources that are globally scoped live in the cluster/ folder -- or the folder designated by the service or application -- and namespace-specific resources live inside namespaces/$NAME/.

@reegnz
Author

reegnz commented Nov 11, 2024

As for DevSecOps, our security folks prefer to see cluster scoped resources (clusterroles and whatnot) in their own separate folder, as it's easier to build checks on simple glob rules without having to enumerate all possible cluster-scoped resource names.
That's just another data point when it comes to DevSecOps.
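
A minimal sketch of the kind of glob rule described here, using Python's standard `fnmatch` module. The file paths and the variable names are illustrative assumptions; the point is that a single pattern on the folder replaces a list of every cluster-scoped kind.

```python
from fnmatch import fnmatch

# Hypothetical list of files touched by a change.
changed_files = [
    "cluster/ClusterRole.external-dns.yaml",
    "namespaces/external-dns/Deployment.external-dns.yaml",
    "namespaces/external-dns/ServiceAccount.external-dns.yaml",
]

# One rule covers every cluster-scoped resource, whatever its kind.
needs_security_review = [path for path in changed_files if fnmatch(path, "cluster/*")]
print(needs_security_review)
```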

@patrickdappollonio
Owner

Haha it's not about "reading too much", you opened a can of worms we've discussed (unfortunately in private) with a bunch of users and folks with interests in this project, so I needed to provide full context.

The idea you propose makes sense. It's similar to a separate idea of a knowledge data source that could initially be embedded in the binary with some of the built-in, common CRDs. The two main issues that arose once we tried this in a few places were:

  • It makes decisions super difficult to debug, unless we also produce some sort of additional output where the decision making is printed
  • It doesn't account for CRD versions, where the CRD itself might've changed from an older to a newer version: the older could've been cluster-scoped and the newer could've been moved to namespaces
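
Both concerns can be sketched in a few lines of Python. The lookup below returns the reason alongside the decision so it can be printed, and it keys the table on (group, version, kind). All names are hypothetical, including the `Widget` CRD. One caveat: keying on the API version only helps when the scope change coincided with a new apiVersion; two releases of a CRD that share the same apiVersion remain indistinguishable without looking at a cluster.

```python
import sys
from typing import NamedTuple


class Decision(NamedTuple):
    scope: str
    reason: str


# Hypothetical CRD whose scope differs between API versions.
SCOPES = {
    ("corporate.internal", "v1alpha1", "Widget"): "cluster",
    ("corporate.internal", "v1", "Widget"): "namespace",
}


def decide(api_version: str, kind: str) -> Decision:
    """Look up the scope and record why that answer was chosen."""
    group, sep, version = api_version.rpartition("/")
    group = group if sep else ""
    key = (group, version, kind)
    if key in SCOPES:
        return Decision(SCOPES[key], f"matched scope entry {key}")
    return Decision("unknown", f"no scope entry for {key}")


for api_version in ("corporate.internal/v1alpha1", "corporate.internal/v1"):
    decision = decide(api_version, "Widget")
    # Print the trace to stderr so it does not mix with regular output.
    print(f"{api_version} Widget -> {decision.scope} ({decision.reason})", file=sys.stderr)
```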

The latter is the critically important one. While I do acknowledge it's an anti-pattern (you shouldn't "move" a CRD but perhaps just come up with a new one), I can't control what companies might do internally with their own toy CRD objects. It feels "strange" making the decision for them, and although they don't have to use the feature if they don't need it, troubleshooting it without a target cluster is kind of difficult, especially across version differences.

Regarding the why.md file explanation, take it just as that: it was what prompted me to create the tool but perhaps the output I was expecting back in the day was not what the tool ended up creating. Maybe you read too much into that one 😄

All in all though, I stand by what I said before, I'm not against the feature being there. I would love to see something first, where we can share it with longtime users of this tool, gather some feedback then get it merged. I'm happy to hear you don't want detection from a cluster either, I just think figuring out a path forward that works for most users is what's needed.

Happy to hear your thoughts, or if you want to work on this, I'm down to review any code or PR!

@reegnz
Author

reegnz commented Nov 12, 2024

I was really bummed out when I read the why, and said 'damn, this is what we want, I can throw out some bespoke code' and then it turned out I can't. :)

Anyway, I do understand how difficult this could prove to be, but it would still be nice to try and support the use-case. It would be used by at least a couple of SREs at a smaller multi-billion dollar company. 😜

I'll play around with the code a bit, maybe I can get something working to move the discussion a bit further.

@reegnz
Author

reegnz commented Nov 12, 2024

An example for an alternative extension mechanism is how gomplate does it:
https://docs.gomplate.ca/usage/#--plugin
Given that this feature only needs to declare a new template function, maybe that's also a path forward.
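
As a rough sketch of the plugin idea: a plugin in this style is an external executable that the template engine calls with the function's arguments and whose standard output becomes the function's result. The Python script below could play that role for a hypothetical `scope` template function. kubectl-slice has no such plugin mechanism today, and the table and argument order are assumptions.

```python
import sys

# Hypothetical scope table, keyed by (group, kind).
SCOPES = {
    ("", "Namespace"): "cluster",
    ("apps", "Deployment"): "namespace",
    ("rbac.authorization.k8s.io", "ClusterRole"): "cluster",
}


def main(argv: list) -> str:
    """Expect two arguments, group and kind, and return the scope."""
    if len(argv) != 2:
        return "unknown"
    group, kind = argv
    return SCOPES.get((group, kind), "unknown")


if __name__ == "__main__":
    # The caller would read the scope from standard output.
    print(main(sys.argv[1:]))
```

This keeps the scope knowledge entirely outside the tool, at the cost of spawning a process per lookup.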
