Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Knative should not modify the istio-proxy container spec #15367

Open
hamishforbes opened this issue Jul 1, 2024 · 3 comments
Open

Knative should not modify the istio-proxy container spec #15367

hamishforbes opened this issue Jul 1, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@hamishforbes
Copy link

Context

Istio provides 2 mechanisms for customising the spec of the istio-proxy sidecar container.
Limited customisation is possible via annotations (.e.g sidecar.istio.io/proxyCPU)
but you can also include a container named 'istio-proxy' with image: 'auto' in your pod spec.
The istio injection webhook then merges this with the rest of the required config for the sidecar to work correctly.

So you end up with a pod spec that looks something like

apiVersion: v1
kind: Pod
metadata:
  name: foobar
spec:
  containers:
    - name: my-app
      image: "foobar:latest"
    - name: istio-proxy
      image: 'auto'
      resources: {...} # etc

The problem is that in knative-serving this container gets passed through makeContainer and the containerMask

Most of the time these modifications are pretty benign and don't cause any problems, a few superfluous env vars etc, no big deal.

However...
I am trying to enable the istio/kubernetes native sidecars functionality.
When this is active the istio injector mutating webhook adds a default lifecycle to the istio-proxy container.

This is incompatible with the lifecycle added by knative. You end up with httpGet and exec defined in the preStop and the cluster rejects the pod creation

lifecycle:
  preStop:
    httpGet:
      path: /wait-for-drain
      port: 8022
      scheme: HTTP
    exec:
      command:
      - pilot-agent
      - request
      - --debug-port=15020
      - POST
      - drain
spec.initContainers[1].lifecycle.preStop.httpGet: Forbidden: may specify more than 1 handler type

You could argue that the Istio mutating webhook should remove the httpGet field to ensure the preStop is valid.
But ideally knative shouldn't be interfering with this container spec either.

I've tested a quick fix in one of our dev clusters by adding this to the BuildUserContainers function

if rev.Spec.PodSpec.Containers[i].Name == "istio-proxy" {
	containers = append(containers, rev.Spec.PodSpec.Containers[i])
	continue
}

which works fine but there might be a better solution you would like to implement.

What version of Knative?

1.13.0

Expected Behavior

istio-proxy container should be untouched

Actual Behavior

Knative modifies the istio-proxy container spec resulting in an invalid pod spec preventing pods from being launched

Steps to Reproduce the Problem

Install istio and knative.
Enable native sidecars by setting values.pilot.env.ENABLE_NATIVE_SIDECARS=true in Helm.
Any knative services will fail to create pods

@hamishforbes hamishforbes added the kind/bug Categorizes issue or PR as related to a bug. label Jul 1, 2024
@skonto
Copy link
Contributor

skonto commented Jul 11, 2024

Hi @hamishforbes, makeContainer does the following:

func makeContainer(container corev1.Container, rev *v1.Revision) corev1.Container {
	// Adding or removing an overwritten corev1.Container field here? Don't forget to
	// update the fieldmasks / validations in pkg/apis/serving
	container.Lifecycle = userLifecycle
	container.Env = append(container.Env, getKnativeEnvVar(rev)...)

	// Explicitly disable stdin and tty allocation
	container.Stdin = false
	container.TTY = false
	if container.TerminationMessagePolicy == "" {
		container.TerminationMessagePolicy = corev1.TerminationMessageFallbackToLogsOnError
	}

	if container.ReadinessProbe != nil {
		if container.ReadinessProbe.HTTPGet != nil || container.ReadinessProbe.TCPSocket != nil || container.ReadinessProbe.GRPC != nil {
			// HTTP, TCP and gRPC ReadinessProbes are executed by the queue-proxy directly against the
			// container instead of via kubelet.
			container.ReadinessProbe = nil
		}
	}

	return container
}

Once you put a container in the Knative spec it is kind of expected to be managed by Knative and not all projects conform to that, so I view this is issue an enhancement request for the Istio integration.
I suspect we could keep some of the logic and avoid lifecycle, readinessprobe for this special case (since we are integrating with Istio anyway) eg. have a special annotation that selects a specific container as a proxy or use the convention of the name above. I am wondering though if you could fix this via another webhook that modifies the pod (webhooks are applied based on alphabetical orde afaik, a bit of a hack).
@dprotaso @ReToCode anything to add here?

Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 10, 2024
@hamishforbes
Copy link
Author

/remove-lifecycle stale
still an issue

@knative-prow knative-prow bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

2 participants