Feature Request: Provide an MCS Controller Implementation Using Sveltos’s Event Framework #435

Open
kahirokunn opened this issue Jan 14, 2025 · 4 comments

@kahirokunn
Contributor

kahirokunn commented Jan 14, 2025

Summary

We propose a feature (or guideline documentation) illustrating how to implement an MCS (Multi-Cluster Services) controller using Sveltos’s Event Framework. This controller detects the presence of a ServiceExport in a “source” cluster, then automates the creation and maintenance of:

  1. A “derived Service” named derived-$hashServiceExport that inherits key fields (e.g., ports, selectors) from the exported Service.
  2. One EndpointSlice per source cluster, named derived-$hashServiceExport-$clusterId, which reflects the actual Pod endpoints from that source.
  3. A single ServiceImport that references the derived Service and unifies multi-cluster discoverability.

The solution adheres to KEP-1645, following the principle of namespace sameness. It manages ClusterIP, Headless, and LoadBalancer/NodePort services consistently, while explicitly disallowing ExternalName services.

Background

The Kubernetes MCS API standardizes how services can be exported from one cluster and discovered in others. By pairing this with Sveltos’s Event Framework, we can fully automate the “service export → derived service + endpointslice + serviceimport” pipeline. This removes the burden of manually provisioning these resources across multiple clusters.

Proposed Baseline Example

Below are minimal YAML snippets representing the core resources that will be generated or updated whenever a new ServiceExport is detected in a source cluster. The controller computes “$hashServiceExport” from the ServiceExport’s name (refer to the official MCS implementation for details).

# 1) ServiceExport (Source of truth in the exporting cluster)
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: sample-service
  namespace: default
---
# 2) ServiceImport (created if not already present in the importing clusters)
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: sample-service
  namespace: default
  annotations:
    multicluster.kubernetes.io/derived-service: "derived-$hashServiceExport"
spec:
  # Type can be:
  # - ClusterSetIP (for ClusterIP/LoadBalancer/NodePort Services)
  # - Headless (for clusterIP: None)
  type: ClusterSetIP
  ports:
    - name: http
      port: 80
      protocol: TCP
  ips:
    - "<the-derived-service-clusterIP-or-NONE>"
status:
  clusters:
    - cluster: cluster-a
---
# 3) Derived Service (always created for ClusterIP, LoadBalancer, NodePort, or Headless)
#    Named derived-$hashServiceExport. It inherits the relevant ports from the original service.
apiVersion: v1
kind: Service
metadata:
  name: derived-$hashServiceExport
  namespace: default
  labels:
    multicluster.kubernetes.io/service-name: "sample-service"
    multicluster.kubernetes.io/service-imported: "true"
    app.kubernetes.io/managed-by: sveltos
  ownerReferences:
    - apiVersion: multicluster.k8s.io/v1alpha1
      kind: ServiceImport
      name: sample-service
spec:
  type: ClusterIP # For NodePort/LoadBalancer, convert to ClusterIP
  # or clusterIP: None for a Headless service
  selector:        # Ensures namespace sameness: keep the original service's selector
    app: sample
  ports:
    - name: http
      port: 80
      targetPort: 8080
---
# 4) EndpointSlice (one per source cluster) reflecting the actual Pod endpoints
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: derived-$hashServiceExport-clustera
  namespace: default
  labels:
    kubernetes.io/service-name: sample-service
    multicluster.kubernetes.io/service-name: sample-service
    cluster.x-k8s.io/cluster-name: cluster-a
    endpointslice.kubernetes.io/managed-by: sveltos
  ownerReferences:
    - apiVersion: multicluster.k8s.io/v1alpha1
      kind: ServiceImport
      name: sample-service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "10.0.1.1"
    conditions:
      ready: true
    nodeName: node-1
  - addresses:
      - "10.0.1.2"
    conditions:
      ready: true
    nodeName: node-2

Workflow

  1. A ServiceExport resource is created in a source cluster.
  2. Sveltos’s Event Framework detects the new ServiceExport and retrieves information about the corresponding Service (e.g., type, ports, selectors).
  3. The controller generates (or updates):
    • A ServiceImport object (type = ClusterSetIP or Headless, depending on the original Service), referencing a derived service.
    • A “derived-$hashServiceExport” Service that captures either the ClusterIP or Headless specs.
    • One EndpointSlice “derived-$hashServiceExport-$clusterId” per source cluster reflecting actual Pod endpoints.
  4. For LoadBalancer or NodePort services, the derived Service is set to ClusterIP.
  5. For headless services, clusterIP: None is used on the derived Service and an EndpointSlice is still created for DNS-based resolution.
  6. If the service type is ExternalName, the ServiceExport is marked invalid (via a status condition), and no derived resources are created.
  7. Finalizers or OwnerReferences ensure that if the ServiceExport is removed, the derived Service and EndpointSlice(s) are cleaned up automatically.
  8. These resources are synchronized to all member clusters in the ClusterSet, providing consistent multi-cluster service discovery.
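
The type handling in steps 3 to 6 can be sketched as a small decision function. This is only an illustration in Python under stated assumptions: `plan_derived_resources` and the keys of the dictionary it returns are hypothetical names, not part of Sveltos or the MCS API, and the input is assumed to be the `spec` of the exported Service as a plain dictionary.

```python
from typing import Optional


def plan_derived_resources(service_spec: dict) -> Optional[dict]:
    """Return the ServiceImport type and derived Service settings, or None if not exportable.

    The returned keys are illustrative only (hypothetical, not an API).
    """
    svc_type = service_spec.get("type", "ClusterIP")  # Kubernetes defaults to ClusterIP

    # Step 6: ExternalName services are not exportable; no derived resources.
    if svc_type == "ExternalName":
        return None

    # Step 5: headless services keep clusterIP: None on the derived Service.
    if service_spec.get("clusterIP") == "None":
        return {"importType": "Headless", "derivedType": "ClusterIP", "derivedClusterIP": "None"}

    # Steps 3 and 4: ClusterIP, NodePort and LoadBalancer all become ClusterIP.
    return {"importType": "ClusterSetIP", "derivedType": "ClusterIP", "derivedClusterIP": None}
```

For example, `plan_derived_resources({"type": "NodePort"})` yields a `ClusterSetIP` import backed by a `ClusterIP` derived Service, while an ExternalName spec yields `None`, which the caller would turn into a status condition on the ServiceExport.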

Benefits

  • Fully Automated Creation: Eliminates manual steps for provisioning MCS resources across clusters.
  • Unified Approach for Multiple Service Types: ClusterIP, NodePort, and LoadBalancer services are all converted consistently (with optional special handling for headless).
  • Clean Resource Ownership: OwnerReferences let Kubernetes handle garbage collection automatically, simplifying lifecycle management.
  • DNS-Friendly for Headless Services: CoreDNS can resolve these exported headless services, thanks to the EndpointSlice objects.
  • Extensible Design: Sveltos’s Event Framework can be extended to track additional resource states or integrate with other multi-cluster components if needed.

By following the strategies outlined above and implementing them in Sveltos’s Event Framework, platform engineers can reliably export services from any source cluster and consume them with minimal configuration overhead. This significantly accelerates multi-cluster use cases—whether for high availability, traffic optimization, or cross-environment integrations—without inventing a proprietary approach.

@gianlucam76
Member

Thank you @kahirokunn

Let me summarise it to see if I got it.

  • When a ServiceExport is created in ClusterA:
    1. Sveltos gets the corresponding Kubernetes Service and Endpoints and creates an EndpointSlice.
      • This is supposed to be created in the other clusters that are part of the ClusterSet.
      • In your example the port is https/443. Is this a constant?
    2. Sveltos creates a ServiceImport in the other clusters that are part of the ClusterSet.
    3. If the corresponding Kubernetes Service does not exist, Sveltos creates a Service with no selector and type ClusterSetIP in the other clusters that are part of the ClusterSet.

Is that correct?

@kahirokunn
Contributor Author

kahirokunn commented Jan 14, 2025

Thank you @gianlucam76
Yes, that's correct! Let me clarify about the port:

The port (443/https in the example) is not a constant - it matches exactly with the port of the Kubernetes Service that has the same name as the ServiceExport.

Everything else in your summary is accurate!

Also, one important point to add: Even if you have three clusters in a clusterset and two of them create ServiceExports, there should only be one ServiceImport created.

@gianlucam76
Member

Thank you. This is achievable already with Sveltos. Next week (I am pretty tight this week), I will prepare the Sveltos configuration for this and share.
We can make sure we create only one ServiceImport by collecting all ServiceExports in the management cluster and then aggregating from there before posting ServiceImports in the other clusters.

I might need your help testing it though.

@kahirokunn
Contributor Author

kahirokunn commented Jan 15, 2025

Please allow me to re-read the KEP and share the revised version with you.

MCS Controller Implementation Guide Using Sveltos - Revised

Overview

This guide details how to automate the implementation of the Kubernetes Multi-Cluster Services (MCS) API using Sveltos's Event Framework.
Based on KEP-1645 and the principle of namespace sameness, we present accurate conversion patterns for each service type.

In this revised document, we introduce the concept of creating derived Services named “derived-$hashServiceExport,” where $hash is computed from the ServiceExport name. Refer to the following implementation for hashing:
https://github.com/kubernetes-sigs/mcs-api/blob/b4f72b8c11b640b049a2c247994a2de3eb0dda75/pkg/controllers/common.go#L39-L43
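
As a rough illustration of what such a name derivation can look like, here is a minimal Python sketch. It is an assumption, not a port of the linked code: it hashes "namespace/name" with SHA-256, base32-encodes the digest, and keeps the first 10 characters. The function name `derived_name`, the inclusion of the namespace, and the truncation length are all hypothetical choices; the linked implementation remains the authoritative definition of the hash.

```python
import base64
import hashlib


def derived_name(namespace: str, name: str, length: int = 10) -> str:
    """Build a deterministic, DNS-safe name for the derived Service (illustrative only)."""
    digest = hashlib.sha256(f"{namespace}/{name}".encode("utf-8")).digest()
    # Standard base32 uses A-Z and 2-7, so lowercasing gives valid DNS label characters.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return "derived-" + encoded[:length]
```

Because the name depends only on the ServiceExport's identity, every cluster that runs the same function arrives at the same derived Service name without coordination.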

We also create one EndpointSlice for each source cluster associated with the ServiceExport, named “derived-$hash-$clusterId.” For clarity, the label “multicluster.kubernetes.io/service-name”, set to the exported service's name as in the examples below, is added to both the EndpointSlice and the derived Service. Additionally, we establish OwnerReferences from the ServiceImport to the Service, and from the Service to the EndpointSlice.

Processing Patterns by Service Type

For reference:
https://github.com/kubernetes/enhancements/blob/master/keps/sig-multicluster/1645-multi-cluster-services-api/README.md#clusterset-service-behavior-expectations

1. ClusterIP / LoadBalancer / NodePort Services

These service types can be handled with the same processing pattern. Below is an example:

Original Service in Source Cluster (ClusterID: cluster-a)

apiVersion: v1
kind: Service
metadata:
  name: web-service
  namespace: default
spec:
  type: ClusterIP  # or LoadBalancer or NodePort
  selector:
    app: web
  ports:
    - name: http
      port: 80
      targetPort: 8080

ServiceExport (ClusterID: cluster-a)

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: web-service
  namespace: default

Generated ServiceImport (ClusterID: cluster-b)

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: web-service
  namespace: default
  annotations:
    multicluster.kubernetes.io/derived-service: derived-$hashServiceExport
spec:
  type: ClusterSetIP
  ports:
    - name: http
      port: 80
      protocol: TCP
  ips:
    - "10.96.0.1"  # Cluster IP assigned to the derived Service
status:
  clusters:
  - cluster: cluster-a

Generated Service (ClusterID: cluster-b)

apiVersion: v1
kind: Service
metadata:
  name: derived-$hashServiceExport
  namespace: default
  labels:
    multicluster.kubernetes.io/service-name: web-service
    multicluster.kubernetes.io/service-imported: "true"
    app.kubernetes.io/managed-by: sveltos
  ownerReferences:
    - apiVersion: multicluster.k8s.io/v1alpha1
      kind: ServiceImport
      name: web-service
      # other fields (uid, controller, blockOwnerDeletion) required by OwnerReference
spec:
  type: ClusterIP
  selector:  # Selector is maintained based on namespace sameness
    app: web
  ports:
    - name: http
      port: 80
      targetPort: 8080

Generated EndpointSlice (ClusterID: cluster-b)

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: derived-$hashServiceExport-clustera
  namespace: default
  labels:
    kubernetes.io/service-name: web-service
    multicluster.kubernetes.io/service-name: web-service
    cluster.x-k8s.io/cluster-name: cluster-a
    endpointslice.kubernetes.io/managed-by: sveltos
  ownerReferences:
    - apiVersion: multicluster.k8s.io/v1alpha1
      kind: ServiceImport
      name: web-service
      # other fields (uid, controller, blockOwnerDeletion) required by OwnerReference
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "10.0.1.1"
    conditions:
      ready: true
    nodeName: node-a
  - addresses:
      - "10.0.1.2"
    conditions:
      ready: true
    nodeName: node-b

2. Headless Services (clusterIP: None)

Headless services require special handling, typically relying on DNS-based service discovery instead of a VIP (Virtual IP). Initially, one might assume an MCS controller only needs to create ServiceImport objects; however, since CoreDNS uses EndpointSlices as a record source (reference: https://github.com/coredns/multicluster/blob/49f47d950355f793d656aec8a6d198daf1d888b1/multicluster.go#L347-L381), the controller must also create EndpointSlices for each source cluster for correct DNS-based service discovery.

Original Service in Source Cluster (ClusterID: cluster-a)

apiVersion: v1
kind: Service
metadata:
  name: stateful-service
  namespace: default
spec:
  clusterIP: None
  selector:
    app: stateful
  ports:
    - name: http
      port: 80
      targetPort: 8080

ServiceExport (ClusterID: cluster-a)

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: stateful-service
  namespace: default

Generated ServiceImport (ClusterID: cluster-b)

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: stateful-service
  namespace: default
  annotations:
    multicluster.kubernetes.io/derived-service: derived-$hashServiceExport
spec:
  type: Headless
  ports:
    - name: http
      port: 80
      protocol: TCP
status:
  clusters:
  - cluster: cluster-a

Generated EndpointSlice for Headless Service (ClusterID: cluster-b)

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: derived-$hashServiceExport-clustera
  namespace: default
  labels:
    kubernetes.io/service-name: stateful-service
    multicluster.kubernetes.io/service-name: stateful-service
    cluster.x-k8s.io/cluster-name: cluster-a
    endpointslice.kubernetes.io/managed-by: sveltos
  ownerReferences:
    - apiVersion: multicluster.k8s.io/v1alpha1
      kind: ServiceImport
      name: stateful-service
      # other fields (uid, controller, blockOwnerDeletion) required by OwnerReference
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "10.0.2.1"
    conditions:
      ready: true
    nodeName: node-c
  - addresses:
      - "10.0.2.2"
    conditions:
      ready: true
    nodeName: node-d
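
Since the per-cluster EndpointSlice has the same shape for headless and ClusterSetIP exports, its construction can be sketched once. The following Python is a hedged illustration only: `build_endpoint_slice` and its parameters are hypothetical, ownerReferences are omitted because they need the owner's uid, and the label set is a subset of the labels shown in the manifests above.

```python
from typing import List


def build_endpoint_slice(derived: str, service_name: str, namespace: str,
                         cluster_id: str, pod_ips: List[str],
                         port: int, port_name: str = "http") -> dict:
    """Return an EndpointSlice manifest (as a dict) for one source cluster."""
    return {
        "apiVersion": "discovery.k8s.io/v1",
        "kind": "EndpointSlice",
        "metadata": {
            # One slice per source cluster: derived-$hash-$clusterId
            "name": f"{derived}-{cluster_id}",
            "namespace": namespace,
            "labels": {
                "multicluster.kubernetes.io/service-name": service_name,
                "endpointslice.kubernetes.io/managed-by": "sveltos",
            },
        },
        "addressType": "IPv4",
        "ports": [{"name": port_name, "protocol": "TCP", "port": port}],
        # One endpoint entry per Pod IP reported by the source cluster.
        "endpoints": [
            {"addresses": [ip], "conditions": {"ready": True}} for ip in pod_ips
        ],
    }
```

The same call serves `stateful-service` and `web-service`; only the ServiceImport type differs between the two cases.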

3. ExternalName Services

ExternalName services cannot be exported.

Original Service in Source Cluster (ClusterID: cluster-a)

apiVersion: v1
kind: Service
metadata:
  name: external-db
  namespace: default
spec:
  type: ExternalName
  externalName: db.example.com

ServiceExport (ClusterID: cluster-a)

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: external-db
  namespace: default

This ServiceExport will fail with the following status:

apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: external-db
  namespace: default
status:
  conditions:
    - type: InvalidService
      status: "True"
      reason: UnsupportedServiceType
      message: "ExternalName services cannot be exported"

Key Implementation Points

  1. Selector Maintenance

    • Following namespace sameness principles, all imported Services maintain their selectors.
    • This enables Pods with matching labels in the importing cluster to be automatically added as service endpoints when applicable.
  2. (Optional) Derived Service Names

    • For each ServiceExport, create a derived Service named “derived-$hashServiceExport” where $hash is computed from the ServiceExport name.
    • One EndpointSlice per source cluster is created for every exported service. The EndpointSlice is named “derived-$hashServiceExport-$clusterId.”
  3. CoreDNS Integration for Headless Services

    • Although the IP address resolution for headless services is handled by CoreDNS, actual EndpointSlices are still required because they serve as data sources for DNS records.
    • Since there is no difference in EndpointSlices themselves between Headless and ClusterIP services, there is no need for conditional branching - you can process them exactly the same way as EndpointSlices for other services.
  4. Labeling for Multi-Cluster

    • EndpointSlices should include the label “multicluster.kubernetes.io/service-name”, set to the exported service's name, which is required by the KEP.
    • For better UX and resource management, we also recommend adding the same label to the derived Service.
  5. (Optional) OwnerReferences for Automatic Garbage Collection

    • The ServiceImport object is set as the owner of the derived Service.
    • The derived Service is set as the owner of the EndpointSlice.
    • This hierarchy allows you to track resource relationships efficiently and simplifies finalizer handling.
  6. Service Type Conversion

    • LoadBalancer and NodePort services are converted to ClusterIP in the derived Services.
    • For headless services (clusterIP: None), create corresponding headless services in target clusters to maintain DNS resolution consistency. Because the selector is kept, the importing cluster's own EndpointSlice controller can also populate EndpointSlices with the IPs of local Pods matching the headless service's selector, adhering to namespace sameness; CoreDNS then serves records from those EndpointSlices.
    • ExternalName services cannot be exported.
  7. Conflict Resolution

    • Only one ServiceImport is created when the same service is exported from multiple clusters (i.e., we do not create per-cluster ServiceImport objects).
    • Warnings are issued when mixing headless and non-headless services.
    • Conflicts are communicated via ServiceExport Conditions.
  8. Scalability and Error Handling

    • Efficient EndpointSlice updates minimize inter-cluster communication overhead.
    • Validation failures are reflected in ServiceExport status conditions.
    • A retry mechanism should handle transient resource creation or update issues.
  9. (Optional) Monitoring and Debugging

    • Resources are labeled appropriately for easier traceability and debugging.
    • Events are recorded for state changes.
    • Metrics can be exported to monitor the MCS controller’s performance and potential bottlenecks.
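
The conflict-resolution point above (one ServiceImport per exported name, regardless of how many clusters export it) can be sketched as a grouping step over the collected ServiceExports. This Python is an illustration under assumptions: the input records and the `aggregate_exports` function are hypothetical, and when headless and non-headless exports are mixed the sketch simply keeps the type of the first export seen and flags a conflict, which is a simplification rather than a recommended policy.

```python
from typing import Dict, Iterable, List, Tuple


def aggregate_exports(exports: Iterable[dict]) -> List[dict]:
    """Collapse per-cluster ServiceExports into one ServiceImport summary per service.

    Each export is a dict with keys: cluster, namespace, name, headless (bool).
    """
    grouped: Dict[Tuple[str, str], List[dict]] = {}
    for export in exports:
        key = (export["namespace"], export["name"])
        grouped.setdefault(key, []).append(export)

    service_imports = []
    for (namespace, name), group in grouped.items():
        kinds = {e["headless"] for e in group}
        service_imports.append({
            "namespace": namespace,
            "name": name,
            # Simplification: the first export seen decides the type.
            "type": "Headless" if group[0]["headless"] else "ClusterSetIP",
            "clusters": sorted({e["cluster"] for e in group}),
            # True when headless and non-headless exports are mixed.
            "conflict": len(kinds) > 1,
        })
    return service_imports
```

With three clusters in a ClusterSet and two of them exporting `web-service`, the result contains a single entry whose `clusters` list names both exporters; a `conflict` flag of `True` would be surfaced through ServiceExport conditions.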

Architectural Considerations

  1. Sveltos's Role

    • Detection of ServiceExports.
    • (Optional) Monitoring of ServiceExports.
    • Automatic creation and cleanup of EndpointSlice, derived Service, and ServiceImport objects.
    • State synchronization across clusters.
  2. Namespace Sameness

    • The same namespace structure must exist across clusters to avoid conflicts.
    • Proper synchronization of namespace resources is assumed (for example, through Sveltos’s CRDs or other tooling).
