
extraChecksd.configDataMap is not automatically mounted to cluster check runners pods #1586

Open
ajax-khadzhynov-m opened this issue Dec 20, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@ajax-khadzhynov-m

Conditions:

I have a clusterAgent configured with extraConfd.configDataMap for YAML configs and a clusterChecksRunner configured with extraChecksd.configDataMap for Python scripts.

When creating this configuration, I followed the docs, which only say to add the extraConfd.configDataMap and extraChecksd.configDataMap parameters. My understanding is that these are mounted into the pods automatically; nothing says that the extraChecksd ConfigMap needs to be mounted separately.

My custom resource YAML configuration:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
  namespace: datadog
spec:
  features:
    ...
    clusterChecks:
      enabled: true
      useClusterChecksRunners: true
    ...
  global:
    clusterName: monitoring
    credentials:
      apiSecret:
        keyName: api-key
        secretName: datadog-secret
      appSecret:
        keyName: app-key
        secretName: datadog-secret
    criSocketPath: /run/dockershim.sock
    logLevel: info
    podLabelsAsTags:
      env_name: env_name
      env_tag: env_tag
  override:
    clusterAgent:
      extraConfd:
        configDataMap:
          custom.yaml: |-
            <yaml_data>
      replicas: 2
    clusterChecksRunner:
      containers:
        agent:
          env:
            - name: DD_DISABLE_CLUSTER_NAME_TAG_KEY
              value: "true"
          resources:
            requests:
              cpu: "2"
              memory: 4000Mi
      extraChecksd:
        configDataMap:
          custom.py: |
            <python_script_data>
    nodeAgent:
      env:
        - name: DD_EC2_PREFER_IMDSV2
          value: "true"
        - name: DD_COLLECT_EC2_TAGS
          value: "true"
        - name: DD_SECRET_BACKEND_COMMAND
          value: /readsecret_multiple_providers.sh
      image:
        jmxEnabled: true
      tolerations:
        - operator: Exists

Problem:
As a result, the ConfigMap from extraConfd.configDataMap was created and mounted, but the ConfigMap from extraChecksd.configDataMap was created and not mounted.
I tried different approaches, for example defining both ConfigMaps for a single component, but the result was the same.

ClusterAgent Pod YAML output, where confd exists in both volumes and volumeMounts:

apiVersion: v1
kind: Pod
metadata:
  name: datadog-cluster-agent
  generateName: datadog-cluster-agent
  namespace: datadog
  ...
spec:
  volumes:
    - name: installinfo
      configMap:
        name: datadog-install-info
        defaultMode: 420
    - name: confd
      configMap:
        name: clusteragent-extra-confd
        items:
          ...
          - key: <yaml_configs>.yaml
            path: <yaml_configs>.yaml
          ...
        defaultMode: 420
    - name: logdatadog
      emptyDir: {}
    - name: certificates
      emptyDir: {}
    - name: tmp
      emptyDir: {}
    - name: ksm-core-config
      configMap:
        name: datadog-kube-state-metrics-core-config
        defaultMode: 420
    - name: orchestrator-explorer-config
      configMap:
        name: datadog-orchestrator-explorer-config
        defaultMode: 420
  containers:
    - name: cluster-agent
      image: gcr.io/datadoghq/cluster-agent:7.54.0
      ports:
        ...
      env:
        ...
      resources: {}
      volumeMounts:
        - name: installinfo
          readOnly: true
          mountPath: /etc/datadog-agent/install_info
          subPath: install_info
        - name: confd
          readOnly: true
          mountPath: /conf.d
        - name: logdatadog
          mountPath: /var/log/datadog
        - name: certificates
          mountPath: /etc/datadog-agent/certificates
        - name: tmp
          mountPath: /tmp
        - name: ksm-core-config
          readOnly: true
          mountPath: /etc/datadog-agent/conf.d/kubernetes_state_core.d
        ...

ClusterChecksRunner Pod YAML output, where checksd exists in volumes but is absent from volumeMounts:

apiVersion: v1
kind: Pod
metadata:
  name: datadog-cluster-checks-runner
  generateName: datadog-cluster-checks-runner
  namespace: datadog
spec:
  volumes:
    - name: installinfo
      configMap:
        name: datadog-install-info
        defaultMode: 420
    - name: config
      emptyDir: {}
    - name: remove-corechecks
      emptyDir: {}
    - name: logdatadog
      emptyDir: {}
    - name: tmp
      emptyDir: {}
    - name: checksd
      configMap:
        name: clusterchecksrunner-extra-checksd
        defaultMode: 420
  initContainers:
    - name: init-config
      image: gcr.io/datadoghq/agent:7.54.0
      command:
        - bash
        - '-c'
      args:
        ...
      env:
        ...
      resources: {}
      volumeMounts:
        - name: installinfo
          readOnly: true
          mountPath: /etc/datadog-agent/install_info
          subPath: install_info
        - name: config
          mountPath: /etc/datadog-agent
        - name: logdatadog
          mountPath: /var/log/datadog
        - name: tmp
          mountPath: /tmp
        - name: remove-corechecks
          mountPath: /etc/datadog-agent/conf.d
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
  containers:
    - name: agent
      image: gcr.io/datadoghq/agent:7.54.0
      command:
        - bash
        - '-c'
      args:
        - agent run
      env:
        ...
      resources:
        requests:
          cpu: '2'
          memory: 4000Mi
      volumeMounts:
        - name: installinfo
          readOnly: true
          mountPath: /etc/datadog-agent/install_info
          subPath: install_info
        - name: config
          mountPath: /etc/datadog-agent
        - name: logdatadog
          mountPath: /var/log/datadog
        - name: tmp
          mountPath: /tmp
        - name: remove-corechecks
          mountPath: /etc/datadog-agent/conf.d
        ...

FYI: both ConfigMaps were created successfully as Kubernetes objects in the correct namespace, and there are no errors in the Datadog Operator or Agent logs.

Expected:
The ConfigMap created from extraChecksd.configDataMap should be mounted automatically, just like the one from extraConfd.configDataMap.

Workaround:
Manually mount the created checksd ConfigMap by adding a volumeMounts override to the configuration.
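A minimal sketch of such an override, assuming the operator merges volumeMounts from the override with the ones it generates, and assuming /etc/datadog-agent/checks.d is the custom checks directory in the Agent image:

  override:
    clusterChecksRunner:
      extraChecksd:
        configDataMap:
          custom.py: |
            <python_script_data>
      containers:
        agent:
          volumeMounts:
            # "checksd" is the volume the operator already creates for
            # extraChecksd (see the runner Pod YAML above); only the mount
            # is missing. The mountPath is an assumption: point it at the
            # checks.d directory used by your Agent image.
            - name: checksd
              readOnly: true
              mountPath: /etc/datadog-agent/checks.d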

@tbavelier
Member

tbavelier commented Dec 20, 2024

Hello @ajax-khadzhynov-m ,

Thank you for the report! I was able to reproduce this in my environment. Digging further, the volumeMount is automatically added to the init container of the nodeAgent, so setting:

    nodeAgent:
      extraChecksd:
        configMap:
          name: checks-config
      extraConfd:
        configMap:
          name: confd-config

does result in both the volume and the volumeMount, and the custom check runs there; the operator adds that mount for the node Agent component.

This volumeMount, on the other hand, is not present for the cluster Agent (expected, since it doesn't run any checks), nor on the runner:
volumeMounts := []corev1.VolumeMount{
	common.GetVolumeMountForInstallInfo(),
	common.GetVolumeMountForConfig(),
	common.GetVolumeMountForLogs(),
	common.GetVolumeMountForTmp(),
	common.GetVolumeMountForRmCorechecks(),
}

I'll check with our team if this is expected behaviour, but in the meantime, you should indeed add a volumeMount override if trying to use a custom check on the runner and not the node Agent.

Tracked internally: https://datadoghq.atlassian.net/browse/CECO-1892
A custom build that simply adds common.GetVolumeMountForChecksd() to the volumeMounts array fixes it, but it might have other implications, so this needs to be investigated.
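For illustration, a sketch of that change to the runner's volumeMounts slice shown above (the exact location in the operator source is not shown here):

volumeMounts := []corev1.VolumeMount{
	common.GetVolumeMountForInstallInfo(),
	common.GetVolumeMountForConfig(),
	common.GetVolumeMountForLogs(),
	common.GetVolumeMountForTmp(),
	common.GetVolumeMountForRmCorechecks(),
	// Possible fix: also mount the checksd volume on the runner,
	// mirroring what is already done for the node Agent.
	common.GetVolumeMountForChecksd(),
}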

@tbavelier tbavelier added the bug Something isn't working label Dec 20, 2024
@tbavelier tbavelier changed the title extraChecksd.configDataMap is not automatically mounted to pods extraChecksd.configDataMap is not automatically mounted to cluster check runners pods Dec 20, 2024
@ajax-khadzhynov-m
Author

When you use Cluster Check Runners, a small, dedicated set of Agents runs the cluster checks, leaving the endpoint checks to the normal Agent. This strategy can be beneficial to control the dispatching of cluster checks, especially when the scale of your cluster checks increases.

@tbavelier thanks for your quick response.
In my opinion, we need this functionality for the clusterChecksRunner, which, as stated in the documentation, was created to run checks and offload the node Agents; please correct me if I'm wrong.

@tbavelier
Member

I understand! As you found, there is a workaround in the meantime: adding the volumeMount to the DatadogAgent, even though it is not ideal. Feel free to open a support ticket at https://www.datadoghq.com/support/ for better tracking and to convey how important such a feature is for your organisation. As noted in my edited message above, there is a possible fix, but it needs further investigation.
