[DBMON-6018] ClickHouse support for DBM #22341
Conversation
Force-pushed from c5aeec3 to 4b37573
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e525d34a3
```python
if not rows:
    # No new queries, but still advance checkpoint
    if self._current_checkpoint_microseconds:
        self._save_checkpoint(self._current_checkpoint_microseconds)
        self._last_checkpoint_microseconds = self._current_checkpoint_microseconds
        self._log.debug("Advanced checkpoint (no new completed queries)")
```
Avoid advancing completion checkpoint on query errors
This block advances the checkpoint whenever rows is empty, but _collect_completed_queries() catches query/processing errors and returns [] (see the exception path later in the same file), so a transient ClickHouse error or permission issue will look identical to “no new queries.” In that failure scenario the checkpoint still moves forward, permanently skipping the failed window and losing completion samples. Consider distinguishing “no data” from “error” (e.g., re-raise or return a sentinel) before advancing the checkpoint.
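One way to make that distinction is a sentinel return value. This is a hypothetical sketch (class and attribute names are illustrative, not taken from the PR) of returning a sentinel from the collection path on error, so the caller only advances the checkpoint on a genuinely empty window:

```python
# Sentinel distinct from an empty result, so "query failed" and
# "no new completed queries" are no longer indistinguishable.
_COLLECTION_FAILED = object()

class CompletedQueryCollector:
    def __init__(self):
        self.checkpoint = 100          # last committed checkpoint
        self.pending_checkpoint = 200  # end of the current window

    def _collect_completed_queries(self, fail=False):
        try:
            if fail:
                raise ConnectionError("transient ClickHouse error")
            return []  # genuinely no new completed queries
        except Exception:
            return _COLLECTION_FAILED  # instead of swallowing into []

    def run(self, fail=False):
        rows = self._collect_completed_queries(fail=fail)
        if rows is _COLLECTION_FAILED:
            return  # keep the old checkpoint so the window is retried
        if not rows:
            self.checkpoint = self.pending_checkpoint  # safe to advance

collector = CompletedQueryCollector()
collector.run(fail=True)
assert collector.checkpoint == 100  # error: window is retried, not skipped
collector.run(fail=False)
assert collector.checkpoint == 200  # truly empty window: checkpoint advances
```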
```python
rows = self._collect_metrics_rows()
if not rows:
    # Even if no rows, save the checkpoint to advance the window
    # This prevents re-querying the same empty window repeatedly
    if self._pending_checkpoint_microseconds:
        self._save_checkpoint(self._pending_checkpoint_microseconds)
```
Don’t save metrics checkpoint when query_log load fails
This early-return saves _pending_checkpoint_microseconds when no rows are returned, but _load_query_log_statements() swallows exceptions and returns an empty list on failure. If the query_log fetch fails (e.g., transient connection issue), this path still persists the checkpoint, causing the next run to skip that entire window and drop metrics. Treat error vs empty-result separately (e.g., let the exception bubble or set a failure flag) before saving the checkpoint.
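An alternative to the sentinel approach is letting the exception bubble up to the caller. A minimal sketch, with illustrative names (not the PR's real signatures), of catching the error above the checkpoint-saving path instead of inside the loader:

```python
class MetricsCollector:
    def __init__(self):
        self.saved_checkpoint = None
        self._pending_checkpoint_microseconds = 1_700_000_000_000_000

    def _load_query_log_statements(self, fail=False):
        if fail:
            # Bubble the failure instead of returning [] - an empty list
            # should always mean "the window really had no rows".
            raise ConnectionError("query_log fetch failed")
        return []

    def collect(self, fail=False):
        try:
            rows = self._load_query_log_statements(fail=fail)
        except ConnectionError:
            return  # checkpoint untouched; same window retried next run
        if not rows:
            # Only a true empty window reaches this save.
            self.saved_checkpoint = self._pending_checkpoint_microseconds

m = MetricsCollector()
m.collect(fail=True)
assert m.saved_checkpoint is None  # failure did not persist the checkpoint
m.collect(fail=False)
assert m.saved_checkpoint == 1_700_000_000_000_000
```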
Review from brett0000FF is dismissed. Related teams and files:
- documentation
- clickhouse/assets/configuration/spec.yaml
Force-pushed from 59dc9e0 to 2872c02
Force-pushed from fc7785d to 2df0d33
Force-pushed from 75e49ec to ba84005
```yaml
description: |
  Set to `true` when connecting through a single endpoint that load-balances across multiple nodes.
  When enabled, the agent uses `clusterAllReplicas('default', system.<table>)` to query
```
Do we want to give out details around why we are using this?
I'd probably put extended details in the docs rather than in the spec
makes sense! will update the spec here
```yaml
value:
  type: boolean
  example: false
- name: database_instance_collection_interval
```
Why do we need this as a config at all? Is there any use case for changing it?
discussed offline we'd want to remove this config
```python
# (C) Datadog, Inc. 2019-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
import json
```
nit: use `from datadog_checks.base.utils.format import json` to smartly load a more efficient json library
will fix this
```python
from datadog_checks.base.stubs import datadog_agent


class ClickhouseCheck(AgentCheck):
```
This check should probably extend `DatabaseCheck`: `from datadog_checks.base.checks.db import DatabaseCheck`. That gives access to shared DBM functions and properties.
```python
# Build typed configuration
config, validation_result = build_config(self)
self._config = config
self._validation_result = validation_result
```
Can we emit a DBM agent health event with the config?
```python
self._agent_hostname = None

# _database_instance_emitted: limit the collection and transmission of the database instance metadata
self._database_instance_emitted = TTLCache(
```
Do we need a TTLCache for this? Can we just use a database_instance_last_emitted var or such?
TTLCache is overkill here - will use a plain variable for this
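The variable-based approach could look like this sketch (names and interval are illustrative, not the PR's actual code), assuming a single database instance is tracked per check instance:

```python
import time

class InstanceMetadataEmitter:
    def __init__(self, interval_seconds=300):
        self._interval = interval_seconds
        # Replaces the TTLCache: one timestamp of the last emission.
        self._database_instance_last_emitted = 0.0

    def maybe_emit(self, now=None):
        now = time.time() if now is None else now
        if now - self._database_instance_last_emitted < self._interval:
            return False  # still within the interval; skip emission
        self._database_instance_last_emitted = now
        return True  # caller emits the database instance metadata event

e = InstanceMetadataEmitter(interval_seconds=300)
assert e.maybe_emit(now=1000.0) is True   # first call always emits
assert e.maybe_emit(now=1100.0) is False  # 100s elapsed < 300s interval
assert e.maybe_emit(now=1400.0) is True   # 400s elapsed >= 300s interval
```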
```python
# Only save checkpoint after ALL payloads are successfully submitted
# This ensures we don't lose data if submission fails partway through
if self._pending_checkpoint_microseconds:
```
What happens if we double submit some activity?
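To make the concern concrete: saving the checkpoint only after all payloads go out gives at-least-once delivery, so a failure partway through means the whole window is re-submitted on the next run and the backend can see duplicates for at most one window. A hypothetical sketch (names are illustrative) of that behavior:

```python
class ActivitySubmitter:
    def __init__(self):
        self.checkpoint = 0
        self.submitted = []  # stands in for payloads sent to the backend

    def run(self, payloads, pending_checkpoint, fail_at=None):
        for i, payload in enumerate(payloads):
            if i == fail_at:
                raise RuntimeError("submission failed partway through")
            self.submitted.append(payload)
        # Only reached when every payload went out successfully.
        self.checkpoint = pending_checkpoint

s = ActivitySubmitter()
try:
    s.run(["a", "b", "c"], pending_checkpoint=10, fail_at=2)
except RuntimeError:
    pass
assert s.checkpoint == 0  # checkpoint not advanced on partial failure
s.run(["a", "b", "c"], pending_checkpoint=10)  # retry re-sends "a" and "b"
assert s.submitted == ["a", "b", "a", "b", "c"]
assert s.checkpoint == 10
```

So "a" and "b" are double-submitted once; deduplication (e.g., by query_id plus event timestamp) would have to happen downstream.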
```python
# Do NOT save checkpoint on error - this ensures we retry the same window
return []


def _get_clickhouse_version(self):
```
Should this be in the main check file?
yes that's right! it should be in the main check
```sql
    is_initial_query
FROM {query_log_table}
WHERE
    event_time_microseconds > fromUnixTimestamp64Micro({last_checkpoint_microseconds})
```
Same set of questions here as in statements
```sql
    event_time_microseconds > fromUnixTimestamp64Micro({last_checkpoint_microseconds})
    AND event_time_microseconds <= fromUnixTimestamp64Micro({current_checkpoint_microseconds})
    AND event_date >= toDate(fromUnixTimestamp64Micro({last_checkpoint_microseconds}))
    AND type = 'QueryFinish'
```
This seems highly duplicative with statements. Could the querying/batching/etc be abstracted out to create two minimal jobs that mostly do the same thing? Or should they actually be one job and just collect both on the same interval?
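The suggested refactor could be sketched as a shared base job that owns the window/checkpoint plumbing, with each concrete job supplying only what differs. All names here are hypothetical, for illustration only:

```python
class QueryLogJob:
    """Shared plumbing: windowing, checkpointing, row collection."""

    def __init__(self):
        self.last_checkpoint = 0

    def row_filter(self, row):
        raise NotImplementedError  # each concrete job defines its filter

    def collect(self, rows, current_checkpoint):
        picked = [r for r in rows if self.row_filter(r)]
        self.last_checkpoint = current_checkpoint  # shared checkpoint logic
        return picked

class StatementsJob(QueryLogJob):
    def row_filter(self, row):
        return row["type"] in ("QueryFinish", "ExceptionWhileProcessing")

class CompletedSamplesJob(QueryLogJob):
    def row_filter(self, row):
        return row["type"] == "QueryFinish"

rows = [{"type": "QueryFinish"}, {"type": "ExceptionWhileProcessing"}]
assert len(StatementsJob().collect(rows, 1)) == 2
assert len(CompletedSamplesJob().collect(rows, 1)) == 1
```

Whether this stays as two jobs on separate intervals or collapses into one job collecting both payloads is then mostly a scheduling decision rather than a code-duplication one.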
```python
SERVICE_CHECK_CONNECT = 'can_connect'


def __init__(self, name, init_config, instances):
    super(ClickhouseCheck, self).__init__(name, init_config, instances)
```
If you add in DBM health integration you'll also get things like uncaught errors and missed collection intervals for free.
What does this PR do?
We want to support ClickHouse in DBM, and this PR includes the agent changes to support:
- Query Metrics
- Query Activity
- Query Completion

Note: the default collection interval is 10s for Query Metrics, 1s for Query Activity, and 10s for Completed Query Samples.
The majority of the logic sits in the three new files added: statement_activity.py, statements.py & completed_query_samples.py

Motivation
Review checklist (to be filled by reviewers)
- Add the `qa/skip-qa` label if the PR doesn't need to be tested during QA.
- Add the `backport/<branch-name>` label to the PR and it will automatically open a backport PR once this one is merged