
Query Blocklist#19011

Open
mshahid6 wants to merge 4 commits into apache:master from mshahid6:broker-config-query-blocklist

Conversation


@mshahid6 mshahid6 commented Feb 11, 2026

Fixes #18964

Description

Added a query blocklist for dynamically blocking queries when a rogue app or user is spamming the cluster, without relying on static configs or restarts.

QueryBlocklistRule

Rules are enforced early in QueryLifecycle (after init), which throws a DruidException when a query matches a rule. A query matches a rule only if ALL specified criteria match (AND logic). Null or empty criteria act as wildcards (match everything):

  • dataSources: Matches if ANY datasource in the query intersects with the rule's datasources
  • queryTypes: Matches if the query type is in the rule's query types
  • contextMatches: Matches if ALL key-value pairs in the rule match the query context (exact string match)
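The matching semantics above can be sketched roughly as follows. This is a minimal illustration, not the PR's QueryBlocklistRule: the class name BlocklistRuleSketch, the method signature, and the choice to compare context values via toString() are all assumptions made for the example.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for QueryBlocklistRule; names and signature are assumptions. */
final class BlocklistRuleSketch {
    private final Set<String> dataSources;
    private final Set<String> queryTypes;
    private final Map<String, String> contextMatches;

    BlocklistRuleSketch(Set<String> dataSources, Set<String> queryTypes, Map<String, String> contextMatches) {
        this.dataSources = dataSources;
        this.queryTypes = queryTypes;
        this.contextMatches = contextMatches;
    }

    /** Null or empty criteria act as wildcards. */
    private static boolean isWildcard(Collection<?> criteria) {
        return criteria == null || criteria.isEmpty();
    }

    /** Returns true only if every specified criterion matches (AND logic). */
    boolean matches(Set<String> queryDataSources, String queryType, Map<String, ?> queryContext) {
        // dataSources: ANY datasource of the query may intersect the rule's set.
        boolean dataSourceOk = isWildcard(dataSources)
            || queryDataSources.stream().anyMatch(dataSources::contains);
        // queryTypes: the query type must be one of the rule's types.
        boolean queryTypeOk = isWildcard(queryTypes) || queryTypes.contains(queryType);
        // contextMatches: ALL key-value pairs must match as exact strings.
        // Assumption: non-string context values are compared via toString().
        boolean contextOk = contextMatches == null || contextMatches.isEmpty()
            || contextMatches.entrySet().stream().allMatch(e -> {
                Object actual = queryContext.get(e.getKey());
                return actual != null && e.getValue().equals(actual.toString());
            });
        return dataSourceOk && queryTypeOk && contextOk;
    }
}
```

With this sketch, a rule that specifies all three criteria blocks only queries that satisfy every one of them, while a rule with all criteria null or empty matches every query.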

Query blocklist is managed via the existing coordinator config APIs:

GET /druid/coordinator/v1/config

POST /druid/coordinator/v1/config
  {
    "queryBlocklist": [
      {
        "ruleName": "block-wikipedia-groupbys",
        "dataSources": ["wikipedia"],
        "queryTypes": ["groupBy"],
        "contextMatches": {"priority": "0"}
      }
    ]
  }
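For illustration, the POST above could be built with the JDK's java.net.http client as sketched below. The class name BlocklistRequestSketch and the base URL passed in by the caller are assumptions; only the endpoint path and the JSON body come from the description.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds (but does not send) the blocklist update request. */
final class BlocklistRequestSketch {
    // Same payload as in the PR description, written as a compact JSON string.
    static final String BODY =
        "{\"queryBlocklist\":[{"
            + "\"ruleName\":\"block-wikipedia-groupbys\","
            + "\"dataSources\":[\"wikipedia\"],"
            + "\"queryTypes\":[\"groupBy\"],"
            + "\"contextMatches\":{\"priority\":\"0\"}"
            + "}]}";

    /** coordinatorBaseUrl is an assumption, for example "http://localhost:8081". */
    static HttpRequest build(String coordinatorBaseUrl) {
        return HttpRequest.newBuilder(URI.create(coordinatorBaseUrl + "/druid/coordinator/v1/config"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). How the endpoint treats fields omitted from the payload is not covered by this sketch.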

Considerations

A separate broker-level dynamic config was considered; it could also have served future features such as datasource aliasing, routing rules, or feature flags. Following contributor feedback, this implementation instead reuses the existing CoordinatorDynamicConfig and the coordinator-broker config sync infrastructure. This keeps fewer servers communicating with the metadata store and lets metadata store operations go through central leaders, which would simplify a future built-in metadata store.

The query blocklist follows a "best effort" approach:

  • If BrokerViewOfCoordinatorConfig is null, blocklist check is skipped
  • If CoordinatorDynamicConfig is null (config not yet loaded), queries are allowed to proceed
  • Once config is loaded, blocklist enforcement begins
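The fail-open behavior described above could look roughly like the following. This is a sketch under stated assumptions: BestEffortCheckSketch, NamedRule, and the Supplier-based config view are hypothetical stand-ins for BrokerViewOfCoordinatorConfig and CoordinatorDynamicConfig, not the PR's code.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical illustration of a "best effort" (fail-open) blocklist check. */
final class BestEffortCheckSketch {
    /** Hypothetical rule: a name plus a predicate over some query representation Q. */
    static final class NamedRule<Q> {
        final String name;
        final Predicate<Q> predicate;

        NamedRule(String name, Predicate<Q> predicate) {
            this.name = name;
            this.predicate = predicate;
        }
    }

    /**
     * Returns the name of the first matching rule, or empty if the query may proceed.
     * A null configView (no view injected) or a null rule list (config not yet loaded)
     * both skip the check, so queries are allowed rather than rejected.
     */
    static <Q> Optional<String> findBlockingRule(Supplier<List<NamedRule<Q>>> configView, Q query) {
        if (configView == null) {
            return Optional.empty();
        }
        List<NamedRule<Q>> rules = configView.get();
        if (rules == null) {
            return Optional.empty();
        }
        return rules.stream()
            .filter(rule -> rule.predicate.test(query))
            .map(rule -> rule.name)
            .findFirst();
    }
}
```

A caller would throw the blocking exception only when the returned Optional is present, which keeps the "allow when config is unavailable" decision in one place.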

Rationale: The query blocklist is an operational safety feature, not a security control. Blocking all queries when config is unavailable would be more disruptive than the problem it's trying to solve. Operators can still use coordinator/historical unavailability to enforce hard blocks if needed.

Release note

Added a query blocklist feature for dynamically blocking queries without restarts. Operators can block queries by datasource, query type, or query context through the existing /druid/coordinator/v1/config API. Rules use AND logic (all criteria must match) and are stored in the metadata database.

Key changed/added classes in this PR
  • CoordinatorDynamicConfig - Added queryBlocklist field
  • QueryBlocklistRule - Rule-based query matching
  • QueryLifecycle - Added checkQueryBlocklist() method
  • QueryLifecycleFactory - Injects BrokerViewOfCoordinatorConfig for accessing blocklist
  • QueryBlocklistRuleTest
  • QueryLifecycleTest
  • CoordinatorDynamicConfigTest - Updated constructor tests

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.


@gianm gianm left a comment


This is a useful concept, although I think it should be done in such a way that the Coordinator handles communications with the metadata store. Other things that work this way include lookup definitions, users and roles for basic authentication and authorization, centralized schema, etc.

The rationale for why we've done it this way, in these other cases, is three-fold:

  • Operationally, it's better if a smaller number of servers is communicating with the metadata store, to avoid overloading it and to make it easier to configure security for the metadata store.
  • Syncing dynamic configurations from a central leader (Coordinator) allows updates to propagate more quickly.
  • Someday we may want to make an option to use a builtin metadata store. This will be simplest to implement if metadata store operations go through central leaders. Those leaders would naturally become the place where

The way it can work is:

  1. Move the user-facing POST and GET APIs to the Coordinator.
  2. Brokers pull configs from the Coordinator on startup.
  3. When the Coordinator receives an update to the config, it should push it out to an internal POST API on the Brokers.
  4. Brokers also pull configs from the Coordinator periodically, in case they missed a push.

There are possible race conditions surrounding the push and pull: it's possible a new config is getting pushed out around the same time that the Broker is doing a scheduled pull. To solve these you can include a timestamp in the config object, and have the Brokers reject configs with older timestamps than the current one.
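One way to picture the timestamp guard suggested here is the sketch below. VersionedConfigHolderSketch and its methods are hypothetical names, and treating equal timestamps as "keep the existing config" is an assumption, since the comment only specifies rejecting older ones.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical Broker-side holder that ignores configs older than the current one. */
final class VersionedConfigHolderSketch<T> {
    private static final class Versioned<T> {
        final long timestampMillis;
        final T config;

        Versioned(long timestampMillis, T config) {
            this.timestampMillis = timestampMillis;
            this.config = config;
        }
    }

    private final AtomicReference<Versioned<T>> current = new AtomicReference<>();

    /**
     * Offers a config received via push or scheduled pull.
     * Returns true if it replaced the current config, false if it was rejected as stale.
     * Assumption: a config with an equal timestamp does not replace the existing one.
     */
    boolean offer(long timestampMillis, T config) {
        Versioned<T> candidate = new Versioned<>(timestampMillis, config);
        // updateAndGet retries atomically, so a concurrent push and pull cannot
        // leave the older of the two configs in place.
        Versioned<T> result = current.updateAndGet(existing ->
            (existing == null || candidate.timestampMillis > existing.timestampMillis)
                ? candidate
                : existing);
        return result == candidate;
    }

    /** Returns the current config, or null if none has been accepted yet. */
    T get() {
        Versioned<T> value = current.get();
        return value == null ? null : value.config;
    }
}
```

Both the internal push endpoint and the periodic pull would call offer(), so the ordering decision lives in a single atomic update.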

Please also consider what guarantees should exist around availability of the Broker config. Is it "required", i.e., if the Broker can't fetch it on startup then startup will fail? Or is it "best effort", i.e., if the Broker can't fetch it on startup, we proceed with a null config or default config for some time, until the next scheduled pull succeeds?

@jtuglu1 jtuglu1 self-requested a review February 12, 2026 19:06

@jtuglu1 jtuglu1 left a comment


Did a first pass – overall looking good!

this.validDebugDimensions = validateDebugDimensions(debugDimensions);
this.turboLoadingNodes = Configs.valueOrDefault(turboLoadingNodes, Set.of());
this.cloneServers = Configs.valueOrDefault(cloneServers, Map.of());
this.queryBlocklist = queryBlocklist != null ? ImmutableList.copyOf(queryBlocklist) : Defaults.QUERY_BLOCKLIST;


Is there a reason we are creating an immutable copy of this? Maybe let's use Configs.valueOrDefault to follow convention here with the rest of the containers being serde'd.


private static class Defaults
{
static final List<QueryBlocklistRule> QUERY_BLOCKLIST = ImmutableList.of();


nit: List.of()

throw DruidException.forPersona(DruidException.Persona.USER)
.ofCategory(DruidException.Category.FORBIDDEN)
.build(
"Query blocked by rule[%s]",


Does this log the query ID? It'd be nice to co-locate the query ID and the rule name in a log so users/operators don't need to grep that separately.

* @param query the query to check
* @return true if the query matches this rule, false otherwise
*/
public boolean matches(Query<?> query)


It might be worth caching this result for all 3 types. JVM might be smart and branch predict this every time but since this is called per-query on constant fields, it might be worth explicitly doing this to save ≥ 2 comparisons per query.
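A rough sketch of what this suggestion could mean: compute the three "is this criterion a wildcard?" answers once in the constructor, since the rule's fields never change. CachedWildcardsSketch and its field names are hypothetical, and whether this saves measurable time per query is not established here.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical rule that caches its wildcard checks at construction time. */
final class CachedWildcardsSketch {
    private final Set<String> dataSources;
    private final Set<String> queryTypes;
    private final Map<String, String> contextMatches;

    // Computed once; the rule's fields are immutable after construction.
    private final boolean anyDataSource;
    private final boolean anyQueryType;
    private final boolean anyContext;

    CachedWildcardsSketch(Set<String> dataSources, Set<String> queryTypes, Map<String, String> contextMatches) {
        this.dataSources = dataSources;
        this.queryTypes = queryTypes;
        this.contextMatches = contextMatches;
        this.anyDataSource = dataSources == null || dataSources.isEmpty();
        this.anyQueryType = queryTypes == null || queryTypes.isEmpty();
        this.anyContext = contextMatches == null || contextMatches.isEmpty();
    }

    /** Per-query path reads the cached flags instead of re-checking null/empty. */
    boolean matches(Set<String> queryDataSources, String queryType, Map<String, String> queryContext) {
        return (anyDataSource || queryDataSources.stream().anyMatch(dataSources::contains))
            && (anyQueryType || queryTypes.contains(queryType))
            && (anyContext || contextMatches.entrySet().stream()
                .allMatch(e -> e.getValue().equals(queryContext.get(e.getKey()))));
    }
}
```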

@mshahid6 mshahid6 changed the title from "Broker Dynamic Config and Query Blocklist" to "Query Blocklist" Feb 12, 2026

jtuglu1 commented Feb 13, 2026

Another thing I forgot to mention is docs + UI, namely:

Docs:

  1. Add the field to the coordinator dynamic config description, explaining what it does.
  2. Separate documentation section for the rulesets. What the rule format looks like (description of each field, its default, etc.). The AND logic between the fields' predicates.

UI:
For completeness, since the coordinator config is already in the UI, do we want to add this as a field? Should be pretty simple to do with an AI tool.

Development

Successfully merging this pull request may close these issues.

Broker-level dynamic config + query blocklist

3 participants