id	title	sidebar_label
filters	Query filters	Filters

Apache Druid supports two query languages: Druid SQL and native queries. This document describes the native language. For information about filtering in Druid SQL, refer to the SQL documentation.

A filter is a JSON object indicating which rows of data should be included in the computation for a query. It is essentially the equivalent of the WHERE clause in SQL. Filters are commonly applied to dimensions, but they can also be applied to aggregated metrics; for examples, see Filtered aggregator and Having filters.

Apache Druid supports the following types of filters.

Selector filter

The simplest filter is a selector filter. The selector filter will match a specific dimension with a specific value. Selector filters can be used as the base filters for more complex Boolean expressions of filters.

The grammar for a SELECTOR filter is as follows:

"filter": { "type": "selector", "dimension": <dimension_string>, "value": <dimension_value_string> }

This is the equivalent of WHERE <dimension_string> = '<dimension_value_string>' or WHERE <dimension_string> IS NULL (if the value is null).

The selector filter supports the use of extraction functions; see Filtering with Extraction Functions for details.
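
As a concrete illustration of the grammar above, here is a minimal sketch of a selector filter. The dimension name country and the value "US" are hypothetical and stand in for whatever your datasource contains.

```json
{
  "type": "selector",
  "dimension": "country",
  "value": "US"
}
```

This corresponds to WHERE country = 'US'. Replacing "US" with the JSON literal null (not the string "null") gives the IS NULL form described above.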

Column comparison filter

The column comparison filter is similar to the selector filter, but instead compares dimensions to each other. For example:

"filter": { "type": "columnComparison", "dimensions": [<dimension_a>, <dimension_b>] }

This is the equivalent of WHERE <dimension_a> = <dimension_b>.

dimensions is a list of DimensionSpecs, which makes it possible to apply an extraction function if needed.
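
The following sketch shows how DimensionSpecs might be used here to compare two columns case-insensitively. The column names billing_city and shipping_city are hypothetical, and the example assumes the "extraction" dimension spec and the "lower" extraction function as described in the Druid dimension spec and extraction function documentation.

```json
{
  "type": "columnComparison",
  "dimensions": [
    {
      "type": "extraction",
      "dimension": "billing_city",
      "outputName": "billing_city_lower",
      "extractionFn": { "type": "lower" }
    },
    {
      "type": "extraction",
      "dimension": "shipping_city",
      "outputName": "shipping_city_lower",
      "extractionFn": { "type": "lower" }
    }
  ]
}
```

Under those assumptions, this is roughly WHERE LOWER(billing_city) = LOWER(shipping_city).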

Regular expression filter

The regular expression filter is similar to the selector filter, but it matches with regular expressions instead of exact values. It matches the specified dimension against the given pattern. The pattern can be any standard Java regular expression.

"filter": { "type": "regex", "dimension": <dimension_string>, "pattern": <pattern_string> }

The regex filter supports the use of extraction functions; see Filtering with Extraction Functions for details.
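
A minimal sketch of a regex filter follows. The dimension name page and the pattern are hypothetical. Because the pattern is embedded in a JSON string, each backslash in the Java regular expression must be doubled.

```json
{
  "type": "regex",
  "dimension": "page",
  "pattern": "^/docs/v\\d+/.*$"
}
```

After JSON decoding, the pattern is ^/docs/v\d+/.*$, which is intended to match values such as /docs/v2/index.html.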

Logical expression filters

AND

The grammar for an AND filter is as follows:

"filter": { "type": "and", "fields": [<filter>, <filter>, ...] }

The filters in fields can be any other filter defined on this page.

OR

The grammar for an OR filter is as follows:

"filter": { "type": "or", "fields": [<filter>, <filter>, ...] }

The filters in fields can be any other filter defined on this page.

NOT

The grammar for a NOT filter is as follows:

"filter": { "type": "not", "field": <filter> }

The filter specified at field can be any other filter defined on this page.
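
Because the logical filters accept any other filter as input, they can be nested. The sketch below combines all three; the dimension names country, device, and browser are hypothetical.

```json
{
  "type": "and",
  "fields": [
    { "type": "selector", "dimension": "country", "value": "US" },
    {
      "type": "or",
      "fields": [
        { "type": "selector", "dimension": "device", "value": "mobile" },
        { "type": "selector", "dimension": "device", "value": "tablet" }
      ]
    },
    {
      "type": "not",
      "field": { "type": "selector", "dimension": "browser", "value": "unknown" }
    }
  ]
}
```

This is the equivalent of WHERE country = 'US' AND (device = 'mobile' OR device = 'tablet') AND NOT (browser = 'unknown').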

JavaScript filter

The JavaScript filter matches a dimension against the specified JavaScript function predicate. The filter matches values for which the function returns true.

The function takes a single argument, the dimension value, and returns either true or false.

{
  "type" : "javascript",
  "dimension" : <dimension_string>,
  "function" : "function(value) { <...> }"
}

Example: The following filter matches any value of the dimension name between 'bar' and 'foo':

{
  "type" : "javascript",
  "dimension" : "name",
  "function" : "function(x) { return(x >= 'bar' && x <= 'foo') }"
}

The JavaScript filter supports the use of extraction functions; see Filtering with Extraction Functions for details.

JavaScript-based functionality is disabled by default. Please refer to the Druid JavaScript programming guide for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.

Extraction filter

The extraction filter is now deprecated. The selector filter with an extraction function specified provides identical functionality and should be used instead.

The extraction filter matches a dimension using a specified extraction function. The following filter matches the values for which the extraction function has a transformation entry input_key=output_value, where output_value is equal to the filter value and input_key is present as a dimension value.

Example: The following filter matches dimension values in [product_1, product_3, product_5] for the column product:

{
    "filter": {
        "type": "extraction",
        "dimension": "product",
        "value": "bar_1",
        "extractionFn": {
            "type": "lookup",
            "lookup": {
                "type": "map",
                "map": {
                    "product_1": "bar_1",
                    "product_5": "bar_1",
                    "product_3": "bar_1"
                }
            }
        }
    }
}

Search filter

Search filters can be used to filter on partial string matches.

{
    "filter": {
        "type": "search",
        "dimension": "product",
        "query": {
          "type": "insensitive_contains",
          "value": "foo"
        }
    }
}

property	description	required?
type	This String should always be "search".	yes
dimension	The dimension to perform the search over.	yes
query	A JSON object for the type of search. See search query spec for more information.	yes
extractionFn	Extraction function to apply to the dimension	no

The search filter supports the use of extraction functions; see Filtering with Extraction Functions for details.

Search query spec

Contains

property	description	required?
type	This String should always be "contains".	yes
value	A String value to run the search over.	yes
caseSensitive	Whether two strings should be compared as case sensitive or not	no (default: false)

Insensitive Contains

property	description	required?
type	This String should always be "insensitive_contains".	yes
value	A String value to run the search over.	yes

Note that an "insensitive_contains" search is equivalent to a "contains" search with "caseSensitive": false (or not provided).
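
For comparison with the insensitive_contains example at the top of this section, here is a sketch of a case-sensitive "contains" search on the same product dimension. The search value "Foo" is illustrative only.

```json
{
  "type": "search",
  "dimension": "product",
  "query": {
    "type": "contains",
    "value": "Foo",
    "caseSensitive": true
  }
}
```

Removing "caseSensitive" or setting it to false makes this query spec behave like "insensitive_contains", as noted above.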

Fragment

property	description	required?
type	This String should always be "fragment".	yes
values	A JSON array of String values to run the search over.	yes
caseSensitive	Whether strings should be compared as case sensitive or not. Default: false (insensitive)	no
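
A sketch of a search filter that uses the fragment query spec is shown below. The fragment values are hypothetical. As far as we understand the behavior, a dimension value matches only when it contains every listed fragment; verify this against your Druid version before relying on it.

```json
{
  "type": "search",
  "dimension": "product",
  "query": {
    "type": "fragment",
    "values": ["foo", "bar"],
    "caseSensitive": false
  }
}
```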

In filter

The in filter can be used to express the following SQL query:

SELECT COUNT(*) AS 'Count' FROM `table` WHERE `outlaw` IN ('Good', 'Bad', 'Ugly')

The grammar for an "in" filter is as follows:

{
    "type": "in",
    "dimension": "outlaw",
    "values": ["Good", "Bad", "Ugly"]
}

The "in" filter supports the use of extraction functions; see Filtering with Extraction Functions for details.

If an empty values array is passed to the "in" filter, it will simply return an empty result.

If the dimension is a multi-valued dimension, the "in" filter will return true if one of the dimension values is in the values array.

If the values array contains null, the "in" filter matches null values. This differs from the SQL IN filter, which does not match NULL values.
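
To make the null behavior concrete, the following sketch adds the JSON literal null to the values array of the earlier outlaw example.

```json
{
  "type": "in",
  "dimension": "outlaw",
  "values": ["Good", "Bad", null]
}
```

Based on the rule above, this is roughly WHERE outlaw IN ('Good', 'Bad') OR outlaw IS NULL, whereas a SQL IN list containing NULL would not match null rows.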

Like filter

Like filters can be used for basic wildcard searches. They are equivalent to the SQL LIKE operator. Special characters supported are "%" (matches any number of characters) and "_" (matches any one character).

property	type	description	required?
type	String	This should always be "like".	yes
dimension	String	The dimension to filter on	yes
pattern	String	LIKE pattern, such as "foo%" or "___bar".	yes
escape	String	An escape character that can be used to escape special characters.	no
extractionFn	Extraction function	Extraction function to apply to the dimension	no

Like filters support the use of extraction functions; see Filtering with Extraction Functions for details.

This Like filter expresses the condition last_name LIKE "D%" (i.e. last_name starts with "D").

{
    "type": "like",
    "dimension": "last_name",
    "pattern": "D%"
}
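
The escape property is easiest to see with a pattern that needs a literal "%". In the sketch below, the dimension name discount_label and the choice of "!" as the escape character are hypothetical.

```json
{
  "type": "like",
  "dimension": "discount_label",
  "pattern": "50!%%",
  "escape": "!"
}
```

Here "!%" stands for a literal percent sign and the trailing "%" is still a wildcard, so the filter is meant to match values that start with "50%", such as "50% off".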

Bound filter

Bound filters can be used to filter on ranges of dimension values. They support comparisons such as greater than, less than, greater than or equal to, less than or equal to, and "between" (if both "lower" and "upper" are set).

property	type	description	required?
type	String	This should always be "bound".	yes
dimension	String	The dimension to filter on	yes
lower	String	The lower bound for the filter	no
upper	String	The upper bound for the filter	no
lowerStrict	Boolean	Perform strict comparison on the lower bound (">" instead of ">=")	no, default: false
upperStrict	Boolean	Perform strict comparison on the upper bound ("<" instead of "<=")	no, default: false
ordering	String	Specifies the sorting order to use when comparing values against the bound. Can be one of the following values: "lexicographic", "alphanumeric", "numeric", "strlen", "version". See Sorting Orders for more details.	no, default: "lexicographic"
extractionFn	Extraction function	Extraction function to apply to the dimension	no

Bound filters support the use of extraction functions; see Filtering with Extraction Functions for details.

The following bound filter expresses the condition 21 <= age <= 31:

{
    "type": "bound",
    "dimension": "age",
    "lower": "21",
    "upper": "31",
    "ordering": "numeric"
}

This filter expresses the condition foo <= name <= hoo, using the default lexicographic sorting order.

{
    "type": "bound",
    "dimension": "name",
    "lower": "foo",
    "upper": "hoo"
}

Using strict bounds, this filter expresses the condition 21 < age < 31:

{
    "type": "bound",
    "dimension": "age",
    "lower": "21",
    "lowerStrict": true,
    "upper": "31",
    "upperStrict": true,
    "ordering": "numeric"
}

The user can also specify a one-sided bound by omitting "upper" or "lower". This filter expresses age < 31.

{
    "type": "bound",
    "dimension": "age",
    "upper": "31",
    "upperStrict": true,
    "ordering": "numeric"
}

Likewise, this filter expresses age >= 18:

{
    "type": "bound",
    "dimension": "age",
    "lower": "18",
    "ordering": "numeric"
}
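
The "ordering" property matters because the default is "lexicographic". The sketch below is the first age example with "ordering" omitted, assuming age is stored as a string dimension.

```json
{
  "type": "bound",
  "dimension": "age",
  "lower": "21",
  "upper": "31"
}
```

With string comparison, "25" still falls inside the range, but so does "3", because "21" <= "3" <= "31" when compared character by character. Setting "ordering" to "numeric", as in the examples above, avoids this.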

Interval filter

The Interval filter enables range filtering on columns that contain long millisecond values, with the boundaries specified as ISO 8601 time intervals. It is suitable for the __time column, long metric columns, and dimensions with values that can be parsed as long milliseconds.

This filter converts the ISO 8601 intervals to long millisecond start/end ranges and translates to an OR of Bound filters on those millisecond ranges, with numeric comparison. The Bound filters will have left-closed and right-open matching (i.e., start <= time < end).

property	type	description	required?
type	String	This should always be "interval".	yes
dimension	String	The dimension to filter on	yes
intervals	Array	A JSON array containing ISO 8601 interval strings. This defines the time ranges to filter on.	yes
extractionFn	Extraction function	Extraction function to apply to the dimension	no

The interval filter supports the use of extraction functions; see Filtering with Extraction Functions for details.

If an extraction function is used with this filter, the extraction function should output values that are parseable as long milliseconds.

The following example filters on the time ranges of October 1-7, 2014 and November 15-16, 2014.

{
    "type" : "interval",
    "dimension" : "__time",
    "intervals" : [
      "2014-10-01T00:00:00.000Z/2014-10-07T00:00:00.000Z",
      "2014-11-15T00:00:00.000Z/2014-11-16T00:00:00.000Z"
    ]
}

The filter above is equivalent to the following OR of Bound filters:

{
    "type": "or",
    "fields": [
      {
        "type": "bound",
        "dimension": "__time",
        "lower": "1412121600000",
        "lowerStrict": false,
        "upper": "1412640000000",
        "upperStrict": true,
        "ordering": "numeric"
      },
      {
        "type": "bound",
        "dimension": "__time",
        "lower": "1416009600000",
        "lowerStrict": false,
        "upper": "1416096000000",
        "upperStrict": true,
        "ordering": "numeric"
      }
    ]
}

Filtering with extraction functions

All filters except the "spatial" filter support extraction functions. An extraction function is defined by setting the "extractionFn" field on a filter. See Extraction function for more details on extraction functions.

If specified, the extraction function will be used to transform input values before the filter is applied. The example below shows a selector filter combined with an extraction function. This filter will transform input values according to the values defined in the lookup map; transformed values will then be matched with the string "bar_1".

Example: The following filter matches dimension values in [product_1, product_3, product_5] for the column product:

{
    "filter": {
        "type": "selector",
        "dimension": "product",
        "value": "bar_1",
        "extractionFn": {
            "type": "lookup",
            "lookup": {
                "type": "map",
                "map": {
                    "product_1": "bar_1",
                    "product_5": "bar_1",
                    "product_3": "bar_1"
                }
            }
        }
    }
}
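
Extraction functions are not limited to selector filters. As a further sketch, the "in" filter from earlier on this page can be combined with an extraction function so that matching ignores case. This assumes the "lower" extraction function described in the extraction function documentation; the lowercase values are illustrative.

```json
{
  "type": "in",
  "dimension": "outlaw",
  "values": ["good", "bad", "ugly"],
  "extractionFn": { "type": "lower" }
}
```

Each input value is lowercased before it is compared with the values array, so "Good", "GOOD", and "good" would all be expected to match.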

Column types

Druid supports filtering on timestamp, string, long, and float columns.

Note that only string columns have bitmap indexes. Therefore, queries that filter on other column types will need to scan those columns.

Filtering on numeric columns

When filtering on numeric columns, you can write filters as if they were strings. In most cases, your filter will be converted into a numeric predicate and will be applied to the numeric column values directly. In some cases (such as the "regex" filter) the numeric column values will be converted to strings during the scan.

For example, filtering on a specific value, myFloatColumn = 10.1:

"filter": {
  "type": "selector",
  "dimension": "myFloatColumn",
  "value": "10.1"
}

Filtering on a range of values, 10 <= myFloatColumn < 20:

"filter": {
  "type": "bound",
  "dimension": "myFloatColumn",
  "ordering": "numeric",
  "lower": "10",
  "lowerStrict": false,
  "upper": "20",
  "upperStrict": true
}

Filtering on the timestamp column

Query filters can also be applied to the timestamp column. The timestamp column has long millisecond values. To refer to the timestamp column, use the string __time as the dimension name. As with numeric columns, write timestamp filter values as if they were strings.

If the user wishes to interpret the timestamp with a specific format, timezone, or locale, the Time Format Extraction Function is useful.

For example, filtering on a long timestamp value:

"filter": {
  "type": "selector",
  "dimension": "__time",
  "value": "124457387532"
}

Filtering on day of week:

"filter": {
  "type": "selector",
  "dimension": "__time",
  "value": "Friday",
  "extractionFn": {
    "type": "timeFormat",
    "format": "EEEE",
    "timeZone": "America/New_York",
    "locale": "en"
  }
}
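
The same approach extends to range filters. The following sketch uses a bound filter with the timeFormat extraction function to keep only rows whose local hour falls in business hours. It assumes that the format string "H" yields the hour of day (0 to 23), as in Joda-Time patterns; the 9-to-17 range is illustrative.

```json
{
  "type": "bound",
  "dimension": "__time",
  "lower": "9",
  "upper": "17",
  "upperStrict": true,
  "ordering": "numeric",
  "extractionFn": {
    "type": "timeFormat",
    "format": "H",
    "timeZone": "America/New_York",
    "locale": "en"
  }
}
```

Under that assumption, this expresses 9 <= hour < 17 in the America/New_York time zone. The "numeric" ordering is needed so that "9" is not compared to "17" as a string.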

Filtering on a set of ISO 8601 intervals:

{
    "type" : "interval",
    "dimension" : "__time",
    "intervals" : [
      "2014-10-01T00:00:00.000Z/2014-10-07T00:00:00.000Z",
      "2014-11-15T00:00:00.000Z/2014-11-16T00:00:00.000Z"
    ]
}

True filter

The true filter is a filter which matches all values. It can be used to temporarily disable other filters without removing the filter.

{ "type" : "true" }

Expression filter

The expression filter allows for the implementation of arbitrary conditions, leveraging the Druid expression system.

This filter allows for more flexibility, but it might be less performant than a combination of the other filters on this page because not all filter optimizations are in place yet.

{
    "type" : "expression",
    "expression" : "((product_type == 42) && (!is_deleted))"
}

See the Druid expression system for more details.
