Merge pull request #456 from dbt-labs/feature/add-source-freshness-data
b-per authored Aug 20, 2024
2 parents 68486e5 + 1a89805 commit 087f02d
Showing 9 changed files with 56 additions and 1 deletion.
12 changes: 12 additions & 0 deletions docs/rules/testing.md
@@ -28,6 +28,18 @@ You can optionally extend this test to apply to more node types (`source`,`snaps

Snapshots should always have a multi-field primary key in order to function, while sources and seeds may not. Depending on your expectations for duplicates and null values, different kinds of primary key tests may be appropriate. Consider your use case carefully.

---
## Missing Source Freshness

`fct_sources_without_freshness` ([source](https://github.com/dbt-labs/dbt-project-evaluator/tree/main/models/marts/tests/fct_sources_without_freshness.sql)) lists every source that does not have a source freshness threshold defined. A source is flagged only when it defines neither `warn_after` nor `error_after`; setting either one is enough to pass.

**Reason to Flag**

Source freshness is useful for understanding if your data pipelines are in a healthy state and is a critical component of defining SLAs for your warehouse. Enabling freshness for sources also facilitates [referencing the source freshness results in the selectors](https://docs.getdbt.com/reference/node-selection/methods#the-source_status-method) for a more efficient execution.

**How to Remediate**

Apply a [source freshness block](https://docs.getdbt.com/docs/build/sources#declaring-source-freshness) to the source definition. This can be implemented at either the source name or table name level.
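
As a minimal sketch of the remediation above, the block below sets a default threshold at the source level and overrides it on one table. The source and table names, the `_loaded_at` column, and the threshold values are illustrative assumptions, not taken from this project. The exact placement of `freshness` and `loaded_at_field` can vary by dbt version, so check the linked documentation before copying it.

```yaml
sources:
  - name: my_source                 # hypothetical source name
    # Source-level default, inherited by every table listed below
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    loaded_at_field: _loaded_at     # hypothetical timestamp column used for the freshness check
    tables:
      - name: orders                # hypothetical table; inherits the source-level thresholds
      - name: payments              # hypothetical table
        # Table-level override of the source default
        freshness:
          warn_after: {count: 6, period: hour}
```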

---

## Test Coverage
4 changes: 4 additions & 0 deletions integration_tests/models/staging/source_1/source.yml
@@ -4,6 +4,8 @@ sources:
   - name: source_1
     description: this is source 1.
     schema: real_schema
+    freshness: # default freshness
+      warn_after: {count: 12, period: hour}
     # database: real_database
     tables:
       - name: table_1
@@ -14,6 +16,8 @@ sources:
       - name: table_2
       - name: table_4
       - name: table_5
+        freshness: # default freshness
+          warn_after: null
       - name: raw_table_5
         identifier: table_5

@@ -0,0 +1,3 @@
resource_name
source_2.table_3
source_1.table_5
6 changes: 6 additions & 0 deletions integration_tests/seeds/tests/tests_seeds.yml
@@ -32,3 +32,9 @@ seeds:
- intermediate_test_coverage_pct
- marts_test_coverage_pct
- other_test_coverage_pct

- name: test_fct_sources_without_freshness
tests:
- dbt_utils.equality:
name: equality_fct_sources_without_freshness
compare_model: ref('fct_sources_without_freshness')
2 changes: 2 additions & 0 deletions macros/unpack/get_source_values.sql
@@ -24,6 +24,8 @@
"cast(" ~ dbt_project_evaluator.is_not_empty_string(node.description) | trim ~ " as boolean)",
"cast(" ~ node.config.enabled ~ " as boolean)",
wrap_string_with_quotes(node.loaded_at_field | replace("'", "_")),
"cast(" ~ (dbt_project_evaluator.is_not_empty_string(node.freshness.warn_after.count)
or dbt_project_evaluator.is_not_empty_string(node.freshness.error_after.count)) | trim ~ " as boolean)",
wrap_string_with_quotes(node.database),
wrap_string_with_quotes(node.schema),
wrap_string_with_quotes(node.package_name),
1 change: 1 addition & 0 deletions models/marts/core/int_all_graph_resources.sql
@@ -111,6 +111,7 @@ joined as (
unioned_with_calc.source_name, -- NULL for non-source resources
unioned_with_calc.is_source_described,
unioned_with_calc.loaded_at_field,
unioned_with_calc.is_freshness_enabled,
unioned_with_calc.loader,
unioned_with_calc.identifier,
unioned_with_calc.hard_coded_references, -- NULL for non-model resources
21 changes: 21 additions & 0 deletions models/marts/tests/fct_sources_without_freshness.sql
@@ -0,0 +1,21 @@
with

all_resources as (
select * from {{ ref('int_all_graph_resources') }}
where not is_excluded

),

final as (

select distinct
resource_name

from all_resources
where not is_freshness_enabled and resource_type = 'source'

)

select * from final

{{ filter_exceptions() }}
7 changes: 6 additions & 1 deletion models/marts/tests/testing.yml
@@ -20,4 +20,9 @@ models:
   - name: fct_missing_primary_key_tests
     description: this model has one record for every model without unique and not null tests configured on a single column
     tests:
-      - is_empty
+      - is_empty
+
+  - name: fct_sources_without_freshness
+    description: This table shows each source that does not have a source freshness defined, either as a warn or an error
+    tests:
+      - is_empty
1 change: 1 addition & 0 deletions models/staging/graph/stg_sources.sql
@@ -26,6 +26,7 @@ select
cast(True as boolean) as is_described,
cast(True as boolean) as is_enabled,
cast(null as {{ dbt_project_evaluator.type_string_dpe() }}) as loaded_at_field,
cast(True as boolean) as is_freshness_enabled,
cast(null as {{ dbt_project_evaluator.type_string_dpe() }}) as database,
cast(null as {{ dbt_project_evaluator.type_string_dpe() }}) as schema,
cast(null as {{ dbt_project_evaluator.type_string_dpe() }}) as package_name,