Merge pull request #54 from fivetran/MagicBot/add-union-schema
Feature: Union schema compatibility
fivetran-catfritz authored Oct 12, 2023
2 parents f7156ca + 4335611 commit dc5dada
Showing 25 changed files with 212 additions and 48 deletions.
2 changes: 1 addition & 1 deletion .buildkite/pipeline.yml
@@ -58,7 +58,7 @@ steps:
commands: |
bash .buildkite/scripts/run_models.sh redshift
- label: ":bricks: Run Tests - Databricks"
- label: ":databricks: Run Tests - Databricks"
key: "run_dbt_databricks"
plugins:
- docker#v3.13.0:
30 changes: 28 additions & 2 deletions CHANGELOG.md
@@ -1,9 +1,35 @@
# dbt_linkedin_source v0.UPDATE.UPDATE
# dbt_linkedin_source v0.8.0
[PR #54](https://github.com/fivetran/dbt_linkedin_source/pull/54) includes the following updates:

## Under the Hood:
## Breaking changes
- Updated materializations of non-`tmp` staging models from views to tables. This is to bring the materializations into alignment with other ad reporting packages and eliminate errors in Redshift.
- Updated the name of the source created by this package from `linkedin` to `linkedin_ads`. This brings the naming used in this package into alignment with our other ad packages and makes it compatible with the union schema feature.
- ❗ If you reference this source by name anywhere in your project, you will need to update the name to `linkedin_ads`.
- Updated the following identifiers for consistency with the source name and compatibility with the union schema feature:

| current | previous |
|----------|----------|
| linkedin_ads_account_history_identifier | linkedin_account_history_identifier |
| linkedin_ads_ad_analytics_by_creative_identifier | linkedin_ad_analytics_by_creative_identifier |
| linkedin_ads_campaign_group_history_identifier | linkedin_campaign_group_history_identifier |
| linkedin_ads_campaign_history_identifier | linkedin_campaign_history_identifier |
| linkedin_ads_creative_history_identifier | linkedin_creative_history_identifier |
| linkedin_ads_ad_analytics_by_campaign_identifier | linkedin_ad_analytics_by_campaign_identifier |

- If you are using the previous identifier, be sure to update to the current version!
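
As a rough illustration of the identifier rename, the sketch below shows how a root `dbt_project.yml` might look before and after the upgrade. The table name `my_account_history_table` is a hypothetical placeholder, and only one of the six identifiers is shown; the others would follow the same pattern.

```yml
# dbt_project.yml (root project) -- illustrative sketch only
vars:
  # Name used before v0.8.0; per the table above it has been replaced:
  # linkedin_account_history_identifier: my_account_history_table

  # Name used from v0.8.0 onward (table name is a hypothetical placeholder):
  linkedin_ads_account_history_identifier: my_account_history_table
```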

## Feature update 🎉
- Unioning capability! This adds the ability to union source data from multiple LinkedIn connectors. Refer to the [Union Multiple Connectors README section](https://github.com/fivetran/dbt_linkedin_source/blob/main/README.md#union-multiple-connectors) for more details.

## Under the hood 🚘
- Updated tmp models to union source data using the `fivetran_utils.union_data` macro.
- To distinguish which source each field comes from, added `source_relation` column in each staging model and applied the `fivetran_utils.source_relation` macro.
- Updated tests to account for the new `source_relation` column.

[PR #51](https://github.com/fivetran/dbt_linkedin_source/pull/51) includes the following updates:
- Incorporated the new `fivetran_utils.drop_schemas_automation` macro into the end of each Buildkite integration test job.
- Updated the pull request [templates](/.github).

# dbt_linkedin_source v0.7.0
## 🚨 Breaking Changes 🚨
Due to Linkedin Ads API [change in January 2023](https://learn.microsoft.com/en-us/linkedin/marketing/integrations/recent-changes?view=li-lms-2022-12#january-2023), there have been updates in the Linkedin Ads Fivetran Connector and therefore, updates to this Linkedin package.
16 changes: 12 additions & 4 deletions README.md
@@ -47,7 +47,7 @@ If you are **not** using the [Linkedin transformation package](https://github.c
# packages.yml
packages:
- package: fivetran/linkedin_source
version: [">=0.7.0", "<0.8.0"]
version: [">=0.8.0", "<0.9.0"]
```

## Step 3: Define database and schema variables
@@ -61,7 +61,17 @@ vars:
```

## (Optional) Step 4: Additional configurations
<details><summary>Expand for configurations</summary>
### Union multiple connectors
If you have multiple LinkedIn connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source each record came from in the `source_relation` column of each model. To use this functionality, set either the `linkedin_ads_union_schemas` OR the `linkedin_ads_union_databases` variable (you cannot set both) in your root `dbt_project.yml` file:

```yml
vars:
  linkedin_ads_union_schemas: ['linkedin_usa','linkedin_canada'] # use this if the data is in different schemas/datasets of the same database/project
  linkedin_ads_union_databases: ['linkedin_usa','linkedin_canada'] # use this if the data is in different databases/projects but uses the same schema name
```
Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.
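
Because only one of the two union variables may be set at a time, a project that unions two schemas in the same database might be configured roughly as follows. This is a minimal sketch: the schema names are the placeholders used in the example above, and which variable applies depends on how your connectors are laid out.

```yml
# dbt_project.yml (root project) -- illustrative sketch only
vars:
  # Set exactly one of the two union variables.
  linkedin_ads_union_schemas: ['linkedin_usa', 'linkedin_canada']
  # linkedin_ads_union_databases is left unset when unioning by schema.
```

With a configuration like this, the `source_relation` column in each staging model should indicate which of the listed schemas a record came from.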

### Switching to Local Currency
Additionally, the package allows you to select whether you want to add in costs in USD or the local currency of the ad. By default, the package uses USD. If you would like to have costs in the local currency, add the following variable to your `dbt_project.yml` file:
@@ -112,8 +122,6 @@ vars:
linkedin_ads_<default_source_table_name>_identifier: your_table_name
```

</details>

## (Optional) Step 5: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for more details</summary>

18 changes: 10 additions & 8 deletions dbt_project.yml
@@ -1,18 +1,20 @@
name: 'linkedin_source'
version: '0.7.0'
version: '0.8.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
linkedin_source:
+materialized: view
tmp:
+materialized: view
+schema: linkedin_ads_source
+materialized: table
vars:
linkedin_source:
account_history: "{{ source('linkedin','account_history') }}"
ad_analytics_by_creative: "{{ source('linkedin','ad_analytics_by_creative') }}"
campaign_group_history: "{{ source('linkedin','campaign_group_history') }}"
campaign_history: "{{ source('linkedin','campaign_history') }}"
creative_history: "{{ source('linkedin','creative_history') }}"
ad_analytics_by_campaign: "{{ source('linkedin', 'ad_analytics_by_campaign') }}"
account_history: "{{ source('linkedin_ads','account_history') }}"
ad_analytics_by_creative: "{{ source('linkedin_ads','ad_analytics_by_creative') }}"
campaign_group_history: "{{ source('linkedin_ads','campaign_group_history') }}"
campaign_history: "{{ source('linkedin_ads','campaign_history') }}"
creative_history: "{{ source('linkedin_ads','creative_history') }}"
ad_analytics_by_campaign: "{{ source('linkedin_ads', 'ad_analytics_by_campaign') }}"
linkedin_ads__campaign_passthrough_metrics: []
linkedin_ads__creative_passthrough_metrics: []
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: linkedin_source_integration_tests_1
schema: linkedin_source_integration_tests_3
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: linkedin_source_integration_tests_1
schema: linkedin_source_integration_tests_3
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: linkedin_source_integration_tests_1
schema: linkedin_source_integration_tests_3
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: linkedin_source_integration_tests_1
schema: linkedin_source_integration_tests_3
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: linkedin_source_integration_tests_1
threads: 2
schema: linkedin_source_integration_tests_3
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
4 changes: 2 additions & 2 deletions integration_tests/dbt_project.yml
@@ -1,5 +1,5 @@
name: 'linkedin_source_integration_tests'
version: '0.7.0'
version: '0.8.0'
profile: 'integration_tests'
config-version: 2

@@ -12,7 +12,7 @@ vars:
linkedin_ads_creative_history_identifier: "linkedin_creative_history_data"
linkedin_ads_ad_analytics_by_campaign_identifier: "linkedin_ad_analytics_by_campaign_data"

linkedin_ads_schema: linkedin_source_integration_tests_1
linkedin_ads_schema: linkedin_source_integration_tests_3

seeds:
linkedin_source_integration_tests:
3 changes: 3 additions & 0 deletions models/docs.md
@@ -0,0 +1,3 @@
{% docs source_relation %}
The source of the record if the unioning functionality is being used. If not, this field will be empty.
{% enddocs %}
2 changes: 1 addition & 1 deletion models/src_linkedin.yml
@@ -1,7 +1,7 @@
version: 2

sources:
- name: linkedin
- name: linkedin_ads # This source will only be used if you are using a single linkedin source connector. If multiple sources are being unioned, their tables will be directly referenced via adapter.get_relation.
schema: "{{ var('linkedin_ads_schema', 'linkedin_ads') }}"
database: "{% if target.type != 'spark'%}{{ var('linkedin_ads_database', target.database) }}{% endif %}"

25 changes: 24 additions & 1 deletion models/stg_linkedin.yml
@@ -7,9 +7,13 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- date_day
- creative_id
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: creative_id
description: The ID of the related creative.
tests:
@@ -34,9 +38,13 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- date_day
- campaign_id
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: campaign_id
description: The ID of the related campaign.
tests:
@@ -59,6 +67,9 @@ models:
- name: stg_linkedin_ads__creative_history
description: Each record represents a 'version' of a creative.
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: creative_id
description: Unique internal ID representing the creative.
tests:
@@ -111,16 +122,20 @@ models:
description: The utm_term parameter of the ad, extracted from the `click_uri`.

- name: is_latest_version
description: Boolean of whether the record is the latest version of the cretive.
description: Boolean of whether the record is the latest version of the creative.

- name: stg_linkedin_ads__campaign_history
description: Each record represents a 'version' of a campaign.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- version_tag
- campaign_id
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: campaign_id
description: Unique internal ID representing the campaign.
tests:
@@ -238,9 +253,13 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- last_modified_at
- campaign_group_id
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: campaign_group_id
description: Unique internal ID representing the campaign group.
tests:
@@ -288,9 +307,13 @@ models:
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- source_relation
- account_id
- version_tag
columns:
- name: source_relation
description: "{{ doc('source_relation') }}"

- name: account_id
description: Unique internal ID representing the account.
tests:
9 changes: 8 additions & 1 deletion models/stg_linkedin_ads__account_history.sql
@@ -14,11 +14,18 @@ with base as (
staging_columns=get_account_history_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='linkedin_ads_union_schemas',
union_database_variable='linkedin_ads_union_databases')
}}

from base

), fields as (

select
source_relation,
id as account_id,
name as account_name,
currency,
@@ -27,7 +34,7 @@ with base as (
type,
cast(last_modified_time as {{ dbt.type_timestamp() }}) as last_modified_at,
cast(created_time as {{ dbt.type_timestamp() }}) as created_at,
row_number() over (partition by id order by last_modified_time desc) = 1 as is_latest_version
row_number() over (partition by source_relation, id order by last_modified_time desc) = 1 as is_latest_version

from macro

7 changes: 7 additions & 0 deletions models/stg_linkedin_ads__ad_analytics_by_campaign.sql
@@ -15,12 +15,19 @@ macro as (
staging_columns=get_ad_analytics_by_campaign_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='linkedin_ads_union_schemas',
union_database_variable='linkedin_ads_union_databases')
}}

from base
),

fields as (

select
source_relation,
{{ dbt.date_trunc('day', 'day') }} as date_day,
campaign_id,
clicks,
7 changes: 7 additions & 0 deletions models/stg_linkedin_ads__ad_analytics_by_creative.sql
@@ -14,11 +14,18 @@ with base as (
staging_columns=get_ad_analytics_by_creative_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='linkedin_ads_union_schemas',
union_database_variable='linkedin_ads_union_databases')
}}

from base

), fields as (

select
source_relation,
{{ dbt.date_trunc('day', 'day') }} as date_day,
creative_id,
clicks,
9 changes: 8 additions & 1 deletion models/stg_linkedin_ads__campaign_group_history.sql
@@ -14,11 +14,18 @@ with base as (
staging_columns=get_campaign_group_history_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='linkedin_ads_union_schemas',
union_database_variable='linkedin_ads_union_databases')
}}

from base

), fields as (

select
source_relation,
id as campaign_group_id,
name as campaign_group_name,
account_id,
@@ -28,7 +35,7 @@ with base as (
cast(run_schedule_end as {{ dbt.type_timestamp() }}) as run_schedule_end_at,
cast(last_modified_time as {{ dbt.type_timestamp() }}) as last_modified_at,
cast(created_time as {{ dbt.type_timestamp() }}) as created_at,
row_number() over (partition by id order by last_modified_time desc) = 1 as is_latest_version
row_number() over (partition by source_relation, id order by last_modified_time desc) = 1 as is_latest_version

from macro

9 changes: 8 additions & 1 deletion models/stg_linkedin_ads__campaign_history.sql
@@ -14,11 +14,18 @@ with base as (
staging_columns=get_campaign_history_columns()
)
}}

{{ fivetran_utils.source_relation(
union_schema_variable='linkedin_ads_union_schemas',
union_database_variable='linkedin_ads_union_databases')
}}

from base

), fields as (

select
source_relation,
id as campaign_id,
name as campaign_name,
cast(version_tag as numeric) as version_tag,
@@ -43,7 +50,7 @@ with base as (
cast(run_schedule_end as {{ dbt.type_timestamp() }}) as run_schedule_end_at,
cast(last_modified_time as {{ dbt.type_timestamp() }}) as last_modified_at,
cast(created_time as {{ dbt.type_timestamp() }}) as created_at,
row_number() over (partition by id order by last_modified_time desc) = 1 as is_latest_version
row_number() over (partition by source_relation, id order by last_modified_time desc) = 1 as is_latest_version

from macro
