bugfix/klaviyo-too-many-partitions #26

Merged: 10 commits merged into main from bugfix/klaviyo-too-many-partitions on Sep 16, 2024

@fivetran-joemarkiewicz (Collaborator) commented Sep 5, 2024

PR Overview

This PR will address the following Issue/Feature: No linked issue as this is a result of an upstream change in the Klaviyo package. See dbt_klaviyo PR 41 for more details.

This PR will result in the following new package version: v0.7.0

This is a breaking change because the upstream changes require a full refresh for BigQuery users.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

Upstream Klaviyo Breaking Changes (Full refresh required after upgrading)

  • Incremental models in the upstream dbt_klaviyo package have had their partition_by logic removed when running on BigQuery. This change affects only BigQuery warehouses and resolves the too many partitions error that some users encountered. The partitioning was also deemed unnecessary for these models and their downstream references, as it offered no performance benefit. Removing it eliminates both the error risk and an unneeded configuration. Refer to the v0.8.0 dbt_klaviyo release notes for more details. This change applies to the following models:
    • int_klaviyo__event_attribution
    • klaviyo__events
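
For readers unfamiliar with the config involved, here is a minimal sketch of what removing partition_by from a BigQuery incremental model can look like. It is illustrative only and is not the package's actual code: the model, the column names (event_id, occurred_on), and the upstream ref (hypothetical_upstream_events) are all assumptions made for the example.

```sql
{# Before (illustrative only, kept inside a Jinja comment so it is never rendered):
   the model asked BigQuery to partition the table on a date column.

{{ config(
    materialized='incremental',
    unique_key='event_id',
    partition_by={'field': 'occurred_on', 'data_type': 'date'} if target.type == 'bigquery' else none
) }}
#}

{# After: partition_by is omitted, so no partitioning is requested on BigQuery. #}
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

select *
from {{ ref('hypothetical_upstream_events') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already loaded.
where occurred_on >= (select max(occurred_on) from {{ this }})
{% endif %}
```

As the changelog entry above states, BigQuery users need to run a full refresh after upgrading so the affected tables are rebuilt with the new configuration.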

Under the Hood

  • Added consistency and integrity validation tests for the following models:
    • shopify_holistic_reporting__customer_enhanced
    • shopify_holistic_reporting__daily_customer_metrics
    • shopify_holistic_reporting__orders_attribution
  • Cleaned up unnecessary variable configuration within the integration_tests/dbt_project.yml file.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

See below for the results of the validation tests (full refresh and incremental runs)

image

image

If you had to summarize this PR in an emoji, which would it be?

🐟

@fivetran-joemarkiewicz self-assigned this Sep 5, 2024
@fivetran-joemarkiewicz marked this pull request as ready for review September 5, 2024 21:40

@fivetran-catfritz (Contributor) left a comment

One small change but the rest lgtm!

Co-authored-by: fivetran-catfritz <[email protected]>

@fivetran-avinash (Contributor) left a comment

@fivetran-joemarkiewicz A few small CHANGELOG revisions.

Before approving, I also noticed that shopify_holistic_reporting__orders_attribution (link) has partition_by logic similar to what was removed in dbt_klaviyo. It seems created_timestamp is involved in a join in that model, so it might hold more value than the partitions we've removed in dbt_klaviyo, but I'm wondering if the config needs to be modified here to improve performance. Curious as to your thoughts here!

@fivetran-joemarkiewicz (Collaborator, Author) commented

@fivetran-avinash good callout on the partition present in the orders attribution model. After looking into it, I see we are aggressively partitioning on the timestamp grain for a field we aren't including in our incremental logic. Therefore, removing the partition would be more appropriate in this case. See the updated validations on the latest commit.

image

image

@fivetran-avinash (Contributor) left a comment

@fivetran-joemarkiewicz Thanks for the quick turnaround! One small change, but otherwise good to go. Make sure to update the packages!

@fivetran-joemarkiewicz merged commit 33ab35a into main Sep 16, 2024
7 checks passed
@fivetran-joemarkiewicz deleted the bugfix/klaviyo-too-many-partitions branch September 16, 2024 21:29