add data type and time zone considerations #6335

Open
wants to merge 6 commits into
base: current
from

Conversation

mirnawong1
Contributor

@mirnawong1 mirnawong1 commented Oct 22, 2024

Adding clarifying information about time zones and data types, per an internal Slack thread.


🚀 Deployment available! Here are the direct links to the updated files:

mirnawong1 requested a review from a team as a code owner October 22, 2024 09:11
github-actions bot added the content, Docs team, and size: small labels Oct 22, 2024

vercel bot commented Oct 22, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
docs-getdbt-com ✅ Ready (Inspect) Visit Preview Oct 22, 2024 7:04pm


- Consistent data types — Both your dimension column and the time spine column should use the same data type to allow accurate comparisons. Functions like `DATE_TRUNC` don't change the data type of the input in some databases (like Snowflake). Using different data types can lead to mismatches and inaccurate results.

We recommend using `DATETIME` or `TIMESTAMP` data types for your time dimensions and time spine, as they support all granularities. The `DATE` data type may not support higher granularities like hours or minutes.
Contributor

nit: higher -> smaller


We recommend using `DATETIME` or `TIMESTAMP` data types for your time dimensions and time spine, as they support all granularities. The `DATE` data type may not support higher granularities like hours or minutes.
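
To illustrate the data type point outside of any warehouse, here is a minimal Python sketch. It is an analogy only, not how MetricFlow or Snowflake works internally: Python's `date` and `datetime` stand in for `DATE` and `TIMESTAMP`, the helper `truncate_to_day` is a hypothetical stand-in for a type-preserving `DATE_TRUNC`, and the two dictionaries are hypothetical time spines.

```python
from datetime import date, datetime


def truncate_to_day(value: datetime) -> datetime:
    """Zero out the time of day while keeping the datetime type.

    Hypothetical analogue of a DATE_TRUNC that preserves its input type.
    """
    return value.replace(hour=0, minute=0, second=0, microsecond=0)


# Hypothetical time spines: one keyed by DATE-like values,
# one keyed by TIMESTAMP-like values.
date_spine = {date(2024, 10, 22): "spine row"}
timestamp_spine = {datetime(2024, 10, 22): "spine row"}

event_time = datetime(2024, 10, 22, 9, 11)
day_bucket = truncate_to_day(event_time)

# Truncation changed the value, not the type.
type_preserved = type(day_bucket) is datetime

# A datetime key never equals a plain date key, so this lookup misses.
mismatched_lookup = date_spine.get(day_bucket)

# With matching types on both sides, the lookup succeeds.
matched_lookup = timestamp_spine.get(day_bucket)

# A plain date carries no hour component at all.
date_has_hours = hasattr(date(2024, 10, 22), "hour")

print(type_preserved, mismatched_lookup, matched_lookup, date_has_hours)
```

In this sketch the mismatched lookup returns `None` even though both keys describe the same calendar day, which is the kind of silent mismatch the paragraph above warns about.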

- Consistent time zones — Ensure that all your time-related data uses the same time zone. MetricFlow supports UTC and currently doesn't perform any timezone manipulation. This means inconsistent time zones can cause unexpected results during aggregations and comparisons.
Contributor

I would change this section slightly:
Time zones — MetricFlow currently doesn't perform any timezone manipulation. When working with timezone-aware data, inconsistent time zones may lead to unexpected results during aggregations and comparisons.
Reasons for the change:

  • MF doesn't support UTC per se, it just ignores timezones entirely. UTC is just the most common way that people are likely to store their data for timezone consistency.
  • I don't think we should ask them to ensure all their data uses the same timezone since they might have a valid reason for not doing that. We just want to warn them that there might be weird interactions in that case.
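
As a rough illustration of the "weird interactions" mentioned above, the following Python sketch shows one event landing in two different daily buckets depending on which time zone its stored value reflects. The UTC-7 offset and the event time are made-up assumptions, and dropping `tzinfo` is only a stand-in for storing wall-clock values in a timezone-naive column; nothing here is MetricFlow code.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone at a fixed UTC-7 offset (an assumption for the demo).
LOCAL = timezone(timedelta(hours=-7))

# One real-world instant, expressed in local time and in UTC.
event_local = datetime(2024, 10, 22, 23, 30, tzinfo=LOCAL)
event_utc = event_local.astimezone(timezone.utc)

# Dropping tzinfo mimics writing each wall-clock value to a naive column.
naive_local = event_local.replace(tzinfo=None)
naive_utc = event_utc.replace(tzinfo=None)

# Daily buckets computed without any time zone handling.
local_day = naive_local.date()
utc_day = naive_utc.date()

print(event_local == event_utc)  # True: same instant
print(local_day, utc_day)        # 2024-10-22 2024-10-23
```

If some sources store the local value and others store the UTC value, a daily aggregation that ignores time zones would count this single event under different days.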

2 participants