
Refactor Sessionization Logic

@fivetran-jamie fivetran-jamie released this 05 Oct 22:54
7b05388

Howdy!

This release addresses two kinds of issues:

  1. Some BigQuery users were seeing "resources exceeded" errors in the mixpanel__sessions model, caused by some hefty window functions.
    Fix: date_day is now also included in the partition clause of this model's window functions (example), similar to the de-duping logic. As a result, sessions no longer span midnight: an event that occurs at 11:59pm will NEVER be sessionized with an event that occurs two minutes later at 12:01am.
  2. Some folks' event data comes with null device_ids. device_id was previously used in the partition clause of the window functions in mixpanel__sessions, so events with a null device_id were never batched into sessions.
    Fix: the sessions model now coalesces device_id and people_id into a user_id column, which is used in the partition clause of the model's window functions.
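
To make the effect of both fixes concrete, here is a rough Python sketch of the partitioning behavior described above. It is not the package's SQL. The `sessionize` function, the `occurred_at` field name, the 30-minute `SESSION_TIMEOUT`, and the device_id-first coalesce order are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical inactivity threshold, chosen only for this illustration.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(events):
    """Group events into sessions within each (user_id, date_day) partition."""
    partitions = defaultdict(list)
    for event in events:
        # Mimics coalesce(device_id, people_id); the order is an assumption.
        if event["device_id"] is not None:
            user_id = event["device_id"]
        else:
            user_id = event["people_id"]
        # Partitioning on the calendar day means sessions cannot cross midnight.
        date_day = event["occurred_at"].date()
        partitions[(user_id, date_day)].append(event["occurred_at"])

    sessions = []
    for (user_id, date_day), timestamps in partitions.items():
        timestamps.sort()
        current = [timestamps[0]]
        for ts in timestamps[1:]:
            if ts - current[-1] > SESSION_TIMEOUT:
                # Gap too long: close the current session and start a new one.
                sessions.append((user_id, date_day, current))
                current = [ts]
            else:
                current.append(ts)
        sessions.append((user_id, date_day, current))
    return sessions


# Three events from one user with a null device_id, straddling midnight.
events = [
    {"device_id": None, "people_id": "p1", "occurred_at": datetime(2021, 10, 5, 23, 50)},
    {"device_id": None, "people_id": "p1", "occurred_at": datetime(2021, 10, 5, 23, 59)},
    {"device_id": None, "people_id": "p1", "occurred_at": datetime(2021, 10, 6, 0, 1)},
]
sessions = sessionize(events)
for user_id, date_day, timestamps in sessions:
    print(user_id, date_day, len(timestamps))
```

In this sketch the events still get grouped even though device_id is null, because people_id stands in as the user_id. The 11:59pm and 12:01am events land in separate sessions because they fall in different date_day partitions, even though they are only two minutes apart.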