Safety data schema #2735
base: dev
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## dev #2735 +/- ##
==========================================
- Coverage 53.22% 52.75% -0.47%
==========================================
Files 119 122 +3
Lines 9924 10256 +332
==========================================
+ Hits 5282 5411 +129
- Misses 4642 4845 +203
e9cc41d to 005fee4
Hi @atalyaalon, @citizen-dror,
@ziv17 this PR looks like a great start! I suggest the following changes before merging it:

Tables structure: we have 2 options.

Option 1: Three data tables. The safety data will comprise 3 data tables rather than 2 (adding vehicles): safety_data_accident, safety_data_involved, safety_data_vehicles. Each of these tables will contain the following fields for the joins to hold:
We can use foreign key constraints (hence relationships) and index these fields in each table for better performance, like this example
The safety_data_accident, safety_data_involved and safety_data_vehicles tables will be the fact tables, and we can also create relationships with dimension tables (Road Type, Accident Type, etc.), meaning we'll have foreign key constraints for each dimension table (using a star schema).
Note that I considered using materialized views, but they cannot directly maintain relationships.

Option 2: One flat table. In addition, I suggest that this table have relationships with dimension tables (Road Type, Accident Type, etc.), meaning we'll have foreign key constraints for each dimension table (using a star schema).

3.1 Value fields to be joined with dimension tables for Hebrew/English:
3.2 Fields without Hebrew:
3.3 ID fields:
3.4 Fields to be calculated (the calculation will be sent in a different issue)

Note - the fields needed to create relationships with dimension tables, in order to query the Hebrew fields, are:
indexed in all tables:
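To make the three-table option concrete, here is a minimal sketch of fact tables linked by foreign keys, with one dimension table and indexes on the join columns. It is an illustration only, not the schema proposed in this PR: every column name (accident_id, vehicle_id, road_type_id, name_he, name_en) and the road_type table layout are assumptions, and it uses Python's built-in sqlite3 module in place of the project's real database so that it runs on its own.

```python
import sqlite3

# Hypothetical star-schema sketch. Only the three safety_data_* table names
# come from the discussion above; all columns and the road_type dimension
# layout are placeholders chosen for illustration.
SCHEMA = """
CREATE TABLE road_type (
    id INTEGER PRIMARY KEY,
    name_he TEXT NOT NULL,
    name_en TEXT NOT NULL
);

CREATE TABLE safety_data_accident (
    accident_id INTEGER PRIMARY KEY,
    accident_year INTEGER NOT NULL,
    road_type_id INTEGER REFERENCES road_type (id)
);

CREATE TABLE safety_data_vehicles (
    vehicle_id INTEGER PRIMARY KEY,
    accident_id INTEGER NOT NULL REFERENCES safety_data_accident (accident_id)
);

CREATE TABLE safety_data_involved (
    involved_id INTEGER PRIMARY KEY,
    accident_id INTEGER NOT NULL REFERENCES safety_data_accident (accident_id),
    vehicle_id INTEGER REFERENCES safety_data_vehicles (vehicle_id)
);

-- Index every column that takes part in a join.
CREATE INDEX ix_accident_road_type ON safety_data_accident (road_type_id);
CREATE INDEX ix_vehicles_accident ON safety_data_vehicles (accident_id);
CREATE INDEX ix_involved_accident ON safety_data_involved (accident_id);
CREATE INDEX ix_involved_vehicle ON safety_data_involved (vehicle_id);
"""


def build_demo_db():
    """Create an in-memory database with the sketch schema and one sample row per table."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO road_type VALUES (1, 'עירונית', 'Urban')")
    conn.execute("INSERT INTO safety_data_accident VALUES (10, 2023, 1)")
    conn.execute("INSERT INTO safety_data_vehicles VALUES (50, 10)")
    conn.execute("INSERT INTO safety_data_involved VALUES (100, 10, 50)")
    return conn


def involved_with_road_type(conn):
    """Join involved -> accident -> road_type to fetch the English label."""
    query = """
        SELECT i.involved_id, a.accident_year, r.name_en
        FROM safety_data_involved AS i
        JOIN safety_data_accident AS a ON a.accident_id = i.accident_id
        JOIN road_type AS r ON r.id = a.road_type_id
        ORDER BY i.involved_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(involved_with_road_type(build_demo_db()))
```

With the constraints in place, an involved row that points at a non-existent accident is rejected by the database itself, and the Hebrew or English label is fetched through the dimension join rather than stored in the fact table.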
Hi @atalyaalon,
The are suffice. We don't need this field in safety data. (Perhaps in the future we'll remove it from the "hebrew" tables, however right now it's used in various queries so we need to make sure we keep functionality beforehand, and that it's not needed by the data team)
Please review the suggested schema. If the direction is OK, I will prepare POC tables and a POC of the /involved (replacing the /accidents) query.