-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX (LAMP_ALL_RT_fields): Use Static Schedule to filter Initial Station of Trip #398
base: main
Are you sure you want to change the base?
Conversation
Coverage of commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few small questions.
" ( " | ||
" SELECT " | ||
" DISTINCT ON (coalesce(static_trips.branch_route_id,static_trips.trunk_route_id),static_route_patterns.direction_id,static_route_patterns.static_version_key) " | ||
" static_route_patterns.direction_id as direction_id " | ||
" , static_route_patterns.representative_trip_id as representative_trip_id " | ||
" , static_trips.trunk_route_id as trunk_route_id " | ||
" , coalesce(static_trips.branch_route_id, static_trips.trunk_route_id) as route_id " | ||
" , static_route_patterns.static_version_key as static_version_key " | ||
" FROM " | ||
" static_route_patterns " | ||
" JOIN static_trips on " | ||
" static_route_patterns.representative_trip_id = static_trips.trip_id " | ||
" AND static_route_patterns.static_version_key = static_trips.static_version_key " | ||
" JOIN static_routes on " | ||
" static_routes.route_id = static_trips.route_id " | ||
" AND static_routes.static_version_key = static_trips.static_version_key " | ||
" WHERE " | ||
" static_routes.route_type < 2 " | ||
" AND (static_route_patterns.route_pattern_typicality = 1 " | ||
" or static_route_patterns.route_pattern_typicality = 5) " | ||
" order by " | ||
" coalesce(static_trips.branch_route_id, static_trips.trunk_route_id), " | ||
" static_route_patterns.direction_id, " | ||
" static_route_patterns.static_version_key, " | ||
" static_route_patterns.route_pattern_typicality desc) as canon_trips " | ||
" JOIN static_stop_times on " | ||
" canon_trips.representative_trip_id = static_stop_times.trip_id " | ||
" AND canon_trips.static_version_key = static_stop_times.static_version_key " | ||
" JOIN static_stops on " | ||
" static_stop_times.stop_id = static_stops.stop_id " | ||
" AND static_stop_times.static_version_key = static_stops.static_version_key ) as drop_station " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sanity check for my sake.
this gets every station on every trip, then its filtered out to only be the first station for every trip.
we use that for the filtering later rather than the ve.canonical_stop_sequence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this subquery is the same one used to create canonical_stop_sequence
but we are limiting it to the first parent_station
in each each direction and then doing an anti-join with the drop_flag
field.
" ve.service_date, vt.route_id, vt.direction_id, vt.vehicle_id, vt.start_time" | ||
" ve.service_date, vt.vehicle_id, vt.start_time" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was the order change requested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't requested, but we're only doing this to reduce the parquet file size and dropping the two SORT field fields gave us a speed up in query time with almost no change in file size.
Coverage of commit
|
f" AND vt.service_date >= {max_start_date.strftime('%Y%m%d')} ", | ||
f" AND ve.service_date >= {max_start_date.strftime('%Y%m%d')} ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
faster because the base of the query is on vehicle events?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍰
This change drops using
canonical_stop_sequence
for filtering out initial trip stations and instead uses static schedule data to drop any events relating to the initial canonical station of a trip.Asana Task: https://app.asana.com/0/1205827492903547/1207100677882489