-
Notifications
You must be signed in to change notification settings - Fork 662
fix(postgres): Fix postgres asof joins to consider predicates correctly #11580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f42e29a
to
2d3bde0
Compare
2d3bde0
to
8bbbb38
Compare
9eeb5d0
to
592b423
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that familiar with the details of how asof joins work, but I don't see any issues here.
@@ -84,6 +84,7 @@ def time_keyed_right(time_keyed_df2): | |||
) | |||
@pytest.mark.notyet( | |||
[ | |||
"clickhouse", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this because it also truncates to seconds, as in the test below? Does this PR therefore regress behavior for clickhouse (eg a user could be relying on some behavior, and we break it here)? Did you think if there would be a way to avoid this regression? If it's too hard, I could be fine with the regression, but at least worth talking about trying to avoid a regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this happens because of second truncation, and there is no regression in functionality for clickhouse: we were just not testing this case before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, that reason is not correct. This test fails on clickhouse because it does not support asof joins that do not also have an equality predicate on the join. It seems a bit arbitrary, but it might be a way to improve performance in their internal implementation. The case with a noop equality predicate (where all rows have the same key value) is in test_noop_keyed_asof_join
Description of changes
On the implementation for ASOF JOIN for postgres in #11024 there were a couple of bugs that we have found through real-world usage:
on=None
, which leads to invalid syntax for postgres.This PR moves all the predicates into the subquery and sets the join's
on
clause toTRUE
because the join will only be a one-to-one or a one-to-zero join.The above had not been caught by the existing asof join tests, so I modified the
test_asof_join
test so that there are no predicates in addition to the inequality condition, and added atest_keyed_asof_join
with a key but no tolerance since the tolerance was obfuscating the second bug.