Add config option skip_physical_aggregate_schema_check
#13176
I plan to backport this option to the 42 branch as well defaulted to
lgtm thanks @alamb
See additional context here: #13065 (comment)
/// When set to true, skips verifying that the schema produced by
/// planning the input of `LogicalPlan::Aggregate` exactly matches the
/// schema of the input plan.
///
Wondering whether the blank lines can break the formatting, and whether that is why the tests eventually failed.
I think the reason the tests failed before is that I forgot to update the information_schema.slt test
Thanks again @comphead and @jayzhan211
* Add option to skip physical aggregate check
* tweak wording
* update test
PR to backport #13189
Looking at #13190: if we relax the check around this, how many problems would remain to be addressed?
It seems to me the check is finding real things (where the schema of the …
However, it also seems to me like maybe the discrepancy isn't a huge deal (so maybe it shouldn't be an error 🤔 )
Real things like type mismatch, or real things like stricter non-null guarantees (which may be considered an optimization, not an error)?
What I know of was related to non-null guarantees as well as user defined metadata (on the Schema and the Field)
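To make the distinction in this thread concrete, here is a minimal sketch that separates a type mismatch from a nullability or metadata mismatch. The `Field` struct, the `Mismatch` enum, and the `classify` and `field` helpers are hypothetical stand-ins invented for illustration; they are not DataFusion or Arrow types, and this is not how the planner reports the error.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified field description (not the Arrow `Field` type).
#[derive(Debug, Clone)]
struct Field {
    data_type: String,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

/// The kinds of discrepancy mentioned in the discussion.
#[derive(Debug, PartialEq)]
enum Mismatch {
    DataType,
    Nullability,
    Metadata,
}

/// Lists every way in which `physical` differs from `logical`.
fn classify(logical: &Field, physical: &Field) -> Vec<Mismatch> {
    let mut found = Vec::new();
    if logical.data_type != physical.data_type {
        found.push(Mismatch::DataType);
    }
    if logical.nullable != physical.nullable {
        found.push(Mismatch::Nullability);
    }
    if logical.metadata != physical.metadata {
        found.push(Mismatch::Metadata);
    }
    found
}

/// Builds a field with no metadata (helper for the example only).
fn field(data_type: &str, nullable: bool) -> Field {
    Field {
        data_type: data_type.to_string(),
        nullable,
        metadata: BTreeMap::new(),
    }
}

fn main() {
    // Same type, but the physical side makes a stricter non-null guarantee.
    let logical = field("Int64", true);
    let physical = field("Int64", false);
    println!("{:?}", classify(&logical, &physical)); // prints [Nullability]
}
```

A check built this way could, in principle, treat `DataType` differences as errors and the other two as warnings, which is the trade-off being asked about above.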
Which issue does this PR close?
Closes #13065
Rationale for this change
Some plans that used to run in DataFusion 41.0.0 started erroring in DataFusion 42.0.0 due to a new check that was added.
The delta-rs 42.0.0 upgrade has hit this (delta-io/delta-rs#2886 (comment)), and we hit the same thing in InfluxDB IOx as well.
To help unblock upgrades, I would like to permit users to optionally disable this check.
This config flag is meant to be temporary while we fix all the underlying bugs.
Background: a new check for exact schema equality was added in #11989 (released in DataFusion 42.0.0). This has found quite a few bugs where the schema doesn't quite match due to nullability or metadata mismatch -- see the list on #12733. There is at least one more bug @wiedld and I are tracking down.
After upgrading to DataFusion 42 some of these plans now error (due to bugs in DataFusion, see list on #12733).
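As a rough illustration of the behaviour described above, the sketch below models an exact-equality schema check with an opt-out flag. Everything here is an assumption made for the example: `Field`, `Schema`, `verify_aggregate_input`, and `int_field` are hypothetical names, and the real planner code in DataFusion is structured differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified field: name, type name, nullability, user metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

/// Hypothetical stand-in for a schema: an ordered list of fields.
type Schema = Vec<Field>;

/// Errors when the two schemas differ in any respect (including nullability
/// and metadata), unless `skip_check` is true.
fn verify_aggregate_input(
    logical: &Schema,
    physical: &Schema,
    skip_check: bool,
) -> Result<(), String> {
    if skip_check || logical == physical {
        return Ok(());
    }
    Err(format!(
        "schema mismatch: logical {:?} vs physical {:?}",
        logical, physical
    ))
}

/// Builds an Int64 field with no metadata (helper for the example only).
fn int_field(name: &str, nullable: bool) -> Field {
    Field {
        name: name.to_string(),
        data_type: "Int64".to_string(),
        nullable,
        metadata: BTreeMap::new(),
    }
}

fn main() {
    // The two schemas differ only in nullability.
    let logical = vec![int_field("a", true)];
    let physical = vec![int_field("a", false)];

    // Strict check: the nullability difference alone is an error.
    assert!(verify_aggregate_input(&logical, &physical, false).is_err());
    // With the skip flag set, the same plan is accepted.
    assert!(verify_aggregate_input(&logical, &physical, true).is_ok());
}
```

The flag bypasses the comparison entirely rather than relaxing it, which matches the intent of a temporary workaround: the underlying mismatches still exist and still need fixing.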
What changes are included in this PR?
Add a skip_physical_aggregate_schema_check option to disable this check so downstream users can work around any issues they hit
Are these changes tested?
The doc is tested by CI. Since this a workaround for bugs
Are there any user-facing changes?
A new config option