Fix recursive-protection feature flag #13887
Given this experience, we really need a test for the Maybe after @Omega359's PR here we can add these tests
@@ -36,7 +36,6 @@ name = "datafusion_common"
path = "src/lib.rs"

[features]
default = ["recursive-protection"]
Having the feature enabled by default in the subcrates meant, I think, that it was basically not possible to disable it when using the datafusion core crate.
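To make the concern above concrete, here is a toy model of how feature requests for one crate are combined across a dependency graph. It is a sketch under stated assumptions, not Cargo's actual resolver: the crate names `app`, `core_crate`, and `sub_crate`, the `Edge` type, and both functions are hypothetical and exist only for illustration.

```rust
use std::collections::BTreeSet;

/// One dependency edge pointing at the crate named `to`.
/// (Hypothetical type for this sketch; it is not a Cargo data structure.)
struct Edge {
    to: &'static str,
    /// Mirrors `default-features = true/false` on the dependency line.
    default_features: bool,
    /// Mirrors `features = [...]` on the dependency line.
    features: &'static [&'static str],
}

/// Union of the features requested for `krate` by every edge pointing at it.
/// `defaults` stands for the crate's own `default = [...]` list.
fn unified_features(
    krate: &str,
    defaults: &[&'static str],
    edges: &[Edge],
) -> BTreeSet<&'static str> {
    let mut enabled = BTreeSet::new();
    for edge in edges.iter().filter(|e| e.to == krate) {
        enabled.extend(edge.features.iter().copied());
        // A single dependent that keeps default features on is enough
        // to pull the defaults in for everyone.
        if edge.default_features {
            enabled.extend(defaults.iter().copied());
        }
    }
    enabled
}

/// Hypothetical graph: `app` and `core_crate` both depend on `sub_crate`.
/// `app` always opts out of defaults; `core_crate` may or may not.
fn example_graph(core_uses_defaults: bool) -> Vec<Edge> {
    vec![
        // app -> sub_crate
        Edge { to: "sub_crate", default_features: false, features: &[] },
        // core_crate -> sub_crate
        Edge { to: "sub_crate", default_features: core_uses_defaults, features: &[] },
    ]
}
```

In this model, `app` opting out has no effect as long as `core_crate` still requests the defaults of `sub_crate`. Dropping the feature from the subcrate's `default` list removes that path, which is the direction the diff above takes.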
datafusion/core/Cargo.toml
@@ -69,6 +69,13 @@ pyarrow = ["datafusion-common/pyarrow", "parquet"]
regex_expressions = [
    "datafusion-functions/regex_expressions",
]
recusive-protection = [ |
This adds the recursive-protection feature to the datafusion (core) crate, which then activates the feature in the subcrates.
@@ -62,10 +63,11 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
// The functions called from `set_expr_to_plan()` need more than 128KB
// stack in debug builds as investigated in:
// https://github.com/apache/datafusion/pull/13310#discussion_r1836813902
let min_stack_size = recursive::get_minimum_stack_size(); |
Here is pretty good evidence that the feature flag was not working as expected: this code requires recursive to compile, but it wasn't failing the build. I pulled the logic into its own structure.
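The idea of pulling the stack-size logic into its own structure can be pictured as an RAII guard: raise the minimum on creation, restore it on drop. The following is a minimal sketch, not the code from this PR. To stay dependency-free it replaces the `recursive` crate with a local atomic; `set_minimum_stack_size`, the 128 KiB starting value, and `with_raised_minimum` are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for the global minimum stack size managed by the `recursive`
// crate. The starting value is an arbitrary choice for this sketch.
static MIN_STACK_SIZE: AtomicUsize = AtomicUsize::new(128 * 1024);

// Local stand-in named after `recursive::get_minimum_stack_size` in the diff.
fn get_minimum_stack_size() -> usize {
    MIN_STACK_SIZE.load(Ordering::SeqCst)
}

// Assumed counterpart of the getter; local to this sketch.
fn set_minimum_stack_size(bytes: usize) {
    MIN_STACK_SIZE.store(bytes, Ordering::SeqCst);
}

/// Raises the minimum stack size while alive and restores the previous
/// value when dropped. In feature-gated code, the field and both bodies
/// could sit behind `#[cfg(feature = "...")]` so that the type compiles
/// to a no-op when the feature is off.
struct StackGuard {
    previous: usize,
}

impl StackGuard {
    fn new(minimum: usize) -> Self {
        let previous = get_minimum_stack_size();
        set_minimum_stack_size(minimum);
        StackGuard { previous }
    }
}

impl Drop for StackGuard {
    fn drop(&mut self) {
        // Runs on every exit path from the scope, including early returns.
        set_minimum_stack_size(self.previous);
    }
}

/// Hypothetical helper: reports the minimum seen while the guard is alive.
fn with_raised_minimum(minimum: usize) -> usize {
    let _guard = StackGuard::new(minimum);
    get_minimum_stack_size()
}
```

Keeping all uses of the optional dependency inside one small type means only that type needs conditional compilation, and callers stay identical whether or not the feature is enabled.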
Late LGTM. StackGuard is a very clever fix.
Also before I merge this I want to rename the flag to
avro = ["apache-avro"]
backtrace = []
pyarrow = ["pyo3", "arrow/pyarrow", "parquet"]
force_hash_collisions = []
recursive-protection = ["dep:recursive"]
recursive_protection = ["dep:recursive"] |
I also renamed the feature to use an underscore (_) to be consistent with the other features.
FYI @buraksenn and @peter-toth
LGTM.
One concern is that some users may depend on the feature, so maybe we should highlight the change in the next release log.
This is a good point -- I believe 44.0.0 will be the first release that will have this feature -- 43 did not have it: https://github.com/apache/datafusion/blob/main/dev/changelog/43.0.0.md
LGTM but I have not tested with Comet. Thanks @alamb
I have tested this locally. Specifically, I checked out the code in apache/datafusion-comet#1154 and changed it to use a local checkout with this PR:
--- a/native/Cargo.toml
+++ b/native/Cargo.toml
@@ -39,15 +39,15 @@ arrow-buffer = { version = "53.3.0" }
arrow-data = { version = "53.3.0" }
arrow-schema = { version = "53.3.0" }
parquet = { version = "53.3.0", default-features = false, features = ["experimental"] }
-datafusion = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false, features = ["unicode_expressions", "crypto_expressions"] }
-datafusion-common = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-functions = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false, features = ["crypto_expressions"] }
-datafusion-functions-nested = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-expr = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-expr-common = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-execution = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-physical-plan = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-physical-expr = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
+datafusion = { path = "/Users/andrewlamb/Software/datafusion/datafusion/core", default-features = false, features = ["unicode_expressions", "crypto_expressions"] }
+datafusion-common = { path = "/Users/andrewlamb/Software/datafusion/datafusion/common", default-features = false }
+datafusion-functions = { path = "/Users/andrewlamb/Software/datafusion/datafusion/functions", default-features = false, features = ["crypto_expressions"] }
+datafusion-functions-nested = { path = "/Users/andrewlamb/Software/datafusion/datafusion/functions-nested", default-features = false }
+datafusion-expr = { path = "/Users/andrewlamb/Software/datafusion/datafusion/expr", default-features = false }
+datafusion-expr-common = { path = "/Users/andrewlamb/Software/datafusion/datafusion/expr-common", default-features = false }
+datafusion-execution = { path = "/Users/andrewlamb/Software/datafusion/datafusion/execution", default-features = false }
+datafusion-physical-plan = { path = "/Users/andrewlamb/Software/datafusion/datafusion/core", default-features = false }
+datafusion-physical-expr = { path = "/Users/andrewlamb/Software/datafusion/datafusion/physical-expr", default-features = false }
datafusion-comet-spark-expr = { path = "spark-expr", version = "0.5.0" }
datafusion-comet-proto = { path = "proto", version = "0.5.0" }
chrono = { version = "0.4", default-features = false, features = ["clock"] }
and verified that this command passes:
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test -p datafusion-comet -- test_unpack_dictionary_primitive
Thank you for the reviews @andygrove and @xudong963
Which issue does this PR close?
recursive dependency an optional feature #13766
Rationale for this change
The recursive-protection flag was added in #13778, but when @andygrove tried to disable it in comet, recursive was still enabled.
I am not an expert in the crate feature mechanism, but the feature appears to get activated if a downstream crate like comet uses a crate like datafusion (the core crate) that passes the feature through.
What changes are included in this PR?
Renamed recursive-protection to recursive_protection to be consistent with other flags
Are these changes tested?
Here is an example showing how I tested to verify that the feature flagging still works
I also verified that the MIRI tests pass on apache/datafusion-comet#1154 with this fix
Are there any user-facing changes?