Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix recursive-protection feature flag #13887

Merged
merged 4 commits into from
Dec 24, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Dec 23, 2024

Which issue does this PR close?

Rationale for this change

The recursive-protect flag was added in #13778, but when @andygrove tried to disable it in comet, recursive was still enabled.

I am not an expert in the crate feature mechanism, but the feature appears to get activated if a downstream crate like comet uses a crate like datafusion (the core crate) that passes the feature through

What changes are included in this PR?

  1. Change feature flag to enable recursive protection at the main crate level
  2. rename recursive-protection to recursive_protection to be consistent with other flags

Are these changes tested?

  1. Make feature not enabled by default in sub crates
  2. Add feature to the datafusion core crate
  3. move another stack protection mechanism into a guarded section

Here is an example showing how I tested to verify that the feature flagging still works

$ cargo test --features=recursive-protection -p datafusion-common --lib | grep test_large_tree
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.08s
     Running unittests src/lib.rs (target/debug/deps/datafusion_common-3dd7d455d375513b)
$ cargo test -p datafusion-common --lib | grep test_large_tree
   Compiling datafusion-common v43.0.0 (/Users/andrewlamb/Software/datafusion/datafusion/common)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.32s
     Running unittests src/lib.rs (target/debug/deps/datafusion_common-a4d89ac32a58d472)
$

I also verified that the MIRI tests pass on apache/datafusion-comet#1154 with this fix

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate common Related to common crate labels Dec 23, 2024
@alamb
Copy link
Contributor Author

alamb commented Dec 23, 2024

Given this experience, we really need a test for the recursive-protection feature flag / compiling datafusion with various feature flags. Maybe someone can figure out how to do so, following the model of arrow-rs:

https://github.com/apache/arrow-rs/blob/2c84f243b882eff69806cd7294d38bf422fdb24a/.github/workflows/arrow.yml#L63-L97

Maybe after @Omega359 's PR here we can add these tests

@@ -36,7 +36,6 @@ name = "datafusion_common"
path = "src/lib.rs"

[features]
default = ["recursive-protection"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having the feature enabled by default in subcrates meant I think it basically was not possible to disable when using datafusion core crate

@@ -69,6 +69,13 @@ pyarrow = ["datafusion-common/pyarrow", "parquet"]
regex_expressions = [
"datafusion-functions/regex_expressions",
]
recusive-protection = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the recursive-protection feature to the datafusion (core) crate which then activates the feature in the subcrates

@@ -62,10 +63,11 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
// The functions called from `set_expr_to_plan()` need more than 128KB
// stack in debug builds as investigated in:
// https://github.com/apache/datafusion/pull/13310#discussion_r1836813902
let min_stack_size = recursive::get_minimum_stack_size();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is pretty good evidence that the feature flag is not working as expected, as this code requires recursive to compile but it wasn't failing the build -- I pulled the logic its own structure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late LGTM. StackGuard is a very clever fix.

@alamb alamb mentioned this pull request Dec 23, 2024
10 tasks
@alamb
Copy link
Contributor Author

alamb commented Dec 23, 2024

Also before I merge this I want to rename the flag to recursive_protection (underscore) to be consistent with the rest

avro = ["apache-avro"]
backtrace = []
pyarrow = ["pyo3", "arrow/pyarrow", "parquet"]
force_hash_collisions = []
recursive-protection = ["dep:recursive"]
recursive_protection = ["dep:recursive"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also renamed the feature to use an underscore (_ to be consistent with the other feaures)

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 23, 2024
@alamb alamb requested a review from andygrove December 24, 2024 13:01
@alamb
Copy link
Contributor Author

alamb commented Dec 24, 2024

FYI @buraksenn and @peter-toth

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

One concern is that some users may depend on the feature, so maybe we should highlight the change at the next release log.

@alamb
Copy link
Contributor Author

alamb commented Dec 24, 2024

LGTM.

One concern is that some users may depend on the feature, so maybe we should highlight the change at the next release log.

THis is a good point -- I believe 44.0.0 will be the first release that will have this feature -- 43 did not have it: https://github.com/apache/datafusion/blob/main/dev/changelog/43.0.0.md

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I have not tested with Comet. Thanks @alamb

@alamb
Copy link
Contributor Author

alamb commented Dec 24, 2024

LGTM but I have not tested with Comet. Thanks @alamb

I have tested so locally -- specifically I checked out the code in apache/datafusion-comet#1154

Changed it to use a local checkout with this PR:

--- a/native/Cargo.toml
+++ b/native/Cargo.toml
@@ -39,15 +39,15 @@ arrow-buffer = { version = "53.3.0" }
 arrow-data = { version = "53.3.0" }
 arrow-schema = { version = "53.3.0" }
 parquet = { version = "53.3.0", default-features = false, features = ["experimental"] }
-datafusion = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false, features = ["unicode_expressions", "crypto_expressions"] }
-datafusion-common = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-functions = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false, features = ["crypto_expressions"] }
-datafusion-functions-nested = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-expr = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-expr-common = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-execution = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-physical-plan = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
-datafusion-physical-expr = { git = "https://github.com/apache/datafusion.git", rev = "242f45f", default-features = false }
+datafusion = { path = "/Users/andrewlamb/Software/datafusion/datafusion/core", default-features = false, features = ["unicode_expressions", "crypto_expressions"] }
+datafusion-common = { path = "/Users/andrewlamb/Software/datafusion/datafusion/common", default-features = false }
+datafusion-functions = { path = "/Users/andrewlamb/Software/datafusion/datafusion/functions", default-features = false, features = ["crypto_expressions"] }
+datafusion-functions-nested = { path = "/Users/andrewlamb/Software/datafusion/datafusion/functions-nested", default-features = false }
+datafusion-expr = { path = "/Users/andrewlamb/Software/datafusion/datafusion/expr", default-features = false }
+datafusion-expr-common = { path = "/Users/andrewlamb/Software/datafusion/datafusion/expr-common", default-features = false }
+datafusion-execution = { path = "/Users/andrewlamb/Software/datafusion/datafusion/execution", default-features = false }
+datafusion-physical-plan = { path = "/Users/andrewlamb/Software/datafusion/datafusion/core", default-features = false }
+datafusion-physical-expr = { path = "/Users/andrewlamb/Software/datafusion/datafusion/physical-expr", default-features = false }
 datafusion-comet-spark-expr = { path = "spark-expr", version = "0.5.0" }
 datafusion-comet-proto = { path = "proto", version = "0.5.0" }
 chrono = { version = "0.4", default-features = false, features = ["clock"] }

and verified that this command passes:

MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test -p datafusion-comet -- test_unpack_dictionary_primitive

@alamb
Copy link
Contributor Author

alamb commented Dec 24, 2024

Thank you for the reviews @andygrove and @xudong963

@alamb alamb merged commit e99e02b into apache:main Dec 24, 2024
26 checks passed
@alamb alamb deleted the alamb/fix_recursive_protect branch December 24, 2024 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Making the recursive dependency an optional feature
4 participants