Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move EnforceSorting into datafusion-physical-optimizer crate #14219

Merged

Conversation

buraksenn
Copy link
Contributor

Which issue does this PR close?

Closes #14185

Rationale for this change

Extract remaining physical optimizer out of core

What changes are included in this PR?

Move EnforceSorting out of core package

Are these changes tested?

Existing tests that were moved to integration tests

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate labels Jan 20, 2025
@buraksenn buraksenn force-pushed the move-enforce-sorting-to-new-crate branch 2 times, most recently from 7732010 to 61fbe14 Compare January 21, 2025 08:54
@buraksenn
Copy link
Contributor Author

I've these errors in replace in replace_with_order_preserving_variants tests:

expected:

[
    "SortExec: expr=[a@0 ASC NULLS LAST], preserve_partitioning=[false]",
    "  CoalescePartitionsExec",
    "    RepartitionExec: partitioning=Hash([c@1], 8), input_partitions=8",
    "      RepartitionExec: partitioning=RoundRobinBatch(8), input_partitions=1",
    "        StreamingTableExec: partition_sizes=1, projection=[a, c, d], infinite_source=true, output_ordering=[a@0 ASC NULLS LAST]",
]
actual:

[
    "SortExec: expr=[a@0 ASC NULLS LAST], preserve_partitioning=[false]",
    "  CoalescePartitionsExec",
    "    RepartitionExec: partitioning=Hash([c@2], 8), input_partitions=8",
    "      RepartitionExec: partitioning=RoundRobinBatch(8), input_partitions=1",
    "        StreamingTableExec: partition_sizes=1, projection=[a, b, c, d], infinite_source=true, output_ordering=[a@0 ASC NULLS LAST]",
]

Not sure why but projection changes so I'm working on fixing it

@buraksenn buraksenn force-pushed the move-enforce-sorting-to-new-crate branch from 61fbe14 to fef3c6e Compare January 21, 2025 08:59
@buraksenn
Copy link
Contributor Author

cc @berkaysynnada

@berkaysynnada
Copy link
Contributor

cc @berkaysynnada

There are two function having the same name (stream_exec_ordered), one in replace_with_order_preserving_variants.rs with a built-in projection, and one in core/src/test/mod.rs, w/o a projection, which differs the test results. You've mixed up them. I guess you've deleted one of them actually.

@buraksenn buraksenn force-pushed the move-enforce-sorting-to-new-crate branch from a1b2197 to 9c40e49 Compare January 21, 2025 12:50
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @buraksenn and @berkaysynnada

I did some spot checks on this PR and I think it looks EPIC -- very nice. The fact that no plans change gives me confidence the refactoring is good.

I took the liberty of pushing a commit that removed a chance to the datafusion-testing pin:

Screenshot 2025-01-21 at 4 26 20 PM

But otherwise I think this is ready to merge. I will leave it open a bit longer to give @berkaysynnada a chance to comment if they want.

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some random checks, lgtm

@berkaysynnada berkaysynnada merged commit 274e535 into apache:main Jan 22, 2025
27 checks passed
@berkaysynnada berkaysynnada deleted the move-enforce-sorting-to-new-crate branch January 22, 2025 11:32
@berkaysynnada
Copy link
Contributor

LGTM, thank you @buraksenn. I'll continue with the #14190

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move EnforceSorting into datafusion-physical-optimizer crate
4 participants