[Feature] Add remote maintainer framework for Paimon tables #4068
Draft
This commit introduces a framework for executing Paimon table maintenance operations (snapshot expiration, orphan file cleanup) remotely on Spark optimizers, following the existing Optimizer pattern.

Changes:
- Add MaintainerInput/Output interfaces and base implementations
- Add MaintainerExecutor/Factory interfaces for remote execution
- Create amoro-optimizer-paimon-spark module with SparkMaintainerExecutor
- Implement PaimonSnapshotExpire* components for snapshot expiration
- Add placeholder SparkOptimizer for future Paimon optimizing support

Co-Authored-By: Claude (glm-4.7) <noreply@anthropic.com>
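For readers who have not opened the diff, here is a minimal sketch of how the input/output/executor/factory pieces listed above might fit together. It is not the PR's code: every method signature, the `SnapshotExpireInput`/`SnapshotExpireOutput` classes, the `retain-max` and `expired-snapshots` keys, and the `run` helper are hypothetical, and the "expiration" is plain arithmetic standing in for a real Paimon call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: signatures below are assumptions, not taken from the PR.
public class MaintainerSketch {

  /** Describes one maintenance task shipped to a remote optimizer. */
  interface MaintainerInput {
    String tableIdentifier();
    Map<String, String> options();
  }

  /** Result reported back once the task has run. */
  interface MaintainerOutput {
    boolean success();
    Map<String, String> summary();
  }

  /** Runs a single maintenance task. */
  interface MaintainerExecutor<I extends MaintainerInput, O extends MaintainerOutput> {
    O execute(I input);
  }

  /** Creates executors on the optimizer side. */
  interface MaintainerExecutorFactory<I extends MaintainerInput, O extends MaintainerOutput> {
    MaintainerExecutor<I, O> createExecutor();
  }

  /** Assumed input for snapshot expiration: current snapshot count and how many to keep. */
  static final class SnapshotExpireInput implements MaintainerInput {
    private final String table;
    final int snapshotCount;
    final int retainMax;

    SnapshotExpireInput(String table, int snapshotCount, int retainMax) {
      this.table = table;
      this.snapshotCount = snapshotCount;
      this.retainMax = retainMax;
    }

    @Override
    public String tableIdentifier() {
      return table;
    }

    @Override
    public Map<String, String> options() {
      Map<String, String> options = new HashMap<>();
      options.put("retain-max", String.valueOf(retainMax)); // assumed option key
      return options;
    }
  }

  /** Assumed output: number of snapshots that were expired. */
  static final class SnapshotExpireOutput implements MaintainerOutput {
    final int expired;

    SnapshotExpireOutput(int expired) {
      this.expired = expired;
    }

    @Override
    public boolean success() {
      return true;
    }

    @Override
    public Map<String, String> summary() {
      return Collections.singletonMap("expired-snapshots", String.valueOf(expired));
    }
  }

  /** Factory whose executor only computes the count; a real one would call Paimon via Spark. */
  static final class SnapshotExpireExecutorFactory
      implements MaintainerExecutorFactory<SnapshotExpireInput, SnapshotExpireOutput> {
    @Override
    public MaintainerExecutor<SnapshotExpireInput, SnapshotExpireOutput> createExecutor() {
      return input ->
          new SnapshotExpireOutput(Math.max(0, input.snapshotCount - input.retainMax));
    }
  }

  /** Convenience helper: build the input, obtain an executor, run it. */
  static SnapshotExpireOutput run(String table, int snapshotCount, int retainMax) {
    return new SnapshotExpireExecutorFactory()
        .createExecutor()
        .execute(new SnapshotExpireInput(table, snapshotCount, retainMax));
  }

  public static void main(String[] args) {
    SnapshotExpireOutput out = run("db.tbl", 10, 3);
    System.out.println(out.summary());
  }
}
```

The factory indirection is what would let the server side stay unaware of Spark: under this assumption it only serializes an input and reads back an output, while the optimizer process picks the concrete executor.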
Contributor
Thanks for working on this. I'm wondering how this maintainer framework works with the current process/external process API. I notice there has been some Paimon-related work on this by @LiangDai-Mars and @baiyangtx. I think it's the right time to have a discussion on this. Curious what people think, @zhoujinsong. By the way, I'm open to having a new framework if it gives us better extensibility across multiple formats.
This PR introduces a framework for executing Paimon table maintenance operations (snapshot expiration, orphan file cleanup) remotely on Spark optimizers.
Changes:
Key Design: