[AMORO-4037] Add support for dynamic refresh interval for table metadata refreshing in TableRuntimeRefreshExecutor#4038
Conversation
60e829e to
298152d
Compare
|
@zhoujinsong @turboFei @xxubai Could you help me review it when you are free? Thanks! |
298152d to
4fec76a
Compare
There was a problem hiding this comment.
Pull request overview
This PR implements a dynamic refresh interval feature for table metadata refreshing in TableRuntimeRefreshExecutor. The feature uses an AIMD (Additive Increase Multiplicative Decrease) algorithm to adjust refresh intervals based on whether tables need optimization, helping to reduce resource consumption for healthy tables while maintaining responsiveness for tables requiring optimization.
Changes:
- Added configuration properties for adaptive refresh intervals with configurable maximum interval and increase step
- Implemented adaptive interval calculation logic that increases intervals for healthy tables and decreases them for tables needing optimization
- Added comprehensive test coverage for the adaptive interval functionality
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| docs/user-guides/configurations.md | Added documentation for three new configuration properties related to adaptive refresh intervals |
| amoro-format-iceberg/src/main/java/org/apache/amoro/table/TableProperties.java | Defined constants for adaptive refresh table properties and their default values |
| amoro-common/src/main/java/org/apache/amoro/config/OptimizingConfig.java | Added fields, getters, setters, and updated equals/hashCode/toString methods for adaptive refresh configuration |
| amoro-ams/src/main/java/org/apache/amoro/server/table/TableConfigurations.java | Integrated parsing of adaptive refresh properties from table properties into OptimizingConfig |
| amoro-ams/src/main/java/org/apache/amoro/server/table/DefaultTableRuntime.java | Added fields and methods to track latest refresh interval and optimization evaluation status |
| amoro-ams/src/main/java/org/apache/amoro/server/scheduler/inline/TableRuntimeRefreshExecutor.java | Implemented core adaptive interval logic with AIMD algorithm and updated execute method to apply adaptive intervals |
| amoro-ams/src/test/java/org/apache/amoro/server/scheduler/inline/TestTableRuntimeRefreshExecutor.java | Added comprehensive test cases covering various adaptive interval scenarios including boundary conditions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
16b59c2 to
3298bcf
Compare
klion26
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I left some comments; please let me know what you think. thanks.
IMHO, this feature is valuable(especially when users use HMS as the catalog), as we can't use the event-based trigger as RestCatalog, but when there are many "static table" in the catalog, it will be a great burden for AMS to evaluate these tables, and changing the interval for every table is not a doable(as there are so many tables), and keep the default behavior as the same as before, would not add any inconvenience if we don't need it.
docs/user-guides/configurations.md
Outdated
| | self-optimizing.min-plan-interval | 60000 | The minimum time interval between two self-optimizing planning action | | ||
| | self-optimizing.filter | NULL | Filter conditions for self-optimizing, using SQL conditional expressions, without supporting any functions. For the timestamp column condition, the ISO date-time formatter must be used. For example: op_time > '2007-12-03T10:15:30'. | | ||
| | self-optimizing.refresh-table.adaptive.enabled | false | Whether to enable adaptive refresh interval for refreshing table metadata | | ||
| | self-optimizing.refresh-table.adaptive.max-interval | 3600000(1 hour) | The maximum time interval in milliseconds to refresh table metadata | |
There was a problem hiding this comment.
Maybe we can set the max-interval to an unset value, so that we can keep the behavior same as before the current pr.
| public static final int SNAPSHOT_MIN_COUNT_DEFAULT = 1; | ||
|
|
||
| public static final String SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_ENABLED = | ||
| "self-optimizing.refresh-table.adaptive.enabled"; |
There was a problem hiding this comment.
If we need to set self-optimizing.refresh-table.adaptive.enabled and self-optimizing.refresh-table.adaptive.max-interval, how about using only self-optimizing.refresh-table.adaptive.max-interval? Some value (like -1 or zero) means that disable it, and some other value means that to enable it.
| public static final boolean SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_ENABLED_DEFAULT = false; | ||
| public static final String SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL = | ||
| "self-optimizing.refresh-table.adaptive.max-interval"; | ||
| public static final long SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_DEFAULT = |
There was a problem hiding this comment.
How about add the time unit in the name, SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_DEFAULT -> SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_MS_DEFAULT
| if (optimizingConfig.isMetadataBasedTriggerEnabled() | ||
| && !MetadataBasedEvaluationEvent.isEvaluatingNecessary( | ||
| optimizingConfig, table, tableRuntime.getLastPlanTime())) { | ||
| tableRuntime.setLatestEvaluatedNeedOptimizing(false); |
There was a problem hiding this comment.
Is it better that we change the return value of tryEvaluatingPendingInput to boolean -- indicate whether we need to optimize after evaluating?
So that we can call tableRuntime.setLatestEvaluteNeedOptimizing in one place in execute below,
Please let me what do you think about it.
...-ams/src/main/java/org/apache/amoro/server/scheduler/inline/TableRuntimeRefreshExecutor.java
Show resolved
Hide resolved
| executor.execute(mockTableRuntime); | ||
|
|
||
| // Verify that setLatestRefreshInterval was called with the expected value | ||
| verify(mockTableRuntime, Mockito.times(1)).setLatestRefreshInterval(expectedInterval); |
There was a problem hiding this comment.
Instead using verify here to validate, is there any way we can validate the value of the true interval after executor.execute in L147.
The true interval is the value keeped in the table runtime.
|
|
||
| // Test unhealthy table (need optimizing) - interval should decrease | ||
| mockTableRuntime = | ||
| getMockTableRuntimeWithAdaptiveInterval(tableRuntime, true, true, INTERVAL * 2); |
There was a problem hiding this comment.
Is there a way to reuse this tableRuntime so we can observe the change in interval before and after two consecutive execute calls? Reusing a single tableRuntime function would also be more suitable for production scenarios.
1a658f8 to
4f0ca86
Compare
4f0ca86 to
72aad3b
Compare
klion26
left a comment
There was a problem hiding this comment.
Thanks for the update, I've left some comments, please take another look.
| tableRuntime.getTableIdentifier()); | ||
| return false; | ||
| } else { | ||
| logger.debug( |
There was a problem hiding this comment.
Could we add some comments to describle the "semantics" of the return value, and why would this case would be true?
There was a problem hiding this comment.
From below, the name is needOptimizing, looks like whethere we need to do an opitmizing, but in this branch, we don't need to optimizing in current loop? as there is an ongoing optimizing process
| } | ||
|
|
||
| // Update adaptive interval according to evaluated result. | ||
| if (defaultTableRuntime.getOptimizingConfig().getRefreshTableAdaptiveMaxIntervalMs() > 0) { |
There was a problem hiding this comment.
Maybe we can add a function in OptimizingConfig which return true/false show whethere we enabled this feature
docs/user-guides/configurations.md
Outdated
| | self-optimizing.full.rewrite-all-files | true | Whether full optimizing rewrites all files or skips files that do not need to be optimized | | ||
| | self-optimizing.min-plan-interval | 60000 | The minimum time interval between two self-optimizing planning action | | ||
| | self-optimizing.filter | NULL | Filter conditions for self-optimizing, using SQL conditional expressions, without supporting any functions. For the timestamp column condition, the ISO date-time formatter must be used. For example: op_time > '2007-12-03T10:15:30'. | | ||
| | self-optimizing.refresh-table.adaptive.max-interval-ms | -1 | The maximum time interval in milliseconds to refresh table metadata. The default value is -1, which disables the adaptive refresh. | |
There was a problem hiding this comment.
Maybe we need to add some description here: after this has been enabled, the interval may be larger than defaultTableRuntime.getOptimizingConfig().getMinorLeastInterval() * 4L / 5
| createDatabase(); | ||
| createTable(); | ||
|
|
||
| DefaultTableRuntime tableRuntime = |
There was a problem hiding this comment.
Could you please reuse the same table runtime, in the current test, we'll create every new table runtime in buildTableRuntimeWithAdaptiveRefresh
There was a problem hiding this comment.
If we need to update the config for an existing table runtime, maybe we can add a new class who extend DefaultTableRuntime and overrides the getStore function.
class TestTableRuntime extends DefaultTableRuntime {
TableRuntimeStore testStore;
TestTableRuntime(TableRuntimeStore store) {
...
this.testStore = store
}
void updateRuntimeStore(TableRuntimeStore newStore) {
...
this.testStore = newStore;
}
@Override
TableConfiguration getStore() {
this.testStore;
}
| // Test minimum boundary - interval should not below INTERVAL | ||
| // The unhealthy table (need optimizing) - interval should decrease half | ||
| // The currentInterval is INTERVAL + STEP, the latest interval is INTERVAL not (INTERVAL + STEP) | ||
| // / 2 |
There was a problem hiding this comment.
Place (INTERVAL + STEP) / 2 in the same line is better than 2 lines
808f66b to
fbbd84f
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #4038 +/- ##
============================================
- Coverage 28.70% 28.68% -0.02%
- Complexity 3944 3951 +7
============================================
Files 656 654 -2
Lines 52382 52385 +3
Branches 6642 6630 -12
============================================
- Hits 15035 15026 -9
- Misses 36249 36265 +16
+ Partials 1098 1094 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
fbbd84f to
5955420
Compare
| return this; | ||
| } | ||
|
|
||
| public boolean isRefreshTableAdaptiveEnabled(long minInterval) { |
There was a problem hiding this comment.
Why do we need to pass in this parameter? will compare the refreshTableAdaptiveMaxIntervalMs to a default value in the current class be better?
The logic here seems is different than the doc below The maximum time interval in milliseconds to refresh table metadata. 0 means disable adaptive refresh.?
1e01c3f to
08871a7
Compare
…g in the TableRuntimeRefreshExecutor.
…e.max-interval-ms
08871a7 to
5d564df
Compare
Why are the changes needed?
Close #4037.
Brief change log
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation