[FLINK-37021][state/forst] Fix incorrect paths when reusing files for checkpoints. #26040
@flinkbot run azure
// Try path-copying first. If failed, fallback to bytes-copying
StreamStateHandle targetStateHandle =
        tryPathCopyingToCheckpoint(sourceStateHandle, checkpointStreamFactory, stateScope);
tryPathCopyingToCheckpoint can fail by throwing an IOException as well as by returning null. I think it would be better to always return null on failure; otherwise, when an IOException is thrown, we never try bytesCopyingToCheckpoint.
Thanks for the comments. I've modified the code as suggested. PTAL
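The fallback discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the Copier interface, the class and method names, and the use of plain strings as handles are all hypothetical stand-ins for Flink's StreamStateHandle and CheckpointStreamFactory types. The point it shows is that every failure of the path copy, whether an exception or a null result, is normalized to null so the bytes copy is always attempted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch; none of these names are Flink APIs. */
final class CopyWithFallback {

    /** Hypothetical copy attempt that may throw or return null on failure. */
    interface Copier {
        String copy(String source) throws IOException;
    }

    /** Tries the cheap path copy and reports every kind of failure as null. */
    static String tryPathCopy(Copier pathCopier, String source) {
        try {
            return pathCopier.copy(source);
        } catch (IOException e) {
            // Normalize the exception to null so the caller checks one condition only.
            return null;
        }
    }

    /** Path copy first; falls back to the bytes copy whenever the path copy yields null. */
    static String copyToCheckpoint(Copier pathCopier, Copier bytesCopier, String source) {
        String handle = tryPathCopy(pathCopier, source);
        if (handle != null) {
            return handle;
        }
        try {
            return bytesCopier.copy(source);
        } catch (IOException e) {
            // In this sketch a failed bytes copy is fatal; wrap it to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a path copier that throws and one that returns null both lead to the bytes copier being called.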
filePath.getName());
LOG.trace("Write file to checkpoint: {}, {}", filePath, handleAndLocalPath.getHandle());
return handleAndLocalPath;
return HandleAndLocalPath.of(targetStateHandle, dbFilePath.getName());
}

/**
 * Duplicate file to checkpoint storage by calling {@link CheckpointStreamFactory#duplicate} if
 * possible.
nit: javadoc @params etc are missing.
Added params and returns in the javadoc
@Nullable ForStFlinkFileSystem forStFlinkFileSystem,
boolean isDbPathUnderCheckpointPathForSnapshot) {
DataTransferStrategy strategy;
if (forStFlinkFileSystem == null || isDbPathUnderCheckpointPathForSnapshot) {
if (sharingFilesStrategy != SnapshotType.SharingFilesStrategy.FORWARD_BACKWARD
I think for ForStIncrementalSnapshotStrategy, the SnapshotType.SharingFilesStrategy.FORWARD is not supported since it breaks the precondition of file sharing between CP and DB.
Indeed. I've added a check in ForStIncrementalSnapshotStrategy#asyncSnapshot(). It throws an IllegalArgumentException when encountering SharingFilesStrategy.FORWARD.
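A check of the kind described in this reply might look like the sketch below. It is an assumption-laden illustration: the enum is a local stand-in that only mirrors the constant names mentioned in the thread, and the class name, method name, and error message are hypothetical rather than taken from ForStIncrementalSnapshotStrategy.

```java
/** Hypothetical sketch of rejecting an unsupported sharing strategy up front. */
final class SharingStrategyCheck {

    /** Local stand-in for the strategy enum; not the Flink class itself. */
    enum SharingFilesStrategy {
        FORWARD_BACKWARD,
        FORWARD,
        NO_SHARING
    }

    /** Throws if the strategy would break file sharing between checkpoint and DB. */
    static void checkSupported(SharingFilesStrategy strategy) {
        if (strategy == SharingFilesStrategy.FORWARD) {
            // Message text is illustrative only.
            throw new IllegalArgumentException(
                    "Unsupported sharing files strategy: " + strategy);
        }
    }
}
```

Failing fast at the start of the snapshot keeps the unsupported case from reaching the file transfer logic at all.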
What is the purpose of the change
This pull request addresses several bugs in ForSt that may prevent proper file reuse, leading to slower or failed snapshot and restoration operations.
Brief change log
- DataTransferStrategyBuilder function properly.
- URI to tell whether two FileSystem are the same.
- SharingFilesStrategy is 'FORWARD_BACKWARD'
- dbFilePath and realSourcePath
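As an illustration of comparing file systems by URI, the sketch below treats two locations as belonging to the same file system when their scheme and authority match. That criterion is an assumption made for the example, and the class and method names are hypothetical; the PR may apply a different or stricter rule.

```java
import java.net.URI;

/** Hypothetical sketch: decide whether two URIs point at the same file system. */
final class FileSystemUris {

    /** Assumed criterion: same scheme and same authority, compared case-insensitively. */
    static boolean sameFileSystem(URI a, URI b) {
        return equalsIgnoreCase(a.getScheme(), b.getScheme())
                && equalsIgnoreCase(a.getAuthority(), b.getAuthority());
    }

    /** Null-safe comparison, since local file URIs usually have no authority. */
    private static boolean equalsIgnoreCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }
}
```

Under this assumption, two paths on the same HDFS namenode compare as equal, while an HDFS path and an S3 path do not.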
Verifying this change
This change is already covered by existing tests, such as (please describe tests).
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation