HBASE-29240: Backup ancestry trees should have the ability to be invalidated #6891
Conversation
  table.startBackupExclusiveOperation();
  table.invalidateTableAncestry(tableName);
} finally {
  table.finishBackupExclusiveOperation();
Any exceptions will bubble up to the caller. If modern backups are enabled and we fail to invalidate the table ancestry, it will effectively block the truncation. This seems correct; otherwise we'd be stuck in a situation where the table has been truncated, and the operator creating a new backup has no way of knowing at backup creation time.
Sounds good. And to add to that: if the truncation were to fail at some other point (after we already invalidated the backup), there's no harm done. The only effect is that a full backup will be done where an incremental one would have sufficed.
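To make the pattern under discussion concrete, here is a minimal sketch of the hook shape (not the exact PR code): invalidateTableAncestry is the method added by this PR, the lock calls come from BackupSystemTable, and exceptions are deliberately allowed to propagate so a failed invalidation aborts the truncation.

@Override
public void preTruncateTable(ObserverContext<MasterCoprocessorEnvironment> ctx,
  TableName tableName) throws IOException {
  try (BackupSystemTable table = new BackupSystemTable(ctx.getEnvironment().getConnection())) {
    table.startBackupExclusiveOperation();
    try {
      table.invalidateTableAncestry(tableName);
    } finally {
      // Always release the exclusive lock, even when invalidation throws;
      // the exception still bubbles up and blocks the truncation.
      table.finishBackupExclusiveOperation();
    }
  }
}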
import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.CONFIG)
public class BackupMasterObserver implements MasterObserver, MasterCoprocessor {
As of this jira, all backup-related classes were moved to their own module. This means hbase-server doesn't know about the BackupSystemTable. This is a workaround, as we can't bake this logic right into the truncation command directly.
Force-pushed f772da1 to 0b47bb4
Taking a look at javac issues + unit test failures. Edit: done.
DieterDP-ng left a comment
Promising PR. Might need to align the observer with HBASE-29003, depending on which one gets merged first.
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.CONFIG)
public class BackupMasterObserver implements MasterObserver, MasterCoprocessor {

  private final Optional<MasterObserver> observer;
This feels silly. I think the general trend for Observers is to just create the Optional in the getXObserver method.
I was looking to avoid additional allocations, but I'm happy to follow the existing pattern.
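For reference, the pattern being referred to looks roughly like this (a sketch of the accessor-based approach, rather than caching the Optional in a field):

@Override
public Optional<MasterObserver> getMasterObserver() {
  // Build the Optional on each call instead of storing it in a field;
  // the allocation cost is negligible for how rarely this is invoked.
  return Optional.of(this);
}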
}

@Override
public void preTruncateTable(ObserverContext<MasterCoprocessorEnvironment> ctx,
Should add a check whether the backup system is enabled or not:
if (!BackupManager.isBackupEnabled(cfg)) {
  LOG.debug("Skipping preTruncateTable hook since backup is disabled");
  return;
}
}

public void invalidateTableAncestry(TableName toInvalidate) throws IOException {
  List<BackupInfo> fullTableBackups = getCompletedFullBackups();
What would happen if a (full) backup was running when this method is called? The running backup would not be invalidated, since it's not a completed backup yet (per this line).
I see 2 options (a rough sketch of the first follows below):
- Also have the observer trigger this method in a postTruncate hook
- Disallow truncates during a backup cycle
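A rough sketch of the first option, assuming the MasterObserver post-truncate hook is used to re-run the same invalidation (names mirror the PR's preTruncateTable hook, but this exact code is not part of the PR):

@Override
public void postTruncateTable(ObserverContext<MasterCoprocessorEnvironment> ctx,
  TableName tableName) throws IOException {
  // Re-run the invalidation after the truncate completes, so a full backup that
  // was still in flight when preTruncateTable ran also loses its ancestry.
  try (BackupSystemTable table = new BackupSystemTable(ctx.getEnvironment().getConnection())) {
    table.invalidateTableAncestry(tableName);
  }
}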
for (BackupInfo backupInfo : fullTableBackups) {
  // to minimize the amount of mutations against the backup system table, we only
  // need to update full backups that have currently have a valid ancestry line
Typo: "that have currently have"
List<Put> invalidatePuts = new ArrayList<>(fullTableBackups.size());

for (BackupInfo backupInfo : fullTableBackups) {
  // to minimize the amount of mutations against the backup system table, we only
We also only need to consider the most recent FULL backup per backup root.
Makes sense, added an additional check to make sure we aren't adding Puts for older full backups
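A sketch of how that check could look, assuming the history list is sorted newest-first and that BackupInfo exposes its backup root dir; createInvalidationPut is a hypothetical stand-in for whatever Put-building the PR actually does:

// Sketch: the first FULL backup seen per backup root is the most recent one,
// so older full backups in the same root never need a Put.
Set<String> handledRoots = new HashSet<>();
for (BackupInfo backupInfo : fullTableBackups) {
  if (backupInfo.getType() != BackupType.FULL || !backupInfo.getTables().contains(toInvalidate)) {
    continue;
  }
  if (handledRoots.add(backupInfo.getBackupRootDir())) {
    // First (i.e. most recent) full backup of this table in this root: mark it.
    invalidatePuts.add(createInvalidationPut(backupInfo)); // hypothetical helper
  }
}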
  FullTableBackupClient.class);

// Release the lock created by initializing the TableBackupClient
backupSystemTable.finishBackupExclusiveOperation();
Can you clarify why this step is needed?
Sure, both TableBackupClient implementations have a weird UX that basically expects you to call execute after instantiating them. This is because the TableBackupClient calls an init method that starts a backup session. My test is just checking which class was instantiated, so we never implicitly (via execute) finish the backup exclusive operation.
I can add this explanation to the comment, if that would be helpful
Would be helpful. Or even better would be to move that lock creation in the TableBackupClient to a location that does make sense.
optional uint32 workers_number = 11;
optional uint64 bandwidth = 12;
map<string, RSTimestampMap> table_set_timestamp = 13;
optional bool isInvalidAncestry = 14;
I find this name somewhat cryptic. It's also tightly knit to this specific usecase. I can imagine that we want to use the same mechanism to let HBase know that a specific backup root isn't getting further updates (so it can stop keeping track of all the incremental stuff for that root).
So I suggest naming this "disallowFurtherIncrementals" or something similar.
Makes sense, thanks for the suggestion
optional uint32 workers_number = 11;
optional uint64 bandwidth = 12;
map<string, RSTimestampMap> table_set_timestamp = 13;
optional bool isInvalidAncestry = 14;
As mentioned in the comment above, the fact that a backup chain (= one full backup followed by zero or more incrementals in the same backup root) has been closed (by invalidation), also implies that there's no more need to track changes for a new incremental backup of that chain.
In other words, BackupHFileCleaner would also benefit from being updated in this PR.
As I see it, the BackupHFileCleaner already cleans up bulkloads for any tables that have been fully backed up at any point. The list of TableNames is fetched using this method. I have two thoughts:
- This seems... wrong? I would suspect that we'd want to only include names of tables that have a full backup as their most recent backup. Additionally, this method doesn't seem to take into account that a table could be part of multiple backup sets. Let me know what you think here.
- Are you saying we need to account for a full backup that is still running, but has had its table truncated? So we need to update the method to deal with that "race condition"?
This seems... wrong? I would suspect that we'd want to only include names of tables that have a full backup as their most recent backup
For example, given that we're only pulling for completed backups, how do we know there isn't an incremental backup running at this time?
This seems... wrong? I would suspect that we'd want to only include names of tables that have a full backup as their most recent backup. Additionally, this method doesn't seem to take into account that a table could be part of multiple backup sets.
BackupHFileCleaner indeed doesn't work properly for different backup roots, that's logged as HBASE-28706.
Besides that, I'm not aware of any issue. Can you elaborate what you think goes wrong? The BackupHFileCleaner effectively blocks the cleanup of HFiles that are registered in the backupTable. When an incremental or full backup completes, those registrations get removed. There's no issue if an incremental backup is running while BackupHFileCleaner is executing.
Are you saying we need to account for a full backup that is still running, but has had its table truncated? So we need to update the method to deal with that "race condition"?
(I presume you meant you didn't understand my starting statement of this thread, and aren't referring to this other comment)
I'm saying that the BackupHFileCleaner should allow the deletion of HFiles that were registered for a backup that has been invalidated. Off the top of my head, I see 2 options:
- When invalidating a backup, we should also delete the HFile registrations.
- The BackupHFileCleaner should ignore tables of a backup root if the most recent backup on that root is invalidated. (Looks trickier to implement, since it doesn't correctly support multi-roots yet.)
Can you elaborate what you think goes wrong?
I might be wrong, but it seems this could be problematic if there's an ongoing incremental backup at the time we try to delete the files.
Say you have a table that has been successfully fully backed up, and is now in the process of executing an incremental backup. From my understanding, the cleaner will denote a file as deletable if the file belongs to a bulkload of a table that was fully backed up at any point in time. So I could see a case where you delete the bulkloaded files before you're able to back them up incrementally. Please let me know if I'm misguided here.
I'm saying that the BackupHFileCleaner should allow the deletion of HFiles that were registered for a backup that has been invalidated
Agreed with your thought here.
Ah, I see the confusion. Let me clarify.
The bulkloads in the backup table don't specify what can be deleted, they specify what cannot be deleted. The BackupHFileCleaner blocks the deletion of those files by not returning them in the getDeletableFiles method (a simplified sketch follows after the timeline below).
And to clarify the timeline of those bulkloads in the backup table:
- If an HBase bulkload is done for a table that has a registered FULL backup in the backup table, that bulkload is registered. BackupHFileCleaner now protects that file from being deleted.
- At a point in the future, an incremental backup is started. That incremental backup will include the data of all registered bulkloads. After completion, the bulkload registration is deleted. So now the BackupHFileCleaner allows for deletion of the file.
- There is one race condition, namely if an HBase bulkload is done just after the BackupHFileCleaner#loadHFileRefs has completed. This is mitigated by the use of the prevReadFromBackupTbl and secondPrevReadFromBackupTbl timestamps in BackupHFileCleaner.
Ah yes, that makes sense to me. I'm caught up now, thank you.
Updated so that disallowing further incrementals (sorry for the mid-convo terminology switchup) will also clean up the HFile registrations
Force-pushed 240c6a6 to c21ac2d
BackupSystemTable table = new BackupSystemTable(env.getConnection());
try {
  table.startBackupExclusiveOperation();
Does it make more sense to include lock acquisition/release inside the disallowFurtherIncrementals method? That's not the pattern we follow now, but I wonder if we should, to make sure each caller isn't required to take out the lock.
Hmm, do we actually need to use the lock here? Thinking out loud...
Yes: this prevents a truncation happening while a backup is running, which might cause incomplete data to be stored. In that case, it makes sense that deleteBulkLoads is also included in this lock. So the lock should stay outside of disallowFurtherIncrementals.
No: the lock system seems intended to prevent multiple backup operations from overlapping. Using it here would mean it also impacts a non-backup-related system. Is it really bad if a truncate happens during a backup? How is it different from a bulk-load or regular mutations happening during a backup (which are allowed)?
Currently leaning towards not needing a lock here.
Force-pushed 9d1ca12 to a9d1c2a
🎊 +1 overall (this message was automatically generated)
@DieterDP-ng I think I've addressed all your comments here, please let me know if there's anything else you think needs addressing.
DieterDP-ng left a comment
I also suggest rebasing on current master, since HBASE-29003 has been merged.
  throw new IncrementalBackupsDisallowedException(request);
}

LOG.info("Incremental backups disallowed for backupId {}, creating a full backup",
Nit: it might be unclear what the backupId refers to.
  }
}

private static String getLatestFullBackupId(Connection conn, BackupRequest request)
The log line above made me realize this approach is wrong.
An incremental backup might build on top of multiple FULL backups: it includes all tables for which a FULL backup exists in the backup root. For example, given tables T1, T2, T3:
- Timestamp 0: Full backup: T1, T2
- Timestamp 1: Full backup: T1, T3
- Timestamp 2: Incr backup: T1, T2, T3
If we were to truncate T2, the full backup at time 0 would get the disallowedIncremental marker.
But this method would only check the full backup at time 1. Correct behavior would be to check all FULL backups (from newest to oldest) until all tables present in the backup root are covered (rough sketch below). TableBackupClient#getAncestors might be useful here.
Realizing the complexity here, we do have 2 options:
- Current approach - meaning that creating a single FULL backup of T2 would re-enable incremental backups.
- Somewhat simpler approach, though with more code changes - we could move the "disallow incrementals" to the backup-root-level data (similar to e.g. backupStartCode). This would mean only a single GET has to be done in order to find out whether or not incrementals are allowed.
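For illustration, a sketch of the "walk all covering FULL backups" check described above. Helper and parameter names are illustrative; isDisallowFurtherIncrementals is the flag introduced in this PR.

private static boolean incrementalsAllowed(List<BackupInfo> fullBackupsNewestFirst,
  Set<TableName> tablesInRoot) {
  // Walk full backups from newest to oldest until every table in the backup root is
  // covered; if any covering full backup has been invalidated, disallow incrementals.
  Set<TableName> uncovered = new HashSet<>(tablesInRoot);
  for (BackupInfo full : fullBackupsNewestFirst) {
    if (Collections.disjoint(uncovered, full.getTables())) {
      continue; // contributes nothing new to the ancestry
    }
    if (full.isDisallowFurtherIncrementals()) {
      return false; // part of the ancestry has been invalidated
    }
    uncovered.removeAll(full.getTables());
    if (uncovered.isEmpty()) {
      return true; // every table in the root is covered by a valid full backup
    }
  }
  return false; // some table has no full backup at all
}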
}

/**
 * @param toDisallow Any most recent full back containing this table will be marked as disallowing
This line makes little sense. Typos?
// need to update the most recent full backups that allow incremental backups
if (
  backupInfo.getTables().contains(toDisallow) && backupInfo.getType() == BackupType.FULL
    && !backupInfo.isDisallowFurtherIncrementals()
This check can be moved to an inner check. In case we truncate the same table multiple times in a row, current behavior will keep adjusting older and older backupInfos.
  }
}

private List<BackupInfo> getCompletedFullBackupsSortedByHistoryDesc() throws IOException {
Better to re-use BackupSystemTable#getBackupInfos.
Note that the backupInfos are sorted naturally due to the scanning order.
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class IncrementalBackupsDisallowedException extends HBaseIOException {
  public IncrementalBackupsDisallowedException(BackupRequest request) {
For helping API users deal with this exception, I'd put the table list as an accessible field in the exception.
So a user could do:
try {
  makeHbaseIncrBackup();
} catch (IncrementalBackupsDisallowedException e) {
  makeHbaseFullBackup(e.tables());
  makeHbaseIncrBackup();
}
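A sketch of what the exception could look like to support that usage; the field and accessor names are illustrative, not part of the current diff, and it assumes BackupRequest#getTableList returns the requested tables.

@InterfaceAudience.Public
@InterfaceStability.Evolving
public class IncrementalBackupsDisallowedException extends HBaseIOException {
  private final List<TableName> tables;

  public IncrementalBackupsDisallowedException(BackupRequest request) {
    super("Incremental backups are currently disallowed for tables " + request.getTableList());
    this.tables = request.getTableList();
  }

  /** Tables for which a new full backup is required before incrementals can resume. */
  public List<TableName> tables() {
    return tables;
  }
}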
 * @return sorted list of BackupCompleteData
 */
public static ArrayList<BackupInfo> sortHistoryListDesc(ArrayList<BackupInfo> historyList) {
public static ArrayList<BackupInfo> sortHistoryListDesc(List<BackupInfo> historyList) {
I think this method can just be removed. The list coming from BackupSystemTable#getBackupInfos is already sorted due to the scanning order. (If I'm mistaken here, the javadoc for getBackupInfos should be updated.)
public void testDisallowFurtherIncrementals() throws Exception {
  try (BackupSystemTable table = new BackupSystemTable(conn)) {
    TableName toInvalidate = TableName.valueOf("t1");
    List<TableName> t1 = Lists.newArrayList(toInvalidate, TableName.valueOf("t2"));
Your naming of the lists is a bit confusing. Suggestion: t1_t2, t1_t3, ...