-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IGNITE-23079 Make Raft storages destruction durable #4987
base: main
Are you sure you want to change the base?
Conversation
dfc7dd9
to
91a601e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice first attempt to approach the issue, but it seems to be more complicated than I thought before.
- It would be much better to use a single intent per group, not 2 per group
- Log storage factories should not be aware about destroying something on start (at least, at seems that we could avoid this). Current approach duplicates this logic in the log storage factories
How about trying to pull the knowledge about 'to which log storage factory to route the destruction request on start' higher (to raft server, for example?). We could introduce GroupStoragesDestructionIntent
with one method like toIntentIdBytes()
. For CMG and MS there would be one implementation (a class with just groupId in it); for partition it would be another implementation (with groupId and isVolatile
flag in it). We would then register resolvers with raft server; the resolvers would resolve intent ID bytes to GroupStorageDestructionIntents
and to the corresponding destroyers. Raft server would just iterate the destruction event storage, obtain destroyers and call them.
@@ -113,6 +114,9 @@ | |||
public class JraftServerImpl implements RaftServer { | |||
private static final IgniteLogger LOG = Loggers.forClass(JraftServerImpl.class); | |||
|
|||
/** Prefix to save destroy storage intents in {@link DestroyStorageIntentStorage}. */ | |||
private static final String DESTROY_PREFIX = "jraftServer"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static final String DESTROY_PREFIX = "jraftServer"; | |
private static final String META_AND_SNAPSHOT_DESTROY_INTENT = "metaAndSnapshot"; |
Because it is about destroying Raft meta and snapshot storages
@@ -573,15 +584,18 @@ public boolean stopRaftNodes(ReplicationGroupId groupId) { | |||
|
|||
@Override | |||
public void destroyRaftNodeStorages(RaftNodeId nodeId, RaftGroupOptions groupOptions) { | |||
// TODO: IGNITE-23079 - improve on what we do if it was not possible to destroy any of the storages. | |||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both intents should be saved in the very beginning, and this should be done atomically, in one operation. Please use something like vault.putAll()
under the hood
modules/raft/src/main/java/org/apache/ignite/internal/raft/server/impl/JraftServerImpl.java
Outdated
Show resolved
Hide resolved
modules/raft/src/main/java/org/apache/ignite/internal/raft/server/impl/JraftServerImpl.java
Show resolved
Hide resolved
modules/raft/src/main/java/org/apache/ignite/internal/raft/server/impl/JraftServerImpl.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/ignite/raft/jraft/storage/impl/VaultDestroyStorageIntentStorage.java
Outdated
Show resolved
Hide resolved
return new String(key, offset, key.length - offset, UTF_8); | ||
} | ||
|
||
private static ByteArray prefixByFactoryName(String factoryName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it about factory name? It seems to be an abstract prefix/identifier (as it now accepts not just factory names, but also server data path constant)
...rc/main/java/org/apache/ignite/raft/jraft/storage/impl/VaultDestroyStorageIntentStorage.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/ignite/raft/jraft/storage/impl/VaultDestroyStorageIntentStorage.java
Outdated
Show resolved
Hide resolved
modules/vault/src/main/java/org/apache/ignite/internal/vault/VaultManager.java
Show resolved
Hide resolved
import org.apache.ignite.internal.raft.server.RaftGroupOptions; | ||
import org.apache.ignite.internal.replicator.ReplicationGroupId; | ||
|
||
/** Persists and retrieves intent to complete storages destruction on node start. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** Persists and retrieves intent to complete storages destruction on node start. */ | |
/** Persists and retrieves intents to destroy Raft group storages. */ |
/** Add configurer for CMG or metastorage raft storages. */ | ||
void addGroupOptionsConfigurer(ReplicationGroupId groupId, RaftGroupOptionsConfigurer groupOptionsConfigurer); | ||
|
||
/** Add configurer for partitions raft storages. */ | ||
void addPartitionGroupOptionsConfigurer(RaftGroupOptionsConfigurer partitionRaftConfigurer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods should not be on the interface. Ideally, the 3 configurers should be passed via constructor. If it's not possible, then just the implementation based on Vault should have some way to pass the configurers.
void addPartitionGroupOptionsConfigurer(RaftGroupOptionsConfigurer partitionRaftConfigurer); | ||
|
||
/** Save intent to destroy raft storages. */ | ||
void saveDestroyStorageIntent(RaftNodeId nodeId, RaftGroupOptions groupOptions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From groupOptions
, the implementation of this method only uses isVolatile
flag, so it seems that the whole object should not be passed in
void removeDestroyStorageIntent(String nodeId); | ||
|
||
/** Returns group options needed to destroy raft storages, mapped by node id represented by String. */ | ||
Map<String, RaftGroupOptions> readGroupOptionsByNodeIdForDestruction(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it happen that the number of groups to destroy is huge? Whould it make sense to have a streaming API here?
import org.apache.ignite.internal.vault.VaultEntry; | ||
import org.apache.ignite.internal.vault.VaultManager; | ||
|
||
/** Uses VaultManager to destroy raft group storages durably, using vault to store destruction intents. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** Uses VaultManager to destroy raft group storages durably, using vault to store destruction intents. */ | |
/** Uses VaultManager to store destruction intents. */ |
|
||
private static final String PARTITION_GROUP_NAME = "partition"; | ||
|
||
private static final int RAFT_GROUPS = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not groups but something like spaces
|
||
@Override | ||
public void saveDestroyStorageIntent(RaftNodeId nodeId, RaftGroupOptions groupOptions) { | ||
String configurerName = nodeId.groupId() instanceof PartitionGroupId ? PARTITION_GROUP_NAME : nodeId.groupId().toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems too much for this class to know a difference between partitions and not partitions; it should not know about partitions at all. Probably the method could accept an object that would serialize the 'intent' (or just build the intent object)
|
||
String nodeId = raftNodeIdFromKey(next.key().bytes()); | ||
|
||
// todo add serializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unresolved TODO
// todo add serializer | ||
DestroyStorageIntent intent = ByteUtils.fromBytes(next.value()); | ||
|
||
RaftGroupOptions groupOptions = intent.isVolatile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that we build group options just to conform to the 'group options configurer' abstraction, we are making a loop.. But mybe we should change this later (probably with revisiting the 'group options configurer' abstraction itself)
# Conflicts: # modules/raft/build.gradle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there!
@@ -21,6 +21,9 @@ | |||
* A {@link ReplicationGroupId} which corresponds to partition of a partitioned object. | |||
*/ | |||
public interface PartitionGroupId extends ReplicationGroupId { | |||
/** Used for durable destruction purposes. */ | |||
String PARTITION_GROUP_NAME = "partition"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that 'partition group ID' should know anything about destruction. Can we declare it elsewhere?
new NoOpFailureManager() | ||
new NoOpFailureManager(), | ||
new NoopGroupStoragesDestructionIntents(), | ||
new GroupStoragesContextResolver(Objects::toString, Map.of(), Map.of()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to hide this in a no-arg constructor (test-only) to avoid copying this over and over?
verify(logStorageFactories.get(0), times(0)).destroyLogStorage(anyString()); | ||
} | ||
|
||
private void destroyStorageFails( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private void destroyStorageFails( | |
private void attemptDestroyStoragesAndExpectFailure( |
|
||
restartServer(); | ||
|
||
assertFalse(Files.exists(nodeDataPath)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assert that this directory exists before an attempt to destroy storages - just to make sure we are looking at the correct directory
Map<String, LogStorageFactory> logStorageFactoryByGroupName = Map.of( | ||
PARTITION_GROUP_NAME, partitionsLogStorageFactory, | ||
CmgGroupId.INSTANCE.toString(), partitionsLogStorageFactory, | ||
MetastorageGroupId.INSTANCE.toString(), partitionsLogStorageFactory | ||
); | ||
|
||
Map<String, Path> serverDataPathByGroupName = Map.of( | ||
PARTITION_GROUP_NAME, workingDir.basePath(), | ||
CmgGroupId.INSTANCE.toString(), workingDir.basePath(), | ||
MetastorageGroupId.INSTANCE.toString(), workingDir.basePath() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this 'partitions, metastorage, cmg' specifics in a test that is about Raft and nothing more?
// This destroys both meta storage and snapshots storage as they are stored under serverDataPath. | ||
IgniteUtils.deleteIfExists(serverDataPath); | ||
// This destroys both meta storage and snapshots storage as they are stored under nodeDataPath. | ||
IgniteUtils.deleteIfExists(getServerDataPath(context.serverDataPath(), nodeId)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might oversee an exception here. We probably need to do the following:
- If the directory doesn't exist, we do nothing
- Otherwise, we try to delete it
- If an exception happens, we rethrow it
import org.jetbrains.annotations.Nullable; | ||
|
||
/** Contains {@link DestroyStorageIntent}, server data path and {@link LogStorageFactory} for storage destruction. */ | ||
public class DestroyStorageContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class DestroyStorageContext { | |
public class StoragesDestructionContext { |
import org.apache.ignite.internal.tostring.S; | ||
|
||
/** Intent to destroy raft node's group storages. */ | ||
public class DestroyStorageIntent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class DestroyStorageIntent { | |
public class StoragesDestructionIntent { |
} | ||
|
||
private static byte[] toStorageBytes(DestroyStorageIntent intent) { | ||
try (IgniteUnsafeDataOutput out = new IgniteUnsafeDataOutput(INITIAL_BUFFER_CAPACITY)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use a versioned serializer
@@ -1186,6 +1200,26 @@ public class IgniteImpl implements Ignite { | |||
publicCatalog = new PublicApiThreadingIgniteCatalog(new IgniteCatalogSqlImpl(sql, distributedTblMgr), asyncContinuationExecutor); | |||
} | |||
|
|||
private GroupStoragesContextResolver groupStoragesContextResolver() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private GroupStoragesContextResolver groupStoragesContextResolver() { | |
private GroupStoragesContextResolver createGroupStoragesContextResolver() { |
https://issues.apache.org/jira/browse/IGNITE-23079