Redis replication is a master-slave asynchronous replication, which is fast and simple to implement but has the following drawbacks
- When master crashes, there can be some replies which are processed at master and replied to clients but not sent to slaves yet. This means that acked changes of the data can be lost even though there are multiple slave replicators
- Slaves can be used for read offloading but there is a possibility of reading stale data
nbase-arc implemented a new generic consensus based state machine replication layer called SMR (State Machine Replicator) with following features
- Generic
- Replication is decomposed into replicator processes and client library. So SMR replication is not confined to Redis in application
- Consensus based
- Commands are committed to client for execution by replicator only when the commands are replicated to the configured number of slaves. SMR replication protocol is equivalent to multi-paxos in steady state. When master replicator fails, external oracle component called configuration master safely elects a new master.
- State machine replication
- A write operation that changes database are sent via replication log. Replication log is generated by the master replicator and sent to slave replicators.
Management Management
| |
+-----+------+ +-----+------+
| (Slave) +---------->+ (Master) |
| Replicator | | Replicator |
| | ------>+ |
+-----+------+ / | +-----+------+
LOGs | / | | LOGs
+-----+------+ / | +-----+------+
| |/ | | |
| Redis(lib) + +--+ Redis(lib) |
| | | |
+------------+ +------------+
There are four types of connections that a replicator can have.
- management : The clients are configuration master or management tools. This type of connection is used for liveness check and configuration change
- local : The client is a local Redis. This type of connection is used to notify commit sequence or configuration change
- slave : The clients are other slave replicators. This type of connection is used to transfer replication log file from master to slave
- client : The clients are all Redis instances. This type of connection is used for a client to send replication requests
Replication logs are fixed length files that contain replication stream sequentially. A data byte in the log stream is identified by a log sequence number (LSN) that starts with 0 and grows forever (up to 2**64). The name of the log file is the LSN of the first byte of the replication stream this log file contains.
A log file is composed of data part and metadata part. The data part is actual replication stream data that is composed of replication requests and other message (e.g. commit messages). The metadata part has log file a offset and data checksums.
A Redis request is replicated and processed through the following steps.
- Send request
- A Redis process sends a replication request to the master replicator via library call.
Redis(lib) Master Slave
| | | |
X--------->| | |
| | | |
- Transfer log
- The master replicator appends the replication request to the local log file and send it to the slave replicators.
Redis(lib) Master Slave
| | | |
| X-------->|->|
| | | |
- Log ack.
- Slave replicator gets the log stream and save the received portion to the local log file. LSN of the last message in the local log file is acknowledged to the master replicator.
Redis(lib) Master Slave
| | | |
| |<--------X--X
| | | |
- Commit
- Master replicator tracks the LSN of the replicators and determines the maximum log sequence number which is safe to commit. Commit policy is determined by the quorum value which means the number of slaves who have the log too. The commit log sequence number is sent to slaves as a message in the replication stream.
( Learner )
Redis(lib) Master Slave
| | | |
| X-------->|->|
| | | |
- Learn commit
- When a replicator learns the committed log sequence number it notifies the number to the local Reids via local connection.
( Learner )
Redis(lib) Replicator
| |
|<-------------X
| |