
Conversation

@silentred (Contributor) commented Aug 24, 2025

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: silentred
Once this PR has been reviewed and has the lgtm label, please assign siyuanfoundation for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot

Hi @silentred. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ahrtr (Member) commented Aug 26, 2025

This isn't correct. It can't prevent defragmentation from renaming the db file concurrently.

@ahrtr (Member) commented Aug 26, 2025

The defragmentation also needs to acquire this lock.

func (ms *maintenanceServer) Defragment(ctx context.Context, sr *pb.DefragmentRequest) (*pb.DefragmentResponse, error) {
	ms.lg.Info("starting defragment")
	ms.healthNotifier.defragStarted()
	defer ms.healthNotifier.defragFinished()
	err := ms.bg.Backend().Defrag()
	if err != nil {
		ms.lg.Warn("failed to defragment", zap.Error(err))
		return nil, togRPCError(err)
	}
	ms.lg.Info("finished defragment")
	return &pb.DefragmentResponse{}, nil
}

The (s *EtcdServer) Backend() is problematic: it just protects getting the backend instance; it can't protect the operation that follows (i.e. the defragmentation).

func (s *EtcdServer) Backend() backend.Backend {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	return s.be
}

@silentred (Contributor, Author) commented Aug 27, 2025

it just protects getting the backend instance

You are right.

I was thinking bemu should protect not only changes to the variable be but also the backend's underlying resources, e.g. the db file, readTx, and batchTx. So if we use a lock to protect

  • from OpenSnapshotBackend to the end of applySnapshot
  • and the whole defrag process

would that work?
Please correct me if I am wrong, thank you.
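
A rough sketch of that idea with toy types (hypothetical names, not the actual etcd code; openSnapshotBackend stands in for the real snapshot-recovery path):

package sketch

import "sync"

// toy stand-ins for the real etcd types
type backend struct{}

func (b *backend) Defrag() error { return nil }

// hypothetical helper: rename xxx.snap.db to db and open a backend on it
func openSnapshotBackend() *backend { return &backend{} }

type server struct {
	bemu sync.RWMutex
	be   *backend
}

// applySnapshot holds the write lock from opening the snapshot backend until
// the swap is complete...
func (s *server) applySnapshot() {
	s.bemu.Lock()
	defer s.bemu.Unlock()
	newbe := openSnapshotBackend()
	// ... restore the KV store from newbe, close the old backend ...
	s.be = newbe
}

// ...while defrag holds the same lock for its whole duration, so the two can
// never interleave.
func (s *server) defrag() error {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	return s.be.Defrag()
}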

@ahrtr (Member) commented Aug 27, 2025

  • and the whole defrag process

I think we can add a Defragment method to EtcdServer and acquire the bemu lock; the rough implementation is something like below:

func (s *EtcdServer) Defragment(...) {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	return s.Backend().Defrag()
}
The (ms *maintenanceServer) Defragment should call (s *EtcdServer) Defragment instead of Backend().Defrag().
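
For illustration, the maintenance handler would then look roughly like this (a sketch only, not the final implementation; logging is omitted, and the field ms.s holding a reference back to the EtcdServer is an assumption, since the wiring isn't shown here):

func (ms *maintenanceServer) Defragment(ctx context.Context, sr *pb.DefragmentRequest) (*pb.DefragmentResponse, error) {
	ms.healthNotifier.defragStarted()
	defer ms.healthNotifier.defragFinished()
	// delegate to EtcdServer so bemu is held for the whole defragmentation
	if err := ms.s.Defragment(); err != nil {
		return nil, togRPCError(err)
	}
	return &pb.DefragmentResponse{}, nil
}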

also cc @fuweid

@@ -1029,6 +1029,9 @@ func (s *EtcdServer) applySnapshot(ep *etcdProgress, toApply *toApply) {
 	// wait for raftNode to persist snapshot onto the disk
 	<-toApply.notifyc

+	// protect the backend from concurrent defrag
+	s.bemu.Lock()
A member commented on the diff:
Just a concern about increasing the scope of the critical section. Yes, it's a simple fix, but in case we run into a deadlock, we probably need an e2e test to cover that.

@silentred (Contributor, Author) commented Sep 2, 2025

I am not sure what this E2E test should do. My thought is this test might include:

Any advice? Big thanks.

BTW, can I ask why we should not use SIGSTOP to pause the etcd process?

@silentred (Contributor, Author) commented Sep 1, 2025

Hi @ahrtr, thanks for your detailed explanation in #20585 (comment).
According to your comment, I think you might prefer this PR's solution. I have already modified the code in the direction of this suggestion: #20553 (comment).

I think we still need to align our understanding.

The problem is that during the above period, there are two different backend instances.

Yes, there are two backend instances: newbe and s.be.

but both of which point to the same db file.

I think they have different inodes. s.be points to the old db file; newbe points to the new db file, renamed from xxx.snap.db.

Note that normally the old backend instance still reads from the old db file even after the renaming, until you restart etcdserver or perform a defragment right after the renaming or setting the new backend (s.be = newbe).

I think after s.kv.Restore(newbe), etcdserver starts reading from newbe to serve clients' range requests.

We should block any write to both backends during the above process

Except for defrag, I am not seeing any writes during applySnapshot, because it runs in the apply loop, in which there are no concurrent user writes.

The other PR, #20585, attempts to rename the snap.db file inside the backend, so there is only one backend instance. It locks the whole process of renaming xxx.snap.db to db, preventing concurrent defrag.
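
For illustration, that approach looks roughly like the following toy sketch (hypothetical names such as RecoverFromSnapshot, not the actual #20585 code): the backend itself replaces the db file under the same mutex that Defrag uses, so the rename and a defragmentation can never interleave.

package sketch

import (
	"os"
	"sync"
)

type backend struct {
	mu     sync.Mutex
	dbPath string
	// *bolt.DB, read/batch transactions, etc. omitted
}

func (b *backend) RecoverFromSnapshot(snapPath string) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	// close the current bolt DB, then atomically replace the db file ...
	if err := os.Rename(snapPath, b.dbPath); err != nil {
		return err
	}
	// ... and reopen the bolt DB from b.dbPath, rebuilding the transactions
	return nil
}

func (b *backend) Defrag() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	// defragment into a temp file and rename it over b.dbPath
	return nil
}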

@ahrtr (Member) commented Sep 1, 2025

Hi @ahrtr, thanks for your detailed explanation in #20585 (comment).
According to your comment, I think you might prefer this PR's solution. I have already modified the code in the direction of this suggestion: #20553 (comment).

I didn't realize that you raised two separate PRs; please close #20585 to avoid confusion.

I think they have different inodes. s.be points to the old db file; newbe points to the new db file, renamed from xxx.snap.db.

That's why I said: "Note that normally the old backend instance still reads from the old db file even after the renaming, until you restart etcdserver or perform a defragment right after the renaming or setting the new backend (s.be = newbe)."

Except for defrag, I am not seeing any writes during applySnapshot, because it runs in the apply loop, in which there are no concurrent user writes.

It isn't true. There is also an API-layer write, i.e. the storage version update:

func (m *Monitor) UpdateStorageVersionIfNeeded() {
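
For illustration only, such a background write path looks roughly like this (a toy sketch with a hypothetical versionBackend interface, not etcd's actual monitor code): a goroutine that periodically writes the storage version through the backend, independently of the apply loop, which is exactly the kind of write that can race with applySnapshot unless both sides synchronize on the same lock.

package sketch

import (
	"sync"
	"time"
)

// hypothetical write path into the backend
type versionBackend interface {
	WriteStorageVersion(v string)
}

func runStorageVersionMonitor(bemu *sync.RWMutex, be versionBackend, stop <-chan struct{}) {
	t := time.NewTicker(time.Second)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			bemu.RLock() // must share the lock used by applySnapshot/defrag
			be.WriteStorageVersion("3.6.0") // placeholder value
			bemu.RUnlock()
		}
	}
}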

@silentred (Contributor, Author) commented:

It isn't true. There is also an API-layer write, i.e. the storage version update.

Thanks. I didn't notice.

I will squash this PR's commits when it is ready to merge.

@silentred changed the title from "[NotMerge] simple fix for 20271" to "[fix #20271] protect backend usage from concurrent applySnapshot and defrag" on Sep 2, 2025