Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: fix data races in unit tests #609

Merged
merged 5 commits into from
Oct 22, 2024
Merged

Conversation

minh-bq
Copy link
Contributor

@minh-bq minh-bq commented Oct 21, 2024

  • core: fix data race in TestInsertChainWithSidecars
    We get this data race when running tests
WARNING: DATA RACE
Write at 0x00c0117bbf40 by goroutine 77211:
  github.com/ethereum/go-ethereum/core.TestInsertChainWithSidecars()
      /home/runner/work/ronin/ronin/core/blockchain_test.go:4215 +0x49fd
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x00c0117bbf40 by goroutine 82133:
  github.com/ethereum/go-ethereum/core.TestInsertChainWithSidecars.func6.1()
      /home/runner/work/ronin/ronin/core/blockchain_test.go:4179 +0x5a4
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

The data race happens because 2 goroutines use the same variables. This commit
makes them use their own variables.

  • core: fix data race in cacheConfig.TriesInMemory access
    We get this data race when running tests
WARNING: DATA RACE
Read at 0x000001faa090 by goroutine 23:
  github.com/ethereum/go-ethereum/core.NewBlockChain()
      /home/runner/work/ronin/ronin/core/blockchain.go:247 +0xc4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackendWithGenerator()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:74 +0x4c4
  github.com/ethereum/go-ethereum/eth/protocols/eth.testGetBlockReceipts()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:569 +0x1f5
  github.com/ethereum/go-ethereum/eth/protocols/eth.TestGetBlockReceipts66()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:528 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x000001faa090 by goroutine 19:
  github.com/ethereum/go-ethereum/core.NewBlockChain()
      /home/runner/work/ronin/ronin/core/blockchain.go:248 +0xe4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackendWithGenerator()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:74 +0x4c4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackend()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:61 +0x64
  github.com/ethereum/go-ethereum/eth/protocols/eth.testGetBlockHeaders()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:126 +0x53
  github.com/ethereum/go-ethereum/eth/protocols/eth.TestGetBlockHeaders66()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:121 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

In the tests, the cacheConfig passed to NewBlockChain is nil so the
defaultCacheConfig is used. However, the defaultCacheConfig.TriesInMemory is 0,
so we fall into the path that set the cacheConfig.TriesInMemory to
DefaultTriesInMemory which is 128. This set is made to the global
defaultCacheConfig which causes data race with other accesses. This commit set
the defaultCacheConfig.TriesInMemory to DefaultTriesInMemory.

  • core/vote: hold read lock while getting vote pool stats
    We get multiples data races while running vote pool tests
WARNING: DATA RACE
Read at 0x00c0002a3ce0 by goroutine 19:
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).verifyStructureSizeOfVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:77 +0xaa
  github.com/ethereum/go-ethereum/core/vote.testVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:160 +0xe84
  github.com/ethereum/go-ethereum/core/vote.TestValidVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:85 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x00c0002a3ce0 by goroutine 23:
  runtime.mapassign()
      /opt/hostedtoolcache/go/1.20.10/x64/src/runtime/map.go:578 +0x0
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).putVote()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:236 +0x346
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).putIntoVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:212 +0x978
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).loop()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:123 +0x504
  github.com/ethereum/go-ethereum/core/vote.NewVotePool.func1()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:98 +0x39

Currently, when getting vote pool's stats for checking in unit test we access
these fields directly without holding lock. This commit creates some helper
funtion to get these stats safely with read lock hold.

  • internal/cmdtest: increase the cli command test timeout
    The current timeout is 5s which is not enough when running tests with race
    detector. Increase that timeout to 15s.

  • p2p/discover: use synchronous version of addSeenNode
    We get this data race when running TestTable_BucketIPLimit

Write at 0x00c00029c000 by goroutine 64:
  github.com/ethereum/go-ethereum/p2p/discover.(*Table).addSeenNodeSync()
      /home/runner/work/ronin/ronin/p2p/discover/table.go:570 +0x70a
  github.com/ethereum/go-ethereum/p2p/discover.(*Table).addSeenNode.func1()
      /home/runner/work/ronin/ronin/p2p/discover/table.go:527 +0x47
Previous read at 0x00c00029c000 by goroutine 57:
  github.com/ethereum/go-ethereum/p2p/discover.checkIPLimitInvariant()
      /home/runner/work/ronin/ronin/p2p/discover/table_test.go:187 +0x105
  github.com/ethereum/go-ethereum/p2p/discover.TestTable_BucketIPLimit()
      /home/runner/work/ronin/ronin/p2p/discover/table_test.go:177 +0x2e4
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

This commit changes TestTable_BucketIPLimit to use synchronous version of
addSeenNode to avoid the data race.

We get this data race when running tests

WARNING: DATA RACE
Write at 0x00c0117bbf40 by goroutine 77211:
  github.com/ethereum/go-ethereum/core.TestInsertChainWithSidecars()
      /home/runner/work/ronin/ronin/core/blockchain_test.go:4215 +0x49fd
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x00c0117bbf40 by goroutine 82133:
  github.com/ethereum/go-ethereum/core.TestInsertChainWithSidecars.func6.1()
      /home/runner/work/ronin/ronin/core/blockchain_test.go:4179 +0x5a4
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

The data race happens because 2 goroutines use the same variables. This commit
makes them use their own variables.
We get this data race when running tests

WARNING: DATA RACE
Read at 0x000001faa090 by goroutine 23:
  github.com/ethereum/go-ethereum/core.NewBlockChain()
      /home/runner/work/ronin/ronin/core/blockchain.go:247 +0xc4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackendWithGenerator()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:74 +0x4c4
  github.com/ethereum/go-ethereum/eth/protocols/eth.testGetBlockReceipts()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:569 +0x1f5
  github.com/ethereum/go-ethereum/eth/protocols/eth.TestGetBlockReceipts66()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:528 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x000001faa090 by goroutine 19:
  github.com/ethereum/go-ethereum/core.NewBlockChain()
      /home/runner/work/ronin/ronin/core/blockchain.go:248 +0xe4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackendWithGenerator()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:74 +0x4c4
  github.com/ethereum/go-ethereum/eth/protocols/eth.newTestBackend()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:61 +0x64
  github.com/ethereum/go-ethereum/eth/protocols/eth.testGetBlockHeaders()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:126 +0x53
  github.com/ethereum/go-ethereum/eth/protocols/eth.TestGetBlockHeaders66()
      /home/runner/work/ronin/ronin/eth/protocols/eth/handler_test.go:121 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

In the tests, the cacheConfig passed to NewBlockChain is nil so the
defaultCacheConfig is used. However, the defaultCacheConfig.TriesInMemory is 0,
so we fall into the path that set the cacheConfig.TriesInMemory to
DefaultTriesInMemory which is 128. This set is made to the global
defaultCacheConfig which causes data race with other accesses. This commit set
the defaultCacheConfig.TriesInMemory to DefaultTriesInMemory.
We get multiples data races while running vote pool tests

WARNING: DATA RACE
Read at 0x00c0002a3ce0 by goroutine 19:
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).verifyStructureSizeOfVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:77 +0xaa
  github.com/ethereum/go-ethereum/core/vote.testVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:160 +0xe84
  github.com/ethereum/go-ethereum/core/vote.TestValidVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool_test.go:85 +0x33
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

Previous write at 0x00c0002a3ce0 by goroutine 23:
  runtime.mapassign()
      /opt/hostedtoolcache/go/1.20.10/x64/src/runtime/map.go:578 +0x0
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).putVote()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:236 +0x346
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).putIntoVotePool()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:212 +0x978
  github.com/ethereum/go-ethereum/core/vote.(*VotePool).loop()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:123 +0x504
  github.com/ethereum/go-ethereum/core/vote.NewVotePool.func1()
      /home/runner/work/ronin/ronin/core/vote/vote_pool.go:98 +0x39

Currently, when getting vote pool's stats for checking in unit test we access
these fields directly without holding lock. This commit creates some helper
funtion to get these stats safely with read lock hold.
The current timeout is 5s which is not enough when running tests with race
detector. Increase that timeout to 15s.
We get this data race when running TestTable_BucketIPLimit

Write at 0x00c00029c000 by goroutine 64:
  github.com/ethereum/go-ethereum/p2p/discover.(*Table).addSeenNodeSync()
      /home/runner/work/ronin/ronin/p2p/discover/table.go:570 +0x70a
  github.com/ethereum/go-ethereum/p2p/discover.(*Table).addSeenNode.func1()
      /home/runner/work/ronin/ronin/p2p/discover/table.go:527 +0x47
Previous read at 0x00c00029c000 by goroutine 57:
  github.com/ethereum/go-ethereum/p2p/discover.checkIPLimitInvariant()
      /home/runner/work/ronin/ronin/p2p/discover/table_test.go:187 +0x105
  github.com/ethereum/go-ethereum/p2p/discover.TestTable_BucketIPLimit()
      /home/runner/work/ronin/ronin/p2p/discover/table_test.go:177 +0x2e4
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.10/x64/src/testing/testing.go:1629 +0x47

This commit changes TestTable_BucketIPLimit to use synchronous version of
addSeenNode to avoid the data race.
@minh-bq
Copy link
Contributor Author

minh-bq commented Oct 21, 2024

Pass all tests with race detector enabled: https://github.com/axieinfinity/ronin/actions/runs/11437046076/job/31815765570

@minh-bq minh-bq merged commit 782bf20 into axieinfinity:master Oct 22, 2024
1 check passed
@minh-bq minh-bq deleted the fix/tests branch October 22, 2024 03:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants