hwdef
Contributor

@hwdef hwdef commented Aug 22, 2025

fix: #18667

The code comes from #18667 (comment)

Performance Testing:

txn-mixed

Original

Details:
Summary:
  Total:        7.8784 secs.
  Slowest:      0.0074 secs.
  Fastest:      0.0001 secs.
  Average:      0.0002 secs.
  Stddev:       0.0001 secs.
  Requests/sec: 630.5875

Response time histogram:
  0.0001 [1]    |
  0.0008 [4958] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0015 [1]    |
  0.0023 [3]    |
  0.0030 [2]    |
  0.0037 [2]    |
  0.0045 [0]    |
  0.0052 [0]    |
  0.0059 [0]    |
  0.0066 [0]    |
  0.0074 [1]    |

Latency distribution:
  10% in 0.0001 secs.
  25% in 0.0001 secs.
  50% in 0.0002 secs.
  75% in 0.0002 secs.
  90% in 0.0002 secs.
  95% in 0.0002 secs.
  99% in 0.0003 secs.
  99.9% in 0.0030 secs.

Total Write Ops: 5032
Details:
Summary:
  Total:        7.8783 secs.
  Slowest:      0.0077 secs.
  Fastest:      0.0010 secs.
  Average:      0.0014 secs.
  Stddev:       0.0005 secs.
  Requests/sec: 638.7148

Response time histogram:
  0.0010 [1]    |
  0.0016 [3920] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0023 [1029] |∎∎∎∎∎∎∎∎∎∎
  0.0030 [9]    |
  0.0037 [27]   |
  0.0043 [25]   |
  0.0050 [13]   |
  0.0057 [1]    |
  0.0064 [0]    |
  0.0070 [2]    |
  0.0077 [5]    |

Latency distribution:
  10% in 0.0010 secs.
  25% in 0.0011 secs.
  50% in 0.0013 secs.
  75% in 0.0016 secs.
  90% in 0.0018 secs.
  95% in 0.0019 secs.
  99% in 0.0035 secs.
  99.9% in 0.0071 secs.

Modified

Details:
Summary:
  Total:        8.2452 secs.
  Slowest:      0.0022 secs.
  Fastest:      0.0001 secs.
  Average:      0.0002 secs.
  Stddev:       0.0001 secs.
  Requests/sec: 611.3825

Response time histogram:
  0.0001 [1]    |
  0.0003 [4971] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0005 [52]   |
  0.0007 [9]    |
  0.0009 [0]    |
  0.0011 [0]    |
  0.0014 [0]    |
  0.0016 [0]    |
  0.0018 [0]    |
  0.0020 [2]    |
  0.0022 [6]    |

Latency distribution:
  10% in 0.0001 secs.
  25% in 0.0002 secs.
  50% in 0.0002 secs.
  75% in 0.0002 secs.
  90% in 0.0002 secs.
  95% in 0.0002 secs.
  99% in 0.0004 secs.
  99.9% in 0.0020 secs.

Total Write Ops: 4959
Details:
Summary:
  Total:        8.2452 secs.
  Slowest:      0.0231 secs.
  Fastest:      0.0010 secs.
  Average:      0.0015 secs.
  Stddev:       0.0006 secs.
  Requests/sec: 601.4372

Response time histogram:
  0.0010 [1]    |
  0.0032 [4909] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0054 [40]   |
  0.0076 [6]    |
  0.0098 [1]    |
  0.0120 [1]    |
  0.0143 [0]    |
  0.0165 [0]    |
  0.0187 [0]    |
  0.0209 [0]    |
  0.0231 [1]    |

Latency distribution:
  10% in 0.0011 secs.
  25% in 0.0011 secs.
  50% in 0.0013 secs.
  75% in 0.0018 secs.
  90% in 0.0019 secs.
  95% in 0.0019 secs.
  99% in 0.0032 secs.
  99.9% in 0.0072 secs.

txn-put

Original

Summary:
  Total:        15.8790 secs.
  Slowest:      0.0150 secs.
  Fastest:      0.0010 secs.
  Average:      0.0016 secs.
  Stddev:       0.0005 secs.
  Requests/sec: 629.7614

Response time histogram:
  0.0010 [1]    |
  0.0024 [9813] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0038 [48]   |
  0.0052 [114]  |
  0.0066 [3]    |
  0.0080 [15]   |
  0.0094 [4]    |
  0.0108 [1]    |
  0.0122 [0]    |
  0.0136 [0]    |
  0.0150 [1]    |

Latency distribution:
  10% in 0.0011 secs.
  25% in 0.0015 secs.
  50% in 0.0016 secs.
  75% in 0.0016 secs.
  90% in 0.0018 secs.
  95% in 0.0018 secs.
  99% in 0.0040 secs.
  99.9% in 0.0072 secs.

Modified

Summary:
  Total:        15.7357 secs.
  Slowest:      0.0189 secs.
  Fastest:      0.0010 secs.
  Average:      0.0016 secs.
  Stddev:       0.0005 secs.
  Requests/sec: 635.4970

Response time histogram:
  0.0010 [1]    |
  0.0028 [9816] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0046 [159]  |
  0.0064 [6]    |
  0.0082 [15]   |
  0.0100 [1]    |
  0.0118 [1]    |
  0.0136 [0]    |
  0.0154 [0]    |
  0.0171 [0]    |
  0.0189 [1]    |

Latency distribution:
  10% in 0.0011 secs.
  25% in 0.0015 secs.
  50% in 0.0016 secs.
  75% in 0.0017 secs.
  90% in 0.0018 secs.
  95% in 0.0019 secs.
  99% in 0.0033 secs.
  99.9% in 0.0071 secs.

After the modification, there are some performance differences. I think this is just noise, as the results varied across multiple test runs.
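
One way to put a number on "just noise" is to compute the relative change in Requests/sec between the two builds. The sketch below does that for the figures reported above. The helper name `percentChange` and the row labels are illustrative, and a single run per build cannot by itself separate a real regression from run-to-run variance.

```go
package main

import "fmt"

// percentChange returns the relative change from base to head, in percent.
// A negative value means head has lower throughput than base.
func percentChange(base, head float64) float64 {
	return (head - base) / base * 100
}

func main() {
	// Requests/sec figures copied from the benchmark output above.
	runs := []struct {
		name       string
		base, head float64
	}{
		{"txn-mixed (first block)", 630.5875, 611.3825},
		{"txn-mixed (write ops)", 638.7148, 601.4372},
		{"txn-put", 629.7614, 635.4970},
	}
	for _, r := range runs {
		fmt.Printf("%-24s %+.2f%%\n", r.name, percentChange(r.base, r.head))
	}
}
```

Repeating each benchmark several times and comparing the spread within a build against the gap between builds would make the "noise" reading firmer.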

cc @ahrtr @siyuanfoundation

…he same validation for TXN

Signed-off-by: hwdef <[email protected]>
Co-authored-by: Benjamin Wang <[email protected]>
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hwdef
Once this PR has been reviewed and has the lgtm label, please assign ahrtr for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


codecov bot commented Aug 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.09%. Comparing base (929e947) to head (6a93200).
⚠️ Report is 15 commits behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
server/etcdserver/server.go 82.99% <ø> (-0.24%) ⬇️

... and 23 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #20545      +/-   ##
==========================================
- Coverage   69.11%   69.09%   -0.02%     
==========================================
  Files         420      420              
  Lines       34776    34763      -13     
==========================================
- Hits        24034    24021      -13     
- Misses       9342     9345       +3     
+ Partials     1400     1397       -3     

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@serathius
Member

I don't think your test results match my expectations. What's the cluster size in your setup? What command did you run to execute the benchmarks? Is your benchmark tool connected to all members, or to just one, as is the default?

@hwdef
Contributor Author

hwdef commented Aug 23, 2025

I don't think your test results match my expectations. What's the cluster size in your setup? What command did you run to execute the benchmarks? Is your benchmark tool connected to all members, or to just one, as is the default?

I followed this document:

go install -v ./tools/benchmark
make build
goreman start 
benchmark txn-mixed
stop etcd server
rm -rf infra*
goreman start 
benchmark put
stop etcd server
rm -rf infra*
modify code
make build
goreman start 
benchmark txn-mixed
stop etcd server
rm -rf infra*
goreman start 
benchmark put

Are my steps correct?

@serathius
Member

serathius commented Aug 23, 2025

Thanks for the commands, that makes it clear. While goreman start runs a 3-member cluster, the benchmark command connects to only one member. The change in your PR affects how transactions are processed by all members except the one that accepted the connection. To notice any difference and properly measure the impact of the change, we need to make sure the benchmark connects to all members.

This can be done by adding the flags --endpoints 127.0.0.1:2379,127.0.0.1:22379,127.0.0.1:32379 --conns 3 --clients 3 to all benchmark commands.

EDIT: added --conns 3 --clients 3

@serathius
Member

serathius commented Aug 23, 2025

I also noticed strange behavior in the benchmark tool. Just specifying endpoints for all members does not change the fact that the benchmark connects only to the first member. You also need to specify --conns 3 --clients 3 to force the creation of 3 connections and clients, so that the benchmark connects to each member instead of just picking the first endpoint.

This is very unintuitive and should be treated as a bug. cc @ahrtr @fuweid @kishen-v
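
The behavior described here can be pictured with a simplified model. It is an assumption for illustration, not the benchmark tool's actual code: each connection is handed exactly one endpoint, chosen round-robin from the `--endpoints` list. With a single connection only the first endpoint is dialed; with three connections each member gets one. The function names below are hypothetical.

```go
package main

import "fmt"

// endpointsPerConn models a tool that gives each connection a single
// endpoint, picked round-robin from the configured list (assumed behavior).
func endpointsPerConn(endpoints []string, conns int) []string {
	if len(endpoints) == 0 {
		return nil
	}
	picked := make([]string, 0, conns)
	for i := 0; i < conns; i++ {
		picked = append(picked, endpoints[i%len(endpoints)])
	}
	return picked
}

// distinctMembers counts how many different endpoints are actually dialed.
func distinctMembers(picked []string) int {
	seen := make(map[string]struct{}, len(picked))
	for _, ep := range picked {
		seen[ep] = struct{}{}
	}
	return len(seen)
}

func main() {
	eps := []string{"127.0.0.1:2379", "127.0.0.1:22379", "127.0.0.1:32379"}
	for _, conns := range []int{1, 3} {
		picked := endpointsPerConn(eps, conns)
		fmt.Printf("conns=%d dials %v (%d distinct members)\n",
			conns, picked, distinctMembers(picked))
	}
}
```

Under this model, listing all three endpoints changes nothing until the connection count is raised as well, which is why the flags have to be combined.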

@hwdef
Contributor Author

hwdef commented Aug 23, 2025

@serathius
Hi, here is the new benchmark result.

txn-mixed

Original

$ benchmark --endpoints http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379 --conns 3 --clients 3 txn-mixed
bench with linearizable range
2025/08/23 21:49:55 INFO: [core] original dial target is: "etcd-endpoints://0xc000146600/127.0.0.1:22379"
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel created
2025/08/23 21:49:55 INFO: [core] [Channel #3]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc000146600", Path:"/127.0.0.1:22379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel authority set to "127.0.0.1:22379"
2025/08/23 21:49:55 INFO: [core] [Channel #3]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:22379",
          "ServerName": "127.0.0.1:22379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:49:55 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel switches to new LB policy "round_robin"
2025/08/23 21:49:55 INFO: [core] [Channel #1 SubChannel #2]Subchannel picks a new address "127.0.0.1:2379" to connect
2025/08/23 21:49:55 INFO: [roundrobin] [0xc00024e510] Created
2025/08/23 21:49:55 INFO: [core] [Channel #3 SubChannel #4]Subchannel created
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel Connectivity change to CONNECTING
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel exiting idle mode
2025/08/23 21:49:55 INFO: [core] original dial target is: "etcd-endpoints://0xc000146800/127.0.0.1:32379"
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel created
2025/08/23 21:49:55 INFO: [core] [Channel #5]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc000146800", Path:"/127.0.0.1:32379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel authority set to "127.0.0.1:32379"
2025/08/23 21:49:55 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to CONNECTING
2025/08/23 21:49:55 INFO: [core] [Channel #3 SubChannel #4]Subchannel picks a new address "127.0.0.1:22379" to connect
2025/08/23 21:49:55 INFO: [core] [Channel #5]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:32379",
          "ServerName": "127.0.0.1:32379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel switches to new LB policy "round_robin"
2025/08/23 21:49:55 INFO: [roundrobin] [0xc00024e1b0] Created
2025/08/23 21:49:55 INFO: [core] [Channel #5 SubChannel #8]Subchannel created
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel Connectivity change to CONNECTING
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel exiting idle mode
2025/08/23 21:49:55 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to CONNECTING
2025/08/23 21:49:55 INFO: [core] [Channel #5 SubChannel #8]Subchannel picks a new address "127.0.0.1:32379" to connect
2025/08/23 21:49:55 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to READY
2025/08/23 21:49:55 INFO: [core] [Channel #3]Channel Connectivity change to READY
2025/08/23 21:49:55 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to READY
2025/08/23 21:49:55 INFO: [core] [Channel #1]Channel Connectivity change to READY
2025/08/23 21:49:55 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to READY
2025/08/23 21:49:55 INFO: [core] [Channel #5]Channel Connectivity change to READY
10000 / 10000 [-------------------------------------------------------------------------------------------------------------------------------------------------------------------------] 100.00% 1727 p/s
Total Read Ops: 4959
Details:
Summary:
  Total:        5.9889 secs.
  Slowest:      0.0279 secs.
  Fastest:      0.0001 secs.
  Average:      0.0011 secs.
  Stddev:       0.0011 secs.
  Requests/sec: 828.0251

Response time histogram:
  0.0001 [1]    |
  0.0029 [4577] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0057 [367]  |∎∎∎
  0.0084 [12]   |
  0.0112 [1]    |
  0.0140 [0]    |
  0.0168 [0]    |
  0.0196 [0]    |
  0.0224 [0]    |
  0.0251 [0]    |
  0.0279 [1]    |

Latency distribution:
  10% in 0.0002 secs.
  25% in 0.0002 secs.
  50% in 0.0010 secs.
  75% in 0.0014 secs.
  90% in 0.0026 secs.
  95% in 0.0035 secs.
  99% in 0.0048 secs.
  99.9% in 0.0080 secs.

Total Write Ops: 5041
Details:
Summary:
  Total:        5.9890 secs.
  Slowest:      0.0279 secs.
  Fastest:      0.0008 secs.
  Average:      0.0024 secs.
  Stddev:       0.0013 secs.
  Requests/sec: 841.7159

Response time histogram:
  0.0008 [1]    |
  0.0035 [4223] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0062 [780]  |∎∎∎∎∎∎∎
  0.0089 [23]   |
  0.0117 [8]    |
  0.0144 [2]    |
  0.0171 [2]    |
  0.0198 [0]    |
  0.0225 [0]    |
  0.0252 [0]    |
  0.0279 [2]    |

Latency distribution:
  10% in 0.0011 secs.
  25% in 0.0015 secs.
  50% in 0.0023 secs.
  75% in 0.0028 secs.
  90% in 0.0039 secs.
  95% in 0.0048 secs.
  99% in 0.0061 secs.
  99.9% in 0.0143 secs.

Modified

$ benchmark --endpoints http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379 --conns 3 --clients 3 txn-mixed
bench with linearizable range
2025/08/23 21:47:04 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
2025/08/23 21:47:04 INFO: [core] [Channel #1 SubChannel #2]Subchannel picks a new address "127.0.0.1:2379" to connect
2025/08/23 21:47:04 INFO: [core] original dial target is: "etcd-endpoints://0xc000122800/127.0.0.1:22379"
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel created
2025/08/23 21:47:04 INFO: [core] [Channel #3]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc000122800", Path:"/127.0.0.1:22379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel authority set to "127.0.0.1:22379"
2025/08/23 21:47:04 INFO: [core] [Channel #3]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:22379",
          "ServerName": "127.0.0.1:22379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel switches to new LB policy "round_robin"
2025/08/23 21:47:04 INFO: [roundrobin] [0xc0001e24e0] Created
2025/08/23 21:47:04 INFO: [core] [Channel #3 SubChannel #4]Subchannel created
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel Connectivity change to CONNECTING
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel exiting idle mode
2025/08/23 21:47:04 INFO: [core] original dial target is: "etcd-endpoints://0xc000122a00/127.0.0.1:32379"
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel created
2025/08/23 21:47:04 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to CONNECTING
2025/08/23 21:47:04 INFO: [core] [Channel #5]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc000122a00", Path:"/127.0.0.1:32379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel authority set to "127.0.0.1:32379"
2025/08/23 21:47:04 INFO: [core] [Channel #3 SubChannel #4]Subchannel picks a new address "127.0.0.1:22379" to connect
2025/08/23 21:47:04 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to READY
2025/08/23 21:47:04 INFO: [core] [Channel #5]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:32379",
          "ServerName": "127.0.0.1:32379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:47:04 INFO: [core] [Channel #1]Channel Connectivity change to READY
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel switches to new LB policy "round_robin"
2025/08/23 21:47:04 INFO: [roundrobin] [0xc000623b00] Created
2025/08/23 21:47:04 INFO: [core] [Channel #5 SubChannel #8]Subchannel created
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel Connectivity change to CONNECTING
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel exiting idle mode
2025/08/23 21:47:04 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to READY
2025/08/23 21:47:04 INFO: [core] [Channel #3]Channel Connectivity change to READY
2025/08/23 21:47:04 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to CONNECTING
2025/08/23 21:47:04 INFO: [core] [Channel #5 SubChannel #8]Subchannel picks a new address "127.0.0.1:32379" to connect
2025/08/23 21:47:04 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to READY
2025/08/23 21:47:04 INFO: [core] [Channel #5]Channel Connectivity change to READY
10000 / 10000 [-------------------------------------------------------------------------------------------------------------------------------------------------------------------------] 100.00% 1729 p/s
Total Read Ops: 5010
Details:
Summary:
  Total:        5.9850 secs.
  Slowest:      0.0280 secs.
  Fastest:      0.0001 secs.
  Average:      0.0011 secs.
  Stddev:       0.0012 secs.
  Requests/sec: 837.0906

Response time histogram:
  0.0001 [1]    |
  0.0029 [4594] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0057 [400]  |∎∎∎
  0.0085 [11]   |
  0.0113 [1]    |
  0.0140 [1]    |
  0.0168 [0]    |
  0.0196 [0]    |
  0.0224 [0]    |
  0.0252 [1]    |
  0.0280 [1]    |

Latency distribution:
  10% in 0.0002 secs.
  25% in 0.0002 secs.
  50% in 0.0009 secs.
  75% in 0.0014 secs.
  90% in 0.0027 secs.
  95% in 0.0036 secs.
  99% in 0.0048 secs.
  99.9% in 0.0080 secs.

Total Write Ops: 4990
Details:
Summary:
  Total:        5.9850 secs.
  Slowest:      0.0281 secs.
  Fastest:      0.0008 secs.
  Average:      0.0024 secs.
  Stddev:       0.0014 secs.
  Requests/sec: 833.7483

Response time histogram:
  0.0008 [1]    |
  0.0035 [4205] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0063 [740]  |∎∎∎∎∎∎∎
  0.0090 [37]   |
  0.0117 [3]    |
  0.0144 [0]    |
  0.0172 [0]    |
  0.0199 [0]    |
  0.0226 [0]    |
  0.0253 [2]    |
  0.0281 [2]    |

Latency distribution:
  10% in 0.0011 secs.
  25% in 0.0015 secs.
  50% in 0.0023 secs.
  75% in 0.0028 secs.
  90% in 0.0039 secs.
  95% in 0.0048 secs.
  99% in 0.0062 secs.
  99.9% in 0.0234 secs.

txn-put

Original

$ benchmark --endpoints http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379 --conns 3 --clients 3 txn-put  
2025/08/23 21:50:47 INFO: [core] original dial target is: "etcd-endpoints://0xc0001a2a00/127.0.0.1:22379"
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel created
2025/08/23 21:50:47 INFO: [core] [Channel #3]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc0001a2a00", Path:"/127.0.0.1:22379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel authority set to "127.0.0.1:22379"
2025/08/23 21:50:47 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
2025/08/23 21:50:47 INFO: [core] [Channel #1 SubChannel #2]Subchannel picks a new address "127.0.0.1:2379" to connect
2025/08/23 21:50:47 INFO: [core] [Channel #3]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:22379",
          "ServerName": "127.0.0.1:22379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel switches to new LB policy "round_robin"
2025/08/23 21:50:47 INFO: [roundrobin] [0xc0001f6090] Created
2025/08/23 21:50:47 INFO: [core] [Channel #3 SubChannel #4]Subchannel created
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel Connectivity change to CONNECTING
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel exiting idle mode
2025/08/23 21:50:47 INFO: [core] original dial target is: "etcd-endpoints://0xc0001a2c00/127.0.0.1:32379"
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel created
2025/08/23 21:50:47 INFO: [core] [Channel #5]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc0001a2c00", Path:"/127.0.0.1:32379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel authority set to "127.0.0.1:32379"
2025/08/23 21:50:47 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to CONNECTING
2025/08/23 21:50:47 INFO: [core] [Channel #3 SubChannel #4]Subchannel picks a new address "127.0.0.1:22379" to connect
2025/08/23 21:50:47 INFO: [core] [Channel #5]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:32379",
          "ServerName": "127.0.0.1:32379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:50:47 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to READY
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel switches to new LB policy "round_robin"
2025/08/23 21:50:47 INFO: [core] [Channel #1]Channel Connectivity change to READY
2025/08/23 21:50:47 INFO: [roundrobin] [0xc000245a40] Created
2025/08/23 21:50:47 INFO: [core] [Channel #5 SubChannel #7]Subchannel created
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel Connectivity change to CONNECTING
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel exiting idle mode
2025/08/23 21:50:47 INFO: [core] [Channel #5 SubChannel #7]Subchannel Connectivity change to CONNECTING
2025/08/23 21:50:47 INFO: [core] [Channel #5 SubChannel #7]Subchannel picks a new address "127.0.0.1:32379" to connect
2025/08/23 21:50:47 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to READY
2025/08/23 21:50:47 INFO: [core] [Channel #3]Channel Connectivity change to READY
2025/08/23 21:50:47 INFO: [core] [Channel #5 SubChannel #7]Subchannel Connectivity change to READY
2025/08/23 21:50:47 INFO: [core] [Channel #5]Channel Connectivity change to READY
10000 / 10000 [--------------------------------------------------------------------------------------------------------------------------------------------------------------------------] 100.00% 888 p/s

Summary:
  Total:        11.4617 secs.
  Slowest:      0.0347 secs.
  Fastest:      0.0011 secs.
  Average:      0.0034 secs.
  Stddev:       0.0016 secs.
  Requests/sec: 872.4682

Response time histogram:
  0.0011 [1]    |
  0.0044 [7460] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0078 [2469] |∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0112 [45]   |
  0.0145 [18]   |
  0.0179 [1]    |
  0.0212 [0]    |
  0.0246 [3]    |
  0.0280 [0]    |
  0.0313 [2]    |
  0.0347 [1]    |

Latency distribution:
  10% in 0.0022 secs.
  25% in 0.0024 secs.
  50% in 0.0026 secs.
  75% in 0.0046 secs.
  90% in 0.0060 secs.
  95% in 0.0061 secs.
  99% in 0.0071 secs.
  99.9% in 0.0127 secs.

Modified

$ benchmark --endpoints http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379 --conns 3 --clients 3 txn-put  
2025/08/23 21:45:56 INFO: [core] original dial target is: "etcd-endpoints://0xc0003b6400/127.0.0.1:22379"
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel created
2025/08/23 21:45:56 INFO: [core] [Channel #3]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc0003b6400", Path:"/127.0.0.1:22379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel authority set to "127.0.0.1:22379"
2025/08/23 21:45:56 INFO: [core] [Channel #3]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:22379",
          "ServerName": "127.0.0.1:22379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel switches to new LB policy "round_robin"
2025/08/23 21:45:56 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
2025/08/23 21:45:56 INFO: [roundrobin] [0xc0005385a0] Created
2025/08/23 21:45:56 INFO: [core] [Channel #1 SubChannel #2]Subchannel picks a new address "127.0.0.1:2379" to connect
2025/08/23 21:45:56 INFO: [core] [Channel #3 SubChannel #4]Subchannel created
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel Connectivity change to CONNECTING
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel exiting idle mode
2025/08/23 21:45:56 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to CONNECTING
2025/08/23 21:45:56 INFO: [core] [Channel #3 SubChannel #4]Subchannel picks a new address "127.0.0.1:22379" to connect
2025/08/23 21:45:56 INFO: [core] original dial target is: "etcd-endpoints://0xc0003b6600/127.0.0.1:32379"
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel created
2025/08/23 21:45:56 INFO: [core] [Channel #5]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc0003b6600", Path:"/127.0.0.1:32379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel authority set to "127.0.0.1:32379"
2025/08/23 21:45:56 INFO: [core] [Channel #5]Resolver state updated: {
  "Addresses": null,
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "127.0.0.1:32379",
          "ServerName": "127.0.0.1:32379",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": {
    "Config": {
      "Config": null,
      "Methods": {}
    },
    "Err": null
  },
  "Attributes": null
} (service config updated)
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel switches to new LB policy "round_robin"
2025/08/23 21:45:56 INFO: [roundrobin] [0xc000135b30] Created
2025/08/23 21:45:56 INFO: [core] [Channel #3 SubChannel #4]Subchannel Connectivity change to READY
2025/08/23 21:45:56 INFO: [core] [Channel #5 SubChannel #8]Subchannel created
2025/08/23 21:45:56 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to CONNECTING
2025/08/23 21:45:56 INFO: [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to READY
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel Connectivity change to CONNECTING
2025/08/23 21:45:56 INFO: [core] [Channel #3]Channel Connectivity change to READY
2025/08/23 21:45:56 INFO: [core] [Channel #5 SubChannel #8]Subchannel picks a new address "127.0.0.1:32379" to connect
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel exiting idle mode
2025/08/23 21:45:56 INFO: [core] [Channel #1]Channel Connectivity change to READY
2025/08/23 21:45:56 INFO: [core] [Channel #5 SubChannel #8]Subchannel Connectivity change to READY
2025/08/23 21:45:56 INFO: [core] [Channel #5]Channel Connectivity change to READY
10000 / 10000 [--------------------------------------------------------------------------------------------------------------------------------------------------------------------------] 100.00% 892 p/s

Summary:
  Total:        11.4159 secs.
  Slowest:      0.0164 secs.
  Fastest:      0.0011 secs.
  Average:      0.0034 secs.
  Stddev:       0.0015 secs.
  Requests/sec: 875.9685

Response time histogram:
  0.0011 [1]    |
  0.0026 [4521] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0041 [2689] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0056 [1753] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0072 [938]  |∎∎∎∎∎∎∎∎
  0.0087 [49]   |
  0.0102 [24]   |
  0.0118 [14]   |
  0.0133 [8]    |
  0.0148 [1]    |
  0.0164 [2]    |

Latency distribution:
  10% in 0.0020 secs.
  25% in 0.0024 secs.
  50% in 0.0027 secs.
  75% in 0.0046 secs.
  90% in 0.0059 secs.
  95% in 0.0061 secs.
  99% in 0.0071 secs.
  99.9% in 0.0119 secs.

@hwdef
Contributor Author

hwdef commented Aug 23, 2025

I feel like the results haven't changed much. I don't know why.

@siyuanfoundation siyuanfoundation requested a review from ahrtr August 28, 2025 15:10
@fuweid
Member

fuweid commented Aug 28, 2025

You also need to specify --conns 3 --clients 3 to force the creation of 3 connections and clients, so that the benchmark connects to each member instead of just picking the first endpoint.

Agreed, it's a bug. From my perspective, mustCreateConn should take the whole --endpoints slice as input instead of picking one of them.

@hwdef
Contributor Author

hwdef commented Aug 29, 2025

You also need to specify --conns 3 --clients 3 to force the creation of 3 connections and clients, so that the benchmark connects to each member instead of just picking the first endpoint.

Agreed, it's a bug. From my perspective, mustCreateConn should take the whole --endpoints slice as input instead of picking one of them.

Yes, I'll create a PR for this.

@ahrtr
Member

ahrtr commented Sep 1, 2025

The fix is definitely correct, but my concern is its impact on performance (memory usage), which might be hard to verify or test. Today, when a TXN contains a large range request, only the member the client connects to sees higher memory usage; after this PR, all members will.

When there is a large number of clients/connections, we can load-balance them across multiple members. After this PR, overall memory usage will increase whenever TXNs contain large ranges.

Have you tried the other solution mentioned in #18667 (comment), where all members execute the same validation but only the member the client connects to executes the range request?

The third approach is to document the known issue and leave it as is for now. I have never seen a real use case that triggers the issue; at least Kubernetes isn't affected currently.
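
To make the second option concrete, here is a toy model, not etcd's actual apply path: every member validates the range request's revision, but only the member that received the client request materializes the range result. All types, names, and error values below are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors standing in for revision validation failures.
var (
	errCompacted = errors.New("requested revision has been compacted")
	errFutureRev = errors.New("requested revision is in the future")
)

// member is a toy stand-in for one cluster member's state.
type member struct {
	compactRev, currentRev int64
	data                   map[string]string
}

// validateRange is the cheap check that every member runs identically.
// Revision 0 means "latest" and is always valid in this model.
func (m *member) validateRange(rev int64) error {
	switch {
	case rev == 0:
		return nil
	case rev < m.compactRev:
		return errCompacted
	case rev > m.currentRev:
		return errFutureRev
	}
	return nil
}

// applyTxn validates on every member, executes the range only on the origin
// member, and applies the write only if validation passed.
func (m *member) applyTxn(rangeRev int64, key, val string, origin bool) ([]string, error) {
	if err := m.validateRange(rangeRev); err != nil {
		return nil, err // every member rejects, so state stays consistent
	}
	var result []string
	if origin {
		for k := range m.data { // the costly part runs on one member only
			result = append(result, k)
		}
	}
	m.data[key] = val
	m.currentRev++
	return result, nil
}

func main() {
	origin := &member{compactRev: 5, currentRev: 10, data: map[string]string{"a": "1"}}
	follower := &member{compactRev: 5, currentRev: 10, data: map[string]string{"a": "1"}}

	// A range at a compacted revision is rejected everywhere; no member writes.
	_, e1 := origin.applyTxn(3, "b", "2", true)
	_, e2 := follower.applyTxn(3, "b", "2", false)
	fmt.Println(e1, e2, len(origin.data) == len(follower.data))
}
```

In this model the memory cost of building the range result stays on the origin member, while the validation outcome, and therefore the decision to apply the write, is the same on every member.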

@hwdef
Contributor Author

hwdef commented Sep 2, 2025

@ahrtr

Have you tried the other solution (all members execute the same validation but only the member the client connects to executes the range request) mentioned in #18667 (comment)?

I haven’t tried the second approach yet, but I plan to.

BTW, I think it would be better to first improve the benchmark tests. No matter which solution we end up with, if the benchmark results aren’t convincing, the review process could still be blocked.

@ahrtr
Member

ahrtr commented Sep 2, 2025

BTW, I think it would be better to first improve the benchmark tests

Please raise a separate issue to clarify what the problem is and how to improve it.

Development

Successfully merging this pull request may close these issues.

Specifying a revision for a range request in a transaction may cause data inconsistency
5 participants