Improve watch latency benchmark #17562
Conversation
Force-pushed from ea05b11 to da4db52
Implemented for #16839
On an initial look I think this is looking good, thanks @serathius. Small question about defaults below.
For one scenario, below is a quick comparison at 10,000 between new and old, showing that the existing most basic functionality seems to be behaving OK, in case it saves time for other reviewers.
main (currently)
./benchmark watch-latency --put-total 10000
Summary:
Total: 99.0015 secs.
Slowest: 0.0003 secs.
Fastest: 0.0000 secs.
Average: 0.0000 secs.
Stddev: 0.0000 secs.
Requests/sec: 101.0086
Response time histogram:
0.0000 [1] |
0.0000 [6595] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0001 [2384] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0001 [796] |∎∎∎∎
0.0001 [183] |∎
0.0001 [33] |
0.0002 [2] |
0.0002 [2] |
0.0002 [2] |
0.0003 [0] |
0.0003 [2] |
Latency distribution:
10% in 0.0000 secs.
25% in 0.0000 secs.
50% in 0.0000 secs.
75% in 0.0000 secs.
90% in 0.0001 secs.
95% in 0.0001 secs.
99% in 0.0001 secs.
99.9% in 0.0001 secs.
This pull request
Put summary:
Summary:
Total: 99.0020 secs.
Slowest: 0.0100 secs.
Fastest: 0.0010 secs.
Average: 0.0015 secs.
Stddev: 0.0006 secs.
Requests/sec: 101.0080
Response time histogram:
0.0010 [1] |
0.0019 [9255] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0028 [130] |
0.0037 [465] |∎∎
0.0046 [59] |
0.0055 [75] |
0.0064 [4] |
0.0073 [4] |
0.0082 [2] |
0.0091 [2] |
0.0100 [3] |
Latency distribution:
10% in 0.0012 secs.
25% in 0.0012 secs.
50% in 0.0013 secs.
75% in 0.0014 secs.
90% in 0.0015 secs.
95% in 0.0032 secs.
99% in 0.0045 secs.
99.9% in 0.0066 secs.
Watch events summary:
Summary:
Total: 99.0812 secs.
Slowest: 0.0011 secs.
Fastest: 0.0000 secs.
Average: 0.0002 secs.
Stddev: 0.0001 secs.
Requests/sec: 10092.7286
Response time histogram:
0.0000 [1862] |
0.0001 [95833] |∎∎∎∎
0.0002 [792736] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0003 [98306] |∎∎∎∎
0.0005 [3380] |
0.0006 [1745] |
0.0007 [3836] |
0.0008 [1788] |
0.0009 [307] |
0.0010 [161] |
0.0011 [46] |
Latency distribution:
10% in 0.0001 secs.
25% in 0.0001 secs.
50% in 0.0002 secs.
75% in 0.0002 secs.
90% in 0.0002 secs.
95% in 0.0003 secs.
99% in 0.0004 secs.
99.9% in 0.0007 secs.
watchLatencyCmd.Flags().IntVar(&watchLWatchPerStream, "watch-per-stream", 10, "Total watchers per stream")
watchLatencyCmd.Flags().BoolVar(&watchLPrevKV, "prevkv", false, "PrevKV enabled on watch requests")

watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
Should we be trying to align the defaults across the board at 10,000?
grep -Ri "1000" | grep -i intvar
cmd/put.go: putCmd.Flags().IntVar(&putTotal, "total", 10000, "Total number of put requests")
cmd/stm.go: stmCmd.Flags().IntVar(&stmTotal, "total", 10000, "Total number of completed STM transactions")
cmd/txn_put.go: txnPutCmd.Flags().IntVar(&txnPutTotal, "total", 10000, "Total number of txn requests")
cmd/watch.go: watchCmd.Flags().IntVar(&watchPutTotal, "put-total", 1000, "Number of put requests")
cmd/range.go: rangeCmd.Flags().IntVar(&rangeTotal, "total", 10000, "Total number of range requests")
cmd/watch_get.go: watchGetCmd.Flags().IntVar(&watchGetTotalWatchers, "watchers", 10000, "Total number of watchers")
cmd/lease.go: leaseKeepaliveCmd.Flags().IntVar(&leaseKeepaliveTotal, "total", 10000, "Total number of lease keepalive requests")
cmd/watch_latency.go: watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
cmd/mvcc.go: mvccCmd.PersistentFlags().IntVar(&batchLimit, "batch-limit", 10000, "A limit of batched transaction")
cmd/txn_mixed.go: mixedTxnCmd.Flags().IntVar(&mixedTxnTotal, "total", 10000, "Total number of txn requests")
The benchmark code is not very well organized. I planned to make just the direct improvement that we require for testing prevKV, and leave cleanups for later.
func setupWatchChannels(key string) []clientv3.WatchChan {
	clients := mustCreateClients(totalClients, totalConns)

	streams := make([]clientv3.Watcher, watchLStreams)
It seems that it doesn't really create watchLStreams gRPC streams. Instead it just creates totalConns gRPC streams?
By streams here I mean watch streams, aka the Watch RPC. I based the code on tools/benchmark/cmd/watch.go, where I discovered that you can separate watch requests into different streams by creating a new watcher. This reuses the same gRPC connection, because you pass it the same etcd client, but creates a separate stream, as streams are stored on the watcher. It seems like a better way to separate streams than the approach I proposed in Kubernetes via gRPC context metadata.
I discovered that you can separate watch requests into different streams by creating a new watcher.
Yes, it's true.
Each gRPC connection may be reused by multiple gRPC streams, and each gRPC stream can be reused by multiple etcd watchers.
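For illustration, a minimal sketch (not code from this PR; the endpoint and key are placeholders): each clientv3.NewWatcher created from the same client reuses that client's gRPC connection but maintains its own Watch RPC stream, and all watches opened on one watcher share that stream.

package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// One client == one gRPC connection (single placeholder endpoint).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Each NewWatcher reuses cli's connection but opens a separate watch stream.
	streams := make([]clientv3.Watcher, 3)
	for i := range streams {
		streams[i] = clientv3.NewWatcher(cli)
	}

	for i, w := range streams {
		ch := w.Watch(context.Background(), "foo", clientv3.WithPrevKV())
		go func(i int, ch clientv3.WatchChan) {
			for resp := range ch {
				fmt.Printf("stream %d: %d events\n", i, len(resp.Events))
			}
		}(i, ch)
	}
	time.Sleep(time.Minute) // keep the process alive long enough to observe events
}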
wch := wch
i := i
eventTimes[i] = make([]time.Time, watchLPutTotal)
wg.Add(1)
Where is the wg defined?
He he, didn't notice that; it's a global wg shared by all benchmarks. I would prefer to leave it as it is, to at least be consistent with the other benchmarks.
* Support prevKV
* Support multiple watchers per stream
* Allow continuous puts without waiting for event

Signed-off-by: Marek Siarkowicz <[email protected]>
Force-pushed from da4db52 to 7a84cbb
/retest
ping @ahrtr
start := putTimes[j]
end := eventTimes[i][j]
if end.Before(start) {
	start = end
}
watchReport.Results() <- report.Result{Start: start, End: end}
The logic is too fragile. The startTime is recorded in the main goroutine when putting the key, while the endTime is recorded in the watch goroutines, and you correlate the startTime with the endTime using an index.
I suggest using WithPrefix when watching, and putting a different key each time, e.g. "foo1", "foo2", etc. Then we use the exact key to correlate the startTime with the endTime.
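For reference, a rough sketch of that suggestion (not the PR's code; the function, its parameters, and the key prefix are made up for illustration, and it assumes the report package used by the benchmark tool, go.etcd.io/etcd/pkg/v3/report): watch the "foo" prefix, put unique keys "foo0", "foo1", ..., and correlate put time with event time by the key itself rather than by index.

import (
	"context"
	"fmt"
	"os"
	"sync"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/pkg/v3/report"
)

// correlateByKey sketches key-based correlation of put and watch-event times.
func correlateByKey(cli *clientv3.Client, results chan<- report.Result, total int) {
	var (
		mu       sync.Mutex
		putTimes = make(map[string]time.Time)
	)

	// A single watch on the whole prefix; each event carries the exact key.
	wch := cli.Watch(context.Background(), "foo", clientv3.WithPrefix())
	go func() {
		for resp := range wch {
			end := time.Now()
			for _, ev := range resp.Events {
				mu.Lock()
				start, ok := putTimes[string(ev.Kv.Key)]
				mu.Unlock()
				if ok {
					results <- report.Result{Start: start, End: end}
				}
			}
		}
	}()

	// Every put uses a unique key, so no positional (+1/-1) bookkeeping is needed.
	for i := 0; i < total; i++ {
		key := fmt.Sprintf("foo%d", i)
		mu.Lock()
		putTimes[key] = time.Now()
		mu.Unlock()
		if _, err := cli.Put(context.TODO(), key, "value"); err != nil {
			fmt.Fprintf(os.Stderr, "put %s failed: %v\n", key, err)
		}
	}
}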
Fragile? Puts are done sequentially and etcd guarantees that responses are sequential to writes. Not sure I would call it fragile, or at least it comes down to etcd's correctness; however, I don't think we should care about this in benchmarks.
The "fragile" isn't about the etcd's correctness, it's about the readability, it's about the code itself. There may be network jitter, the put may fail. Currently it looks OK because the tool just exits directly when put fails. I think we shouldn't exit in such case, we are running benchmark test, why do we care about some occasional failures due to env problems (e.g. network jitter)? Some time later, other contributors may even update the implementation not exiting in such case.
Also depending on index (correlate the times by an index) reduces the readability, also increase review burden. We need super carefulness to ensure there is no +1/-1 problem.
Anyway we should make the implementation more robustness & less error prone by itself.
etcd/tools/benchmark/cmd/watch_latency.go
Lines 96 to 99 in 8292553

if _, err := putClient.Put(context.TODO(), key, value); err != nil {
	fmt.Fprintf(os.Stderr, "Failed to Put for watch latency benchmark: %v\n", err)
	os.Exit(1)
}
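Building on the point above about not exiting when a put fails, one possible direction (a sketch only; putReport is a hypothetical variable name, and it assumes the Err field of report.Result from etcd's report package) would be to record the error and let the run continue:

if _, err := putClient.Put(context.TODO(), key, value); err != nil {
	// Sketch: record the failure in the report instead of terminating the
	// whole benchmark run (this sits inside the put loop).
	putReport.Results() <- report.Result{Err: err}
	continue
}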
I won't insist on this for now, because it works for now, as mentioned above: "Currently it looks OK because the tool just exits directly when put fails."
Please let me know if you want to keep it as it is, or plan to change it based on what is proposed above.
Don't plan to change it for now. I think it could be improved, however I'm not as concerned with making benchmark code super readable, as it's not often changed nor critical for etcd functionality (at least until we start to really depend on automatic performance regression detection by @jmhbnz). I think we could just merge it to make sure the improvements are available for everyone to use.
@ahrtr do you see any blockers?
cc @ahrtr @jmhbnz @chaochn47