Improve watch latency benchmark #17562
Conversation
Force-pushed from ea05b11 to da4db52
Implemented for #16839
On an initial look I think this is looking good, thanks @serathius. Small question about defaults below.
For one scenario, below is a quick comparison at 10,000 between new and old, showing that the existing most basic functionality seems to be behaving OK, in case it saves time for other reviewers.
main (currently)
./benchmark watch-latency --put-total 10000
Summary:
Total: 99.0015 secs.
Slowest: 0.0003 secs.
Fastest: 0.0000 secs.
Average: 0.0000 secs.
Stddev: 0.0000 secs.
Requests/sec: 101.0086
Response time histogram:
0.0000 [1] |
0.0000 [6595] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0001 [2384] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0001 [796] |∎∎∎∎
0.0001 [183] |∎
0.0001 [33] |
0.0002 [2] |
0.0002 [2] |
0.0002 [2] |
0.0003 [0] |
0.0003 [2] |
Latency distribution:
10% in 0.0000 secs.
25% in 0.0000 secs.
50% in 0.0000 secs.
75% in 0.0000 secs.
90% in 0.0001 secs.
95% in 0.0001 secs.
99% in 0.0001 secs.
99.9% in 0.0001 secs.
This pull request
Put summary:
Summary:
Total: 99.0020 secs.
Slowest: 0.0100 secs.
Fastest: 0.0010 secs.
Average: 0.0015 secs.
Stddev: 0.0006 secs.
Requests/sec: 101.0080
Response time histogram:
0.0010 [1] |
0.0019 [9255] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0028 [130] |
0.0037 [465] |∎∎
0.0046 [59] |
0.0055 [75] |
0.0064 [4] |
0.0073 [4] |
0.0082 [2] |
0.0091 [2] |
0.0100 [3] |
Latency distribution:
10% in 0.0012 secs.
25% in 0.0012 secs.
50% in 0.0013 secs.
75% in 0.0014 secs.
90% in 0.0015 secs.
95% in 0.0032 secs.
99% in 0.0045 secs.
99.9% in 0.0066 secs.
Watch events summary:
Summary:
Total: 99.0812 secs.
Slowest: 0.0011 secs.
Fastest: 0.0000 secs.
Average: 0.0002 secs.
Stddev: 0.0001 secs.
Requests/sec: 10092.7286
Response time histogram:
0.0000 [1862] |
0.0001 [95833] |∎∎∎∎
0.0002 [792736] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0003 [98306] |∎∎∎∎
0.0005 [3380] |
0.0006 [1745] |
0.0007 [3836] |
0.0008 [1788] |
0.0009 [307] |
0.0010 [161] |
0.0011 [46] |
Latency distribution:
10% in 0.0001 secs.
25% in 0.0001 secs.
50% in 0.0002 secs.
75% in 0.0002 secs.
90% in 0.0002 secs.
95% in 0.0003 secs.
99% in 0.0004 secs.
99.9% in 0.0007 secs.
watchLatencyCmd.Flags().IntVar(&watchLWatchPerStream, "watch-per-stream", 10, "Total watchers per stream")
watchLatencyCmd.Flags().BoolVar(&watchLPrevKV, "prevkv", false, "PrevKV enabled on watch requests")

watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
Should we be trying to align the defaults across the board at 10,000?
grep -Ri "1000" | grep -i intvar
cmd/put.go: putCmd.Flags().IntVar(&putTotal, "total", 10000, "Total number of put requests")
cmd/stm.go: stmCmd.Flags().IntVar(&stmTotal, "total", 10000, "Total number of completed STM transactions")
cmd/txn_put.go: txnPutCmd.Flags().IntVar(&txnPutTotal, "total", 10000, "Total number of txn requests")
cmd/watch.go: watchCmd.Flags().IntVar(&watchPutTotal, "put-total", 1000, "Number of put requests")
cmd/range.go: rangeCmd.Flags().IntVar(&rangeTotal, "total", 10000, "Total number of range requests")
cmd/watch_get.go: watchGetCmd.Flags().IntVar(&watchGetTotalWatchers, "watchers", 10000, "Total number of watchers")
cmd/lease.go: leaseKeepaliveCmd.Flags().IntVar(&leaseKeepaliveTotal, "total", 10000, "Total number of lease keepalive requests")
cmd/watch_latency.go: watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
cmd/mvcc.go: mvccCmd.PersistentFlags().IntVar(&batchLimit, "batch-limit", 10000, "A limit of batched transaction")
cmd/txn_mixed.go: mixedTxnCmd.Flags().IntVar(&mixedTxnTotal, "total", 10000, "Total number of txn requests")
The benchmark code is not very well organized. I planned to make just the direct improvement that we require for testing prevKV, and leave cleanups for later.
func setupWatchChannels(key string) []clientv3.WatchChan {
	clients := mustCreateClients(totalClients, totalConns)

	streams := make([]clientv3.Watcher, watchLStreams)
It seems that it doesn't really create watchLStreams gRPC streams. Instead it just creates totalConns gRPC streams?
By streams here I mean watch streams, aka the Watch RPC. I based the code on tools/benchmark/cmd/watch.go, where I discovered that you can separate watch requests into different streams by creating a new watcher. This reuses the same gRPC connection, because you pass it the same etcd client, but creates a separate stream, as streams are stored on the watcher. It seems like a better way to separate streams than the approach I proposed in Kubernetes via gRPC context metadata.
I discovered that you can separate watch requests into different streams by creating a new watcher.
Yes, it's true.
Each gRPC connection may be reused by multiple gRPC streams, and each gRPC stream can be reused by multiple etcd watchers.
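For illustration, a minimal sketch (not code from this PR; the endpoint and key are placeholders): each clientv3.NewWatcher created from the same client reuses that client's gRPC connection but maintains its own Watch RPC stream, and all watches opened on one watcher share that stream.

package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// One client == one gRPC connection (single placeholder endpoint).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Each NewWatcher reuses cli's connection but opens a separate watch stream.
	streams := make([]clientv3.Watcher, 3)
	for i := range streams {
		streams[i] = clientv3.NewWatcher(cli)
	}

	for i, w := range streams {
		ch := w.Watch(context.Background(), "foo", clientv3.WithPrevKV())
		go func(i int, ch clientv3.WatchChan) {
			for resp := range ch {
				fmt.Printf("stream %d: %d events\n", i, len(resp.Events))
			}
		}(i, ch)
	}
	time.Sleep(time.Minute) // keep the process alive long enough to observe events
}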
wch := wch
i := i
eventTimes[i] = make([]time.Time, watchLPutTotal)
wg.Add(1)
Where is the wg defined?
He he, didn't notice that; it's a global wg shared by all benchmarks. I would prefer to leave it as it is, to at least be consistent with the other benchmarks.
* Support prevKV
* Support multiple watchers per stream
* Allow continuous puts without waiting for event

Signed-off-by: Marek Siarkowicz <[email protected]>
Force-pushed from da4db52 to 7a84cbb
/retest
ping @ahrtr
start := putTimes[j]
end := eventTimes[i][j]
if end.Before(start) {
	start = end
}
watchReport.Results() <- report.Result{Start: start, End: end}
The logic is too fragile. The startTime is recorded in the main goroutine when putting the key, while the endTime is recorded in the watch goroutines, and you correlate the startTime with the endTime using an index.
I suggest using WithPrefix when watching, and putting a different key each time, e.g. "foo1", "foo2", etc. Then we use the exact key to correlate the startTime with the endTime.
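For reference, a rough sketch of that suggestion (not the PR's code; the function, its parameters, and the key prefix are made up for illustration, and it assumes the report package used by the benchmark tool, go.etcd.io/etcd/pkg/v3/report): watch the "foo" prefix, put unique keys "foo0", "foo1", ..., and correlate put time with event time by the key itself rather than by index.

import (
	"context"
	"fmt"
	"os"
	"sync"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/pkg/v3/report"
)

// correlateByKey sketches key-based correlation of put and watch-event times.
func correlateByKey(cli *clientv3.Client, results chan<- report.Result, total int) {
	var (
		mu       sync.Mutex
		putTimes = make(map[string]time.Time)
	)

	// A single watch on the whole prefix; each event carries the exact key.
	wch := cli.Watch(context.Background(), "foo", clientv3.WithPrefix())
	go func() {
		for resp := range wch {
			end := time.Now()
			for _, ev := range resp.Events {
				mu.Lock()
				start, ok := putTimes[string(ev.Kv.Key)]
				mu.Unlock()
				if ok {
					results <- report.Result{Start: start, End: end}
				}
			}
		}
	}()

	// Every put uses a unique key, so no positional (+1/-1) bookkeeping is needed.
	for i := 0; i < total; i++ {
		key := fmt.Sprintf("foo%d", i)
		mu.Lock()
		putTimes[key] = time.Now()
		mu.Unlock()
		if _, err := cli.Put(context.TODO(), key, "value"); err != nil {
			fmt.Fprintf(os.Stderr, "put %s failed: %v\n", key, err)
		}
	}
}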
Fragile? Puts are done sequentially and etcd guarantees that responses are sequential to writes. Not sure I would call it fragile, or at least it comes down to etcd's correctness; however, I don't think we should care about this in benchmarks.
The "fragile" isn't about the etcd's correctness, it's about the readability, it's about the code itself. There may be network jitter, the put may fail. Currently it looks OK because the tool just exits directly when put fails. I think we shouldn't exit in such case, we are running benchmark test, why do we care about some occasional failures due to env problems (e.g. network jitter)? Some time later, other contributors may even update the implementation not exiting in such case.
Also depending on index (correlate the times by an index) reduces the readability, also increase review burden. We need super carefulness to ensure there is no +1/-1 problem.
Anyway we should make the implementation more robustness & less error prone by itself.
etcd/tools/benchmark/cmd/watch_latency.go
Lines 96 to 99 in 8292553

if _, err := putClient.Put(context.TODO(), key, value); err != nil {
	fmt.Fprintf(os.Stderr, "Failed to Put for watch latency benchmark: %v\n", err)
	os.Exit(1)
}
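Building on the point above about not exiting when a put fails, one possible direction (a sketch only; putReport is a hypothetical variable name, and it assumes the Err field of report.Result from etcd's report package) would be to record the error and let the run continue:

if _, err := putClient.Put(context.TODO(), key, value); err != nil {
	// Sketch: record the failure in the report instead of terminating the
	// whole benchmark run (this sits inside the put loop).
	putReport.Results() <- report.Result{Err: err}
	continue
}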
I won't insist on this for now, because it works for now, as mentioned above: "Currently it looks OK because the tool just exits directly when put fails."
Please let me know if you want to keep it as it is, or plan to change it based on what is proposed above.
Don't plan to change it for now. I think it could be improved, however I'm not as concerned with making benchmark code super readable, as it's not often changed nor critical for etcd functionality (at least until we start to really depend on automatic performance regression detection by @jmhbnz). I think we could just merge it to make sure the improvements are available for everyone to use.
@ahrtr do you see any blockers?
cc @ahrtr @jmhbnz @chaochn47