Commit ba2b1d8

run benchmarks nightly (#224)
1 parent e8a4431 commit ba2b1d8

File tree: 5 files changed, +107 -173 lines

.github/workflows/bench.yml

Lines changed: 9 additions & 8 deletions

@@ -1,6 +1,9 @@
 name: bench
 
-on: workflow_dispatch
+on:
+  workflow_dispatch:
+  schedule:
+    - cron: "42 9 * * *"
 
 jobs:
   benchee:
@@ -41,10 +44,8 @@ jobs:
 
       - run: mix deps.get --only $MIX_ENV
       - run: mix compile --warnings-as-errors
-      - run: mkdir results
-      - run: mix run bench/insert.exs | tee results/insert.txt
-      - run: mix run bench/stream.exs | tee results/stream.txt
-      - uses: actions/upload-artifact@v4
-        with:
-          name: results
-          path: results/*.txt
+
+      # - run: mix run bench/cast.exs
+      - run: mix run bench/encode.exs
+      - run: mix run bench/insert.exs
+      - run: mix run bench/stream.exs

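The new `schedule` trigger fires the workflow once a day: in POSIX cron, `"42 9 * * *"` means minute 42, hour 9, every day, i.e. 09:42 UTC nightly. As a rough illustration (not part of the commit; `next_nightly_run` is a hypothetical helper), the next firing time of this daily schedule can be computed like so:

```python
from datetime import datetime, timedelta, timezone

def next_nightly_run(now: datetime) -> datetime:
    """Next firing of cron '42 9 * * *' (daily at 09:42 UTC)."""
    candidate = now.replace(hour=9, minute=42, second=0, microsecond=0)
    if candidate <= now:
        # today's slot already passed, so the next run is tomorrow
        candidate += timedelta(days=1)
    return candidate

print(next_nightly_run(datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)))
# → 2024-05-02 09:42:00+00:00
```

GitHub Actions documents that scheduled workflows run on UTC and may start a few minutes late, so the computed instant is the earliest possible start, not a guarantee.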
README.md

Lines changed: 3 additions & 118 deletions

@@ -270,121 +270,6 @@ taipei = DateTime.shift_zone!(utc, "Asia/Taipei")
 Ch.query!(pid, "INSERT INTO ch_datetimes(datetime) FORMAT RowBinary", [[naive], [utc], [taipei]], types: ["DateTime"])
 ```
 
-## Benchmarks
-
-<details>
-<summary><code>INSERT</code> 1 million rows <a href="https://github.com/ClickHouse/clickhouse-go#benchmark">(original)</a></summary>
-
-<pre><code>
-$ MIX_ENV=bench mix run bench/insert.exs
-
-This benchmark is based on https://github.com/ClickHouse/clickhouse-go#benchmark
-
-Operating System: macOS
-CPU Information: Apple M1
-Number of Available Cores: 8
-Available memory: 8 GB
-Elixir 1.14.4
-Erlang 25.3
-
-Benchmark suite executing with the following configuration:
-warmup: 2 s
-time: 5 s
-memory time: 0 ns
-reduction time: 0 ns
-parallel: 1
-inputs: 1_000_000 rows
-Estimated total run time: 28 s
-
-Benchmarking encode with input 1_000_000 rows ...
-Benchmarking encode stream with input 1_000_000 rows ...
-Benchmarking insert with input 1_000_000 rows ...
-Benchmarking insert stream with input 1_000_000 rows ...
-
-##### With input 1_000_000 rows #####
-Name                    ips        average  deviation         median         99th %
-encode stream          1.63      612.96 ms    ±11.30%      583.03 ms      773.01 ms
-insert stream          1.22      819.82 ms     ±9.41%      798.94 ms      973.45 ms
-encode                 1.09      915.75 ms    ±44.13%      750.98 ms     1637.02 ms
-insert                 0.73     1373.84 ms    ±31.01%     1331.86 ms     1915.76 ms
-
-Comparison:
-encode stream          1.63
-insert stream          1.22 - 1.34x slower +206.87 ms
-encode                 1.09 - 1.49x slower +302.79 ms
-insert                 0.73 - 2.24x slower +760.88 ms</code>
-</pre>
-
-</details>
-
-<details>
-<summary><code>SELECT</code> 500, 500 thousand, and 500 million rows <a href="https://github.com/ClickHouse/ch-bench">(original)</a></summary>
-
-<pre><code>
-$ MIX_ENV=bench mix run bench/stream.exs
-
-This benchmark is based on https://github.com/ClickHouse/ch-bench
-
-Operating System: macOS
-CPU Information: Apple M1
-Number of Available Cores: 8
-Available memory: 8 GB
-Elixir 1.14.4
-Erlang 25.3
-
-Benchmark suite executing with the following configuration:
-warmup: 2 s
-time: 5 s
-memory time: 0 ns
-reduction time: 0 ns
-parallel: 1
-inputs: 500 rows, 500_000 rows, 500_000_000 rows
-Estimated total run time: 1.05 min
-
-Benchmarking stream with decode with input 500 rows ...
-Benchmarking stream with decode with input 500_000 rows ...
-Benchmarking stream with decode with input 500_000_000 rows ...
-Benchmarking stream with manual decode with input 500 rows ...
-Benchmarking stream with manual decode with input 500_000 rows ...
-Benchmarking stream with manual decode with input 500_000_000 rows ...
-Benchmarking stream without decode with input 500 rows ...
-Benchmarking stream without decode with input 500_000 rows ...
-Benchmarking stream without decode with input 500_000_000 rows ...
-
-##### With input 500 rows #####
-Name                                ips        average  deviation         median         99th %
-stream with decode               4.69 K      213.34 μs    ±12.49%      211.38 μs      290.94 μs
-stream with manual decode        4.69 K      213.43 μs    ±17.40%      210.96 μs      298.75 μs
-stream without decode            4.65 K      215.08 μs    ±10.79%      213.79 μs      284.66 μs
-
-Comparison:
-stream with decode               4.69 K
-stream with manual decode        4.69 K - 1.00x slower +0.0838 μs
-stream without decode            4.65 K - 1.01x slower +1.74 μs
-
-##### With input 500_000 rows #####
-Name                                ips        average  deviation         median         99th %
-stream without decode            234.58        4.26 ms    ±13.99%        4.04 ms        5.95 ms
-stream with manual decode         64.26       15.56 ms     ±8.36%       15.86 ms       17.97 ms
-stream with decode                41.03       24.37 ms     ±6.27%       24.39 ms       26.60 ms
-
-Comparison:
-stream without decode            234.58
-stream with manual decode         64.26 - 3.65x slower +11.30 ms
-stream with decode                41.03 - 5.72x slower +20.11 ms
-
-##### With input 500_000_000 rows #####
-Name                                ips        average  deviation         median         99th %
-stream without decode              0.32         3.17 s     ±0.20%         3.17 s         3.17 s
-stream with manual decode        0.0891        11.23 s     ±0.00%        11.23 s        11.23 s
-stream with decode               0.0462        21.66 s     ±0.00%        21.66 s        21.66 s
-
-Comparison:
-stream without decode              0.32
-stream with manual decode        0.0891 - 3.55x slower +8.06 s
-stream with decode               0.0462 - 6.84x slower +18.50 s</code>
-</pre>
-
-</details>
-
-[CI Results](https://github.com/plausible/ch/actions/workflows/bench.yml) (click the latest workflow run and scroll down to "Artifacts")
+## [Benchmarks](./bench)
+
+See nightly [CI runs](https://github.com/plausible/ch/actions/workflows/bench.yml) for latest results.

bench/encode.exs

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+IO.puts("""
+This benchmark measures the performance of encoding rows in RowBinary format.
+""")
+
+alias Ch.RowBinary
+
+types = ["UInt64", "String", "Array(UInt8)", "DateTime"]
+
+rows = fn count ->
+  Enum.map(1..count, fn i ->
+    [i, "Golang SQL database driver", [1, 2, 3, 4, 5, 6, 7, 8, 9], DateTime.utc_now()]
+  end)
+end
+
+Benchee.run(
+  %{
+    "RowBinary" => fn rows -> RowBinary.encode_rows(rows, types) end,
+    "RowBinary stream" => fn rows ->
+      Stream.chunk_every(rows, 60_000)
+      |> Stream.each(fn chunk -> RowBinary.encode_rows(chunk, types) end)
+      |> Stream.run()
+    end
+  },
+  inputs: %{
+    "1_000_000 (UInt64, String, Array(UInt8), DateTime) rows" => rows.(1_000_000)
+  }
+)

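For readers unfamiliar with the format this new benchmark exercises: RowBinary is ClickHouse's compact binary row encoding, with fixed-width little-endian integers and strings prefixed by a ULEB128 varint length. A rough Python sketch of encoding a single (UInt64, String) row (illustrative only, not the library's implementation):

```python
import struct

def encode_uleb128(n: int) -> bytes:
    # unsigned LEB128 varint: 7 payload bits per byte, high bit = "more follows"
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_row(u64: int, s: str) -> bytes:
    # UInt64 -> 8 bytes little-endian; String -> varint length + UTF-8 bytes
    data = s.encode("utf-8")
    return struct.pack("<Q", u64) + encode_uleb128(len(data)) + data

print(encode_row(1, "hi"))  # b'\x01\x00\x00\x00\x00\x00\x00\x00\x02hi'
```

This is why the benchmark distinguishes whole-batch encoding from chunked streaming: the per-row byte layout is identical either way, only the chunking of the work differs.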
bench/insert.exs

Lines changed: 37 additions & 32 deletions

@@ -1,54 +1,59 @@
-IO.puts("This benchmark is based on https://github.com/ClickHouse/clickhouse-go#benchmark\n")
+IO.puts("""
+This benchmark is based on https://github.com/ClickHouse/clickhouse-go#benchmark
+
+It tests how quickly a client can insert one million rows of the following schema:
+- col1 UInt64
+- col2 String
+- col3 Array(UInt8)
+- col4 DateTime
+""")
 
 port = String.to_integer(System.get_env("CH_PORT") || "8123")
 hostname = System.get_env("CH_HOSTNAME") || "localhost"
 scheme = System.get_env("CH_SCHEME") || "http"
 database = System.get_env("CH_DATABASE") || "ch_bench"
 
-{:ok, conn} = Ch.start_link(scheme: scheme, hostname: hostname, port: port)
-Ch.query!(conn, "CREATE DATABASE IF NOT EXISTS {$0:Identifier}", [database])
-
-Ch.query!(conn, """
-CREATE TABLE IF NOT EXISTS #{database}.benchmark (
-  col1 UInt64,
-  col2 String,
-  col3 Array(UInt8),
-  col4 DateTime
-) Engine Null
-""")
-
-types = [Ch.Types.u64(), Ch.Types.string(), Ch.Types.array(Ch.Types.u8()), Ch.Types.datetime()]
-statement = "INSERT INTO #{database}.benchmark FORMAT RowBinary"
+alias Ch.RowBinary
 
 rows = fn count ->
   Enum.map(1..count, fn i ->
-    [i, "Golang SQL database driver", [1, 2, 3, 4, 5, 6, 7, 8, 9], NaiveDateTime.utc_now()]
+    [i, "Golang SQL database driver", [1, 2, 3, 4, 5, 6, 7, 8, 9], DateTime.utc_now()]
   end)
 end
 
-alias Ch.RowBinary
+statement = "INSERT INTO #{database}.benchmark FORMAT RowBinary"
+types = ["UInt64", "String", "Array(UInt8)", "DateTime"]
 
 Benchee.run(
   %{
-    # "control" => fn rows -> Enum.each(rows, fn _row -> :ok end) end,
-    "encode" => fn rows -> RowBinary.encode_rows(rows, types) end,
-    "insert" => fn rows -> Ch.query!(conn, statement, rows, types: types) end,
-    # "control stream" => fn rows -> rows |> Stream.chunk_every(60_000) |> Stream.run() end,
-    "encode stream" => fn rows ->
-      rows
-      |> Stream.chunk_every(60_000)
-      |> Stream.map(fn chunk -> RowBinary.encode_rows(chunk, types) end)
-      |> Stream.run()
+    "Ch.query" => fn %{pool: pool, rows: rows} ->
+      Ch.query!(pool, statement, rows, types: types)
     end,
-    "insert stream" => fn rows ->
-      stream =
-        rows
-        |> Stream.chunk_every(60_000)
+    "Ch.stream" => fn %{pool: pool, rows: rows} ->
+      DBConnection.run(pool, fn conn ->
+        Stream.chunk_every(rows, 100_000)
        |> Stream.map(fn chunk -> RowBinary.encode_rows(chunk, types) end)
-
-      Ch.query!(conn, statement, stream, encode: false)
+        |> Stream.into(Ch.stream(conn, statement, [], encode: false))
+        |> Stream.run()
+      end)
    end
   },
+  before_scenario: fn rows ->
+    {:ok, pool} = Ch.start_link(scheme: scheme, hostname: hostname, port: port, pool_size: 1)
+
+    Ch.query!(pool, "CREATE DATABASE IF NOT EXISTS {$0:Identifier}", [database])
+
+    Ch.query!(pool, """
+    CREATE TABLE IF NOT EXISTS #{database}.benchmark (
+      col1 UInt64,
+      col2 String,
+      col3 Array(UInt8),
+      col4 DateTime
+    ) Engine Null
+    """)
+
+    %{pool: pool, rows: rows}
+  end,
   inputs: %{
     "1_000_000 rows" => rows.(1_000_000)
   }
bench/stream.exs

Lines changed: 31 additions & 15 deletions

@@ -1,16 +1,34 @@
-IO.puts("This benchmark is based on https://github.com/ClickHouse/ch-bench\n")
+IO.puts("""
+This benchmark is based on https://github.com/ClickHouse/ch-bench
+
+It tests how quickly a client can select N rows from the system.numbers_mt table:
+
+  SELECT number FROM system.numbers_mt LIMIT {limit:UInt64} FORMAT RowBinary
+""")
 
 port = String.to_integer(System.get_env("CH_PORT") || "8123")
 hostname = System.get_env("CH_HOSTNAME") || "localhost"
 scheme = System.get_env("CH_SCHEME") || "http"
 
-{:ok, conn} = Ch.start_link(scheme: scheme, hostname: hostname, port: port)
+limits = fn limits ->
+  Map.new(limits, fn limit ->
+    {"limit=#{limit}", limit}
+  end)
+end
 
 Benchee.run(
   %{
-    "RowBinary stream without decode" => fn limit ->
+    # "Ch.query" => fn %{pool: pool, limit: limit} ->
+    #   Ch.query!(
+    #     pool,
+    #     "SELECT number FROM system.numbers_mt LIMIT {limit:UInt64}",
+    #     %{"limit" => limit},
+    #     timeout: :infinity
+    #   )
+    # end,
+    "Ch.stream w/o decoding (i.e. pass-through)" => fn %{pool: pool, limit: limit} ->
       DBConnection.run(
-        conn,
+        pool,
         fn conn ->
           conn
           |> Ch.stream(
@@ -22,29 +40,27 @@ Benchee.run(
           timeout: :infinity
         )
     end,
-    "RowBinary stream with manual decode" => fn limit ->
+    "Ch.stream with manual RowBinary decoding" => fn %{pool: pool, limit: limit} ->
       DBConnection.run(
-        conn,
+        pool,
         fn conn ->
           conn
           |> Ch.stream(
            "SELECT number FROM system.numbers_mt LIMIT {limit:UInt64} FORMAT RowBinary",
            %{"limit" => limit}
          )
-          |> Stream.map(fn %Ch.Result{data: data} ->
-            data
-            |> IO.iodata_to_binary()
-            |> Ch.RowBinary.decode_rows([:u64])
+          |> Stream.each(fn %Ch.Result{data: data} ->
+            data |> IO.iodata_to_binary() |> Ch.RowBinary.decode_rows([:u64])
           end)
          |> Stream.run()
        end,
        timeout: :infinity
      )
    end
  },
-  inputs: %{
-    "500 rows" => 500,
-    "500_000 rows" => 500_000,
-    "500_000_000 rows" => 500_000_000
-  }
+  before_scenario: fn limit ->
+    {:ok, pool} = Ch.start_link(scheme: scheme, hostname: hostname, port: port, pool_size: 1)
+    %{pool: pool, limit: limit}
+  end,
+  inputs: limits.([500, 500_000, 500_000_000])
 )
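In the "manual decode" scenario above, each streamed chunk of RowBinary data is turned into a list of UInt64s (`Ch.RowBinary.decode_rows([:u64])`). Since each row here is a single UInt64 column, decoding reduces to reading consecutive 8-byte little-endian words; a rough Python sketch (illustrative only, not the library's decoder):

```python
import struct

def decode_u64_rows(data: bytes) -> list:
    # each RowBinary row here is one UInt64 column: 8 bytes, little-endian
    assert len(data) % 8 == 0, "truncated chunk"
    return [struct.unpack_from("<Q", data, offset)[0]
            for offset in range(0, len(data), 8)]

print(decode_u64_rows(struct.pack("<3Q", 0, 1, 2)))  # [0, 1, 2]
```

The pass-through scenario skips this step entirely, which is why it dominates at 500 million rows: the cost being measured there is transport, not decoding.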
