This package is a collection of Julia benchmarks used to track the performance of the Julia language.
BaseBenchmarks is written using the BenchmarkTools package. I highly suggest at least skimming the BenchmarkTools manual before using BaseBenchmarks locally.
BaseBenchmarks contains a large amount of code, not all of which is suitable for precompilation. Loading all of this code at once can take an annoyingly long time if you only need to run one or two benchmarks. To solve this problem, BaseBenchmarks allows you to dynamically load benchmark suites when you need them:
julia> using BaseBenchmarks
# This is the top-level BenchmarkGroup. It's empty until you load child groups into it.
julia> BaseBenchmarks.SUITE
0-element BenchmarkTools.BenchmarkGroup:
tags: []
# Here's an example of how to load the "linalg" group into BaseBenchmarks.SUITE. You can
# optionally pass in a different BenchmarkGroup as the first argument to load "linalg"
# into it.
julia> BaseBenchmarks.load!("linalg")
1-element BenchmarkTools.BenchmarkGroup:
tags: []
"linalg" => 3-element BenchmarkGroup(["array"])
# Here's an example of how to load all the benchmarks into BaseBenchmarks.SUITE. Once again,
# you can pass in a different BenchmarkGroup as the first argument to load the benchmarks
# there instead.
julia> BaseBenchmarks.loadall!();
loading group "string"...done (took 0.379868963 seconds)
loading group "linalg"...done (took 5.4598628 seconds)
loading group "parallel"...done (took 0.086358304 seconds)
loading group "tuple"...done (took 0.651417342 seconds)
loading group "micro"...done (took 0.377109301 seconds)
loading group "io"...done (took 0.068647882 seconds)
loading group "scalar"...done (took 16.922505539 seconds)
loading group "sparse"...done (took 3.750095955 seconds)
loading group "simd"...done (took 2.542815776 seconds)
loading group "problem"...done (took 2.002920499 seconds)
loading group "array"...done (took 6.072152907 seconds)
loading group "sort"...done (took 3.308745574 seconds)
loading group "shootout"...done (took 0.72022176 seconds)
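As the comments above mention, load! accepts a different BenchmarkGroup as its first argument. A minimal sketch of that form, assuming BaseBenchmarks and BenchmarkTools are installed (the variable name mysuite is made up for illustration):

```julia
using BenchmarkTools, BaseBenchmarks

# A separate, initially empty group; `mysuite` is a hypothetical name.
mysuite = BenchmarkGroup()

# Load only the "linalg" benchmarks into `mysuite` rather than into
# BaseBenchmarks.SUITE, using the optional first argument described above.
BaseBenchmarks.load!(mysuite, "linalg")

# The loaded group is now reachable under its name.
println(haskey(mysuite, "linalg"))
```

Keeping a private group like this can be handy when you want to run or filter a subset without touching the shared top-level suite.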
Now that the benchmarks are loaded, you can run them just like any other BenchmarkTools.BenchmarkGroup:
# run benchmarks matching a tag query
run(BaseBenchmarks.SUITE[@tagged ("array" || "linalg") && !("simd")]);
# run a specific benchmark group
run(BaseBenchmarks.SUITE["linalg"]["arithmetic"]);
# run a single benchmark
run(BaseBenchmarks.SUITE["scalar"]["fastmath"]["add", "Complex{Float64}"])
# equivalent to the above, but this form makes it
# easy to copy and paste IDs from benchmark reports
run(BaseBenchmarks.SUITE[["scalar", "fastmath", ("add", "Complex{Float64}")]]);
See the BenchmarkTools repository for documentation of BenchmarkTools.BenchmarkGroup features (e.g. regression classification and filtering, parameter tuning, leaf iteration, higher-order mapping/filtering, etc.).
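As a rough illustration of tag filtering, parameter tuning, and leaf iteration, here is a minimal sketch on a small hand-built group that needs only BenchmarkTools. The group names, tags, and benchmarked expressions are made up for illustration and are not part of BaseBenchmarks.

```julia
using BenchmarkTools

# A toy suite with two child groups (hypothetical names and tags).
suite = BenchmarkGroup()
suite["linalg"] = BenchmarkGroup(["array"])
suite["linalg"]["sum"] = @benchmarkable sum($(rand(100)))
suite["string"] = BenchmarkGroup(["unicode"])
suite["string"]["join"] = @benchmarkable join($(["a", "b", "c"]), ",")

# Tune execution parameters (evals per sample, etc.) before running.
tune!(suite)

# Tag filtering: keep what is tagged "array" (this picks up the "linalg" group).
selected = suite[@tagged "array"]

# Run the selection, then iterate over the leaves as (ID, Trial) pairs.
results = run(selected; seconds = 1)
for (id, trial) in leaves(results)
    println(id, " => ", minimum(trial))
end
```

The same calls apply unchanged to BaseBenchmarks.SUITE once the groups you care about have been loaded.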
If you're a collaborator, you can trigger Julia's @nanosoldier bot to automatically test the performance of your PR vs. Julia's master branch. However, this bot's purpose isn't to have the final say on performance matters, but rather to identify areas which require local performance testing. Here's a procedure for testing your Julia PR locally:
1. Run benchmarks and save results using master Julia build
2. Run benchmarks and save results using PR Julia build
3. Load and compare the results, looking for regressions
4. Profile any regressions to find opportunities for performance improvements
For steps 1 and 2, first build Julia on the appropriate branch. Then, you can run the following code to execute all benchmarks and save the results (replacing filename with an actual unique file name):
using BenchmarkTools, BaseBenchmarks
BaseBenchmarks.loadall!() # load all benchmarks
results = run(BaseBenchmarks.SUITE; verbose = true) # run all benchmarks
BenchmarkTools.save("filename.json", results) # save results to JSON file
Next, you can load the results and check for regressions (once again replacing the JSON file names used here with the actual file names):
using BenchmarkTools, BaseBenchmarks
master = BenchmarkTools.load("master.json")[1]
pr = BenchmarkTools.load("pr.json")[1]
regs = regressions(judge(minimum(pr), minimum(master))) # a BenchmarkGroup containing the regressions
pairs = leaves(regs) # an array of (ID, `TrialJudgement`) pairs
This shows which benchmarks regressed and by how much. Here's an example showing what pairs might look like:
2-element Array{Any,1}:
(Any["string","join"],BenchmarkTools.TrialJudgement:
time: +41.13% => regression (1.00% tolerance)
memory: +0.00% => invariant (1.00% tolerance))
(Any["io","read","readstring"],BenchmarkTools.TrialJudgement:
time: +13.85% => regression (3.00% tolerance)
memory: +0.00% => invariant (1.00% tolerance))
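To try the comparison step without two Julia builds, the following sketch judges two runs of a toy benchmark against each other using only BenchmarkTools. The helper make_suite and the input sizes are invented for illustration; the larger input simply stands in for a "slower PR build".

```julia
using BenchmarkTools

# Hypothetical helper: a one-benchmark suite whose cost grows with `n`.
function make_suite(n)
    g = BenchmarkGroup()
    g["sum"] = @benchmarkable sum($(rand(n)))
    return g
end

baseline  = run(make_suite(1_000); seconds = 1)     # stands in for master results
candidate = run(make_suite(100_000); seconds = 1)   # stands in for PR results

# Same call shape as above: judge the candidate against the baseline.
j = judge(minimum(candidate), minimum(baseline))

# Walk the regressions as (ID, TrialJudgement) pairs.
for (id, tj) in leaves(regressions(j))
    println(id, ": time ratio = ", time(ratio(tj)))
end
```

A time ratio above 1 means the candidate was slower than the baseline for that benchmark.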
Each pair above is structured as (benchmark ID, TrialJudgement for benchmark). You can now examine these benchmarks in detail and try to fix the regressions. Let's use the ["io","read","readstring"] ID shown above as an example.
To examine this benchmark on your currently-built branch, first make sure you've loaded the benchmark's parent group (the first element in the ID, "io"):
julia> using BenchmarkTools, BaseBenchmarks
julia> showall(BaseBenchmarks.load!("io"))
1-element BenchmarkTools.BenchmarkGroup:
tags: []
"io" => 1-element BenchmarkTools.BenchmarkGroup:
tags: []
"read" => 2-element BenchmarkTools.BenchmarkGroup:
tags: ["buffer", "stream", "string"]
"readstring" => BenchmarkTools.Benchmark...
"read" => BenchmarkTools.Benchmark...
You can now run the benchmark by calling run(BaseBenchmarks.SUITE[["io","read","readstring"]]), or profile it using @profile:
@profile run(BaseBenchmarks.SUITE[["io","read","readstring"]])
After profiling the benchmark, you can use Profile.print() or ProfileView.view() to analyze the bottlenecks that led to that regression.
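On Julia 0.7 and later, @profile lives in the Profile standard library, so it has to be loaded explicitly. A minimal sketch of the profiling step under that assumption, using a stand-in benchmark (sorting a random vector) rather than one from BaseBenchmarks:

```julia
using Profile, BenchmarkTools

# Stand-in benchmark for illustration; substitute the regressed benchmark,
# e.g. one looked up by ID in BaseBenchmarks.SUITE.
b = @benchmarkable sort($(rand(10_000)))

Profile.clear()                 # discard samples from any earlier session
@profile run(b; seconds = 1)    # collect samples while the benchmark runs
Profile.print(maxdepth = 12)    # text report, truncated to a readable depth
```

Clearing the profile buffer first keeps samples from unrelated earlier runs out of the report.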
Our performance tracker could always benefit from more benchmarks! If you have a benchmark that depends only on Base Julia code, it is welcome here - just open a PR against the master branch.
Here are some contribution tips and guidelines:
- All benchmarks should only depend on base Julia.
- You'll need to use BenchmarkTools to write the benchmarks (feel free to open a WIP PR if you'd like help with this).
- Newly defined functions whose calls are measured should have perf_ prepended to their name. This makes it easier to find a given benchmark's "entry point" in the code.
- Try to reuse existing tags when possible. Tags should be lowercase and singular.
- If your benchmark requires a significant amount of code, wrap it in a module.
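The guidelines above might translate into something like the following sketch. The module name, group name, tags, and the function being measured are all hypothetical, and the layout is not necessarily how the existing BaseBenchmarks groups are organized; treat it as an illustration of the conventions only.

```julia
module ExampleBench  # hypothetical module wrapping the benchmark code

using BenchmarkTools

const SUITE = BenchmarkGroup()

# The measured entry point carries the perf_ prefix and uses only Base.
function perf_cumsum(xs)
    out = similar(xs)
    acc = zero(eltype(xs))
    for i in eachindex(xs)
        acc += xs[i]
        out[i] = acc
    end
    return out
end

# Tags are lowercase and singular.
g = addgroup!(SUITE, "cumulative", ["array", "reduction"])
g["cumsum", "Float64"] = @benchmarkable perf_cumsum($(rand(1000)))

end # module
```

The benchmark can then be looked up with ExampleBench.SUITE["cumulative"]["cumsum", "Float64"] and run like any other.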
New benchmarks added to BaseBenchmarks won't be available in CI right away, as their execution parameters must be tuned and cached on Nanosoldier (our benchmark cluster) before they are suitable for running. This process is performed periodically and upon request, after which the master branch is merged into the nanosoldier branch. Nanosoldier pulls down the nanosoldier branch before running every benchmark job, so whatever's currently on the nanosoldier branch is what's being used in CI.