Replies: 2 comments 8 replies
-
Laziness can be a performance problem for low-latency networks, but these tests are being performed on local disk, which can respond to many requests for subsets of the data about as fast as one request for all the data, as long as those subsets are reasonably sequential. Checking to see if a cache is still valid also shouldn't be expensive (or it would undermine the value of the cache), but which cache are you talking about? The OS's virtual memory or something in the Uproot or UnROOT implementation?
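The local-disk claim is easy to sanity-check outside either library: compare one large read against many sequential subset reads of the same file. This is a generic Python sketch (the scratch file and 1 MiB chunk size are made up for illustration), not a measurement of Uproot or UnROOT themselves:

```python
import os
import tempfile
import time

# Scratch file (~16 MiB) standing in for a local data file.
path = os.path.join(tempfile.mkdtemp(), "scratch.bin")
with open(path, "wb") as f:
    f.write(os.urandom(16 * 1024 * 1024))

def read_all(p):
    # One request for all the data.
    with open(p, "rb") as f:
        return len(f.read())

def read_chunks(p, chunk=1024 * 1024):
    # Many sequential requests for subsets, as a lazy reader would issue.
    total = 0
    with open(p, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(buf)
    return total

t0 = time.perf_counter()
n1 = read_all(path)
t1 = time.perf_counter()
n2 = read_chunks(path)
t2 = time.perf_counter()
print(f"one read: {t1 - t0:.4f}s, chunked: {t2 - t1:.4f}s, bytes equal: {n1 == n2}")
```

On a warm page cache the two times are typically close, which is the point: sequential subset reads from local disk are not where lazy reading loses time.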
-
The lock has a limited impact in a realistic application:

```
# without lock
> julia --project -t4 UnROOT_loop.jl
[ Info: 1st run
 10.942170 seconds (26.97 M allocations: 17.499 GiB, 25.29% gc time, 1.51% compilation time)
[ Info: 2nd run
 10.022244 seconds (26.16 M allocations: 17.458 GiB, 22.72% gc time)

# with lock
> julia --project -t4 UnROOT_loop.jl
[ Info: 1st run
 11.743019 seconds (26.97 M allocations: 17.499 GiB, 23.01% gc time, 1.40% compilation time)
[ Info: 2nd run
 11.145259 seconds (26.16 M allocations: 17.458 GiB, 21.00% gc time)
```

At most it's about 10% (I tried a few times; sometimes it's smaller).

As for the overhead of reading a new basket: possibly the cost of "reading a new basket" is large enough to dominate if you look at the entire workload:

```julia
julia> f() = LazyTree("./Run2012BC_DoubleMuParked_Muons.root", "Events");

julia> const tt = f();

julia> function g()
           tt[8000].Muon_pt # flush cache
           tt
       end

julia> function h(tt)
           tt[9200].Muon_pt # trigger basket I/O
       end

julia> @be g() h evals=1
Benchmark: 385 samples with 1 evaluation
 min    94.098 μs (82 allocs: 122.547 KiB)
 median 96.783 μs (82 allocs: 122.547 KiB)
 mean   123.332 μs (82 allocs: 122.547 KiB, 1.76% gc time)
 max    4.658 ms (82 allocs: 122.547 KiB, 96.20% gc time)
```
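For context on why a lock can cost ~10% of a whole event loop yet look much bigger in isolation: an uncontended acquire/release is cheap in absolute terms, but large relative to a trivial operation. A generic Python micro-benchmark of that effect (not UnROOT's actual lock, just an illustration of the principle):

```python
import threading
import timeit

lock = threading.Lock()
counter = [0]

def without_lock():
    counter[0] += 1

def with_lock():
    # Same trivial work, wrapped in an uncontended lock acquire/release.
    with lock:
        counter[0] += 1

n = 1_000_000
t_plain = timeit.timeit(without_lock, number=n)
t_locked = timeit.timeit(with_lock, number=n)
print(f"plain: {t_plain:.3f}s, locked: {t_locked:.3f}s")
```

Relative to a no-op increment the lock overhead is large, but once each iteration also does real work (decompression, basket I/O), it shrinks to a small fraction of the total, consistent with the ~10% seen above.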
-
I also tried running

before each timing and don't see a significant change:

I'm curious what @jpivarski thinks -- my time is roughly consistent with uproot. I'd like to think these numbers are reasonable, but then I don't understand why the lazy loop is so fast -- because I also know it has a pretty big overhead just from the "check if the cache is still valid" step. Any other benchmark ideas?
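One way to size the "check if the cache is still valid" overhead on its own is to micro-benchmark a cache-hit access against a raw access. This is a toy single-basket cache in Python (the basket size and index are made up; this is not UnROOT's implementation, just the shape of the check):

```python
import timeit

data = list(range(1_000_000))

class BasketCache:
    """Toy single-basket cache: every access first checks whether the
    requested index falls inside the currently cached range."""
    def __init__(self, data, basket_size=1000):
        self.data = data
        self.basket_size = basket_size
        self.lo = self.hi = -1   # cached range, initially invalid
        self.basket = None

    def __getitem__(self, i):
        if not (self.lo <= i < self.hi):  # the validity check
            # Cache miss: "read" the basket that contains index i.
            self.lo = (i // self.basket_size) * self.basket_size
            self.hi = self.lo + self.basket_size
            self.basket = self.data[self.lo:self.hi]
        return self.basket[i - self.lo]

cached = BasketCache(data)
raw = timeit.timeit(lambda: data[123_456], number=1_000_000)
hit = timeit.timeit(lambda: cached[123_456], number=1_000_000)
print(f"raw access: {raw:.3f}s, cache-hit access: {hit:.3f}s")
```

If the per-access gap here is tiny next to the ~95 μs basket-read time measured above, that would support the idea that basket I/O, not the validity check, dominates the lazy loop.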