You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, I'm on Julia 1.8.2 and I'm trying to use exponential solvers on a moderately-sized problem. When I construct the linear operator via Diagonal, I notice some performance issues. I reported this in the past in SciML/OrdinaryDiffEq.jl#1193. It's been a while, but now I have some renewed interest, so I decided to dig a little deeper and managed to localize the problem here.
This is the benchmark I used in the past, where the nonlinear part is zero so that I can test against the exact solution. The accuracy problems are gone, but performance issues remain.
using LinearAlgebra
using DifferentialEquations
using DiffEqOperators
using OrdinaryDiffEq
using ExponentialUtilities
using BenchmarkTools
using Profile
N =256^2
Arr =rand(N)
A =DiffEqArrayOperator(Diagonal(Arr))
Amat =convert(AbstractMatrix, A)
f!(du, u, p, t) =nothing
t1 =1.0
u0 =ones(eltype(Arr), N)
u1 =exp.(Arr*t1) .* u0 # Exact solution.
prob! =SplitODEProblem(A, f!, u0, t1, nothing)
def_kwargs =Dict(:save_everystep=>false, :save_start=>false, :dt=>0.05)
ϵ(u_sol) =√sum(abs2.(u_sol - u1)) # RMS error to check the accuracy.functiontest(prob, alg; kwargs...)
sol =solve(prob, alg; kwargs...)
@showϵ(sol[end])
@benchmarksolve($prob, $alg; $kwargs...)
end;
Here's a comparison between LawsonEuler and ETDRK4.
test(prob!, LawsonEuler(); def_kwargs...)
ϵ(sol[end]) = 5.769346133408569e-13
BenchmarkTools.Trial: 573 samples with 1 evaluation.
Range (min … max): 6.919 ms … 19.627 ms ┊ GC (min … max): 0.00% … 61.09%
Time (median): 8.102 ms ┊ GC (median): 0.00%
Time (mean ± σ): 8.711 ms ± 2.148 ms ┊ GC (mean ± σ): 5.39% ± 11.74%
▂▇██▇▅▄▄▂▂▁
▄▆████████████▆▇▆▇▄▅▄▄▆▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▁▆█▅▆▄▄▄▁▄▄ ▇
6.92 ms Histogram: log(frequency) by time 18.6 ms <
Memory estimate: 6.50 MiB, allocs estimate: 69.
test(prob!, ETDRK4(); def_kwargs...)
ϵ(sol[end]) = 1.8824825842372246e-13
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 4.813 s … 4.831 s ┊ GC (min … max): 0.33% … 0.37%
Time (median): 4.822 s ┊ GC (median): 0.35%
Time (mean ± σ): 4.822 s ± 12.724 ms ┊ GC (mean ± σ): 0.35% ± 0.03%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
4.81 s Histogram: frequency by time 4.83 s <
Memory estimate: 171.01 MiB, allocs estimate: 1310838.
So LawsonEuler wins by orders of magnitude, both in memory allocations and time of execution. I know ETDRK4 is a much more complex method, but I wouldn't expect such a gap in performance.
I managed to identify a part of the problem with memory. It turns out that while scalar phi and exponential! can use caches to save memory, these caches are reallocated for each matrix element. This can be fixed by creating the caches already in phi!. So I monkey-patched these two functions
function ExponentialUtilities.phi(z::T, k::Integer; cache =nothing,
expmethod = ExponentialUtilities.ExpMethodHigham2005(), cache2 =nothing) where {T <:Number}
# Construct the matrixif cache ==nothing
cache =fill(zero(T), k +1, k +1)
elsefill!(cache, zero(T))
end
cache[1, 1] = z
for i in1:k
cache[i, i +1] =one(T)
endif cache2 ==nothing
cache2 = ExponentialUtilities.alloc_mem(cache, expmethod)
end
P =exponential!(cache, expmethod, cache2)
return P[1, :]
endfunction ExponentialUtilities.phi!(out::Vector{Diagonal{T, V}}, A::Diagonal{T, V}, k::Integer;
caches =nothing, expmethod = ExponentialUtilities.ExpMethodHigham2005()) where {T <:Number, V <:AbstractVector{T}}
cache =fill(zero(T), k +1, k +1)
cache2 = ExponentialUtilities.alloc_mem(cache, expmethod)
for i in1:size(A, 1)
phiz =phi(A[i, i], k; cache = cache, expmethod = expmethod, cache2 = cache2)
for j in1:(k +1)
out[j][i, i] = phiz[j]
endendreturn out
end
and got the following result
test(prob!, ETDRK4(); def_kwargs...)
ϵ(sol[end]) = 1.8832000766134335e-13
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 4.923 s … 5.065 s ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.994 s ┊ GC (median): 0.00%
Time (mean ± σ): 4.994 s ± 100.383 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
4.92 s Histogram: frequency by time 5.06 s <
Memory estimate: 40.01 MiB, allocs estimate: 262278.
Memory usage went down noticeably, but the method is as slow as before. Running the profiler on phi! I found that most of the time is spent on LU decompositions in ldiv_for_generated!, this line specifically:
F =lu!(A) # This allocation is unavoidable, due to the interface of LinearAlgebra
At this point I'm not familiar enough with the package to continue. Can anything be done to speed up this method? Perhaps the comment about unavoidable LU is not true anymore? Or have I run into the limits of performance of exponential solvers and they simply do not scale to problems of this size? Should I use another method? LawsonEuler, which seems to work fine in this simple toy example, unfortunately isn't accurate enough in the real problem. I also tried ETDRK4(krylov=true), but it's only 50% faster and seems to allocate slightly more memory, which maybe points to issues with caching there as well.
Sorry for such a long post, I kinda want this to work :)
The text was updated successfully, but these errors were encountered:
At this point I'm not familiar enough with the package to continue. Can anything be done to speed up this method? Perhaps the comment about unavoidable LU is not true anymore?
You can use LinearSolve.jl here and see if there's a faster LU for your computer.
Diagonal matrices could probably have a specialization that avoid most of the computation since diagonal matrices should do all of this element-wise, so it could have a very quick specialization.
Hi, I'm on Julia 1.8.2 and I'm trying to use exponential solvers on a moderately-sized problem. When I construct the linear operator via
Diagonal
, I notice some performance issues. I reported this in the past in SciML/OrdinaryDiffEq.jl#1193. It's been a while, but now I have some renewed interest, so I decided to dig a little deeper and managed to localize the problem here.This is the benchmark I used in the past, where the nonlinear part is zero so that I can test against the exact solution. The accuracy problems are gone, but performance issues remain.
Here's a comparison between
LawsonEuler
andETDRK4
.So
LawsonEuler
wins by orders of magnitude, both in memory allocations and time of execution. I knowETDRK4
is a much more complex method, but I wouldn't expect such a gap in performance.I managed to identify a part of the problem with memory. It turns out that while scalar
phi
andexponential!
can use caches to save memory, these caches are reallocated for each matrix element. This can be fixed by creating the caches already inphi!
. So I monkey-patched these two functionsand got the following result
Memory usage went down noticeably, but the method is as slow as before. Running the profiler on
phi!
I found that most of the time is spent on LU decompositions inldiv_for_generated!
, this line specifically:ExponentialUtilities.jl/src/exp_noalloc.jl
Line 29 in 380125a
At this point I'm not familiar enough with the package to continue. Can anything be done to speed up this method? Perhaps the comment about unavoidable LU is not true anymore? Or have I run into the limits of performance of exponential solvers and they simply do not scale to problems of this size? Should I use another method?
LawsonEuler
, which seems to work fine in this simple toy example, unfortunately isn't accurate enough in the real problem. I also triedETDRK4(krylov=true)
, but it's only 50% faster and seems to allocate slightly more memory, which maybe points to issues with caching there as well.Sorry for such a long post, I kinda want this to work :)
The text was updated successfully, but these errors were encountered: