Skip to content

Commit

Permalink
LSMR further updates (#111)
Browse files Browse the repository at this point in the history
* bring LSMR implementation of #46 up to date

* formatting

* use VectorInterface and update tests

* add docs for LSMR

* add newline at end of `lsmr.jl` to pass formatting check

* update LSMR implementation

* fix docs, tests and formatting

* actually add docs page

---------

Co-authored-by: victor <[email protected]>
  • Loading branch information
Jutho and VictorVanthilt authored Jan 6, 2025
1 parent 4d2a06f commit 478db80
Show file tree
Hide file tree
Showing 15 changed files with 418 additions and 36 deletions.
2 changes: 1 addition & 1 deletion Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ Printf = "1"
Random = "1"
Test = "1"
TestExtras = "0.2,0.3"
VectorInterface = "0.4"
VectorInterface = "0.4,0.5"
Zygote = "0.6"
julia = "1.6"

Expand Down
1 change: 1 addition & 0 deletions docs/make.jl
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ makedocs(; modules=[KrylovKit],
pages=["Home" => "index.md",
"Manual" => ["man/intro.md",
"man/linear.md",
"man/leastsquares.md",
"man/eig.md",
"man/svd.md",
"man/matfun.md",
Expand Down
3 changes: 2 additions & 1 deletion docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ objects with vector like behavior (see below) as vectors.

The high level interface of KrylovKit is provided by the following functions:
* [`linsolve`](@ref): solve linear systems `A*x = b`
* [`lssolve`](@ref): solve least square problems `A*x ≈ b`
* [`eigsolve`](@ref): find a few eigenvalues and corresponding eigenvectors of an
eigenvalue problem `A*x = λ x`
* [`geneigsolve`](@ref): find a few eigenvalues and corresponding vectors of a
Expand Down Expand Up @@ -113,12 +114,12 @@ Here follows a wish list / to-do list for the future. Any help is welcomed and a

* More algorithms, including biorthogonal methods:
- for `linsolve`: L-GMRES, MINRES, BiCG, IDR(s), ...
- for `lssolve`: LSQR, ...
- for `eigsolve`: BiLanczos, Jacobi-Davidson JDQR/JDQZ, subspace iteration (?), ...
- for `geneigsolve`: trace minimization, ...
* Support both in-place / mutating and out-of-place functions as linear maps
* Reuse memory for storing vectors when restarting algorithms (related to previous)
* Support non-BLAS scalar types using GeneralLinearAlgebra.jl and GeneralSchur.jl
* Least square problems
* Nonlinear eigenvalue problems
* Preconditioners
* Refined Ritz vectors, Harmonic Ritz values and vectors
Expand Down
1 change: 1 addition & 0 deletions docs/src/man/algorithms.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ KrylovKit.MINRES
GMRES
KrylovKit.BiCG
BiCGStab
LSMR
```
## Specific algorithms for generalized eigenvalue problems
```@docs
Expand Down
10 changes: 10 additions & 0 deletions docs/src/man/leastsquares.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# Least squares problems

Least square problems take the form of finding `x` that minimises `norm(b - A*x)` where
`A` should be a linear map. As opposed to linear systems, the input and output of the linear
map do not need to be the same, so that `x` (input) and `b` (output) can live in different
vector spaces. Such problems can be solved using the function `lssolve`:

```@docs
lssolve
```
8 changes: 6 additions & 2 deletions src/KrylovKit.jl
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ using Random
using PackageExtensionCompat
const IndexRange = AbstractRange{Int}

export linsolve, reallinsolve
export linsolve, reallinsolve, lssolve
export eigsolve, geneigsolve, realeigsolve, schursolve, svdsolve
export exponentiate, expintegrator
export orthogonalize, orthogonalize!!, orthonormalize, orthonormalize!!
Expand All @@ -37,7 +37,7 @@ export initialize, initialize!, expand!, shrink!
export ClassicalGramSchmidt, ClassicalGramSchmidt2, ClassicalGramSchmidtIR
export ModifiedGramSchmidt, ModifiedGramSchmidt2, ModifiedGramSchmidtIR
export LanczosIterator, ArnoldiIterator, GKLIterator
export CG, GMRES, BiCGStab, Lanczos, Arnoldi, GKL, GolubYe
export CG, GMRES, BiCGStab, Lanczos, Arnoldi, GKL, GolubYe, LSMR
export KrylovDefaults, EigSorter
export RecursiveVec, InnerProductVec

Expand Down Expand Up @@ -236,6 +236,10 @@ include("linsolve/cg.jl")
include("linsolve/gmres.jl")
include("linsolve/bicgstab.jl")

# lssolve
include("lssolve/lssolve.jl")
include("lssolve/lsmr.jl")

# eigsolve and svdsolve
include("eigsolve/eigsolve.jl")
include("eigsolve/lanczos.jl")
Expand Down
74 changes: 52 additions & 22 deletions src/algorithms.jl
Original file line number Diff line number Diff line change
Expand Up @@ -84,16 +84,16 @@ abstract type KrylovAlgorithm end

# General purpose; good for linear systems, eigensystems and matrix functions
"""
Lanczos(; krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter,
tol = KrylovDefaults.tol, orth = KrylovDefaults.orth, eager = false, verbosity = 0)
Lanczos(; krylovdim=KrylovDefaults.krylovdim, maxiter=KrylovDefaults.maxiter,
tol=KrylovDefaults.tol, orth=KrylovDefaults.orth, eager=false, verbosity=0)
Represents the Lanczos algorithm for building the Krylov subspace; assumes the linear
operator is real symmetric or complex Hermitian. Can be used in `eigsolve` and
`exponentiate`. The corresponding algorithms will build a Krylov subspace of size at most
`krylovdim`, which will be repeated at most `maxiter` times and will stop when the norm of
the residual of the Lanczos factorization is smaller than `tol`. The orthogonalizer `orth`
will be used to orthogonalize the different Krylov vectors. Eager mode, as selected by
`eager = true`, means that the algorithm that uses this Lanczos process (e.g. `eigsolve`)
`eager=true`, means that the algorithm that uses this Lanczos process (e.g. `eigsolve`)
can try to finish its computation before the total Krylov subspace of dimension `krylovdim`
is constructed. Default verbosity level `verbosity` is zero, meaning that no output will be
printed.
Expand Down Expand Up @@ -121,8 +121,8 @@ function Lanczos(;
end

"""
GKL(; krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter,
tol = KrylovDefaults.tol, orth = KrylovDefaults.orth, verbosity = 0)
GKL(; krylovdim=KrylovDefaults.krylovdim, maxiter=KrylovDefaults.maxiter,
tol=KrylovDefaults.tol, orth=KrylovDefaults.orth, verbosity=0)
Represents the Golub-Kahan-Lanczos bidiagonalization algorithm for sequentially building a
Krylov-like factorization of a general matrix or linear operator with a bidiagonal reduced
Expand Down Expand Up @@ -153,15 +153,15 @@ function GKL(;
end

"""
Arnoldi(; krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter,
tol = KrylovDefaults.tol, orth = KrylovDefaults.orth, eager = false, verbosity = 0)
Arnoldi(; krylovdim=KrylovDefaults.krylovdim, maxiter=KrylovDefaults.maxiter,
tol=KrylovDefaults.tol, orth=KrylovDefaults.orth, eager=false, verbosity=0)
Represents the Arnoldi algorithm for building the Krylov subspace for a general matrix or
linear operator. Can be used in `eigsolve` and `exponentiate`. The corresponding algorithms
will build a Krylov subspace of size at most `krylovdim`, which will be repeated at most
`maxiter` times and will stop when the norm of the residual of the Arnoldi factorization is
smaller than `tol`. The orthogonalizer `orth` will be used to orthogonalize the different
Krylov vectors. Eager mode, as selected by `eager = true`, means that the algorithm that
Krylov vectors. Eager mode, as selected by `eager=true`, means that the algorithm that
uses this Arnoldi process (e.g. `eigsolve`) can try to finish its computation before the
total Krylov subspace of dimension `krylovdim` is constructed. Default verbosity level
`verbosity` is zero, meaning that no output will be printed.
Expand Down Expand Up @@ -190,16 +190,16 @@ function Arnoldi(;
end

"""
GolubYe(; krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter,
tol = KrylovDefaults.tol, orth = KrylovDefaults.orth, verbosity = 0)
GolubYe(; krylovdim=KrylovDefaults.krylovdim, maxiter=KrylovDefaults.maxiter,
tol=KrylovDefaults.tol, orth=KrylovDefaults.orth, verbosity=0)
Represents the Golub-Ye algorithm for solving hermitian (symmetric) generalized eigenvalue
problems `A x = λ B x` with positive definite `B`, without the need for inverting `B`.
Builds a Krylov subspace of size `krylovdim` starting from an estimate `x` by acting with
`(A - ρ(x) B)`, where `ρ(x) = dot(x, A*x)/dot(x, B*x)`, and employing the Lanczos
algorithm. This process is repeated at most `maxiter` times. In every iteration `k>1`, the
subspace will also be expanded to size `krylovdim+1` by adding ``x_k - x_{k-1}``, which is
known as the LOPCG correction and was suggested by Money and Ye. With `krylovdim = 2`, this
known as the LOPCG correction and was suggested by Money and Ye. With `krylovdim=2`, this
algorithm becomes equivalent to `LOPCG`.
"""
struct GolubYe{O<:Orthogonalizer,S<:Real} <: KrylovAlgorithm
Expand All @@ -222,7 +222,7 @@ end
abstract type LinearSolver <: KrylovAlgorithm end

"""
CG(; maxiter = KrylovDefaults.maxiter, tol = KrylovDefaults.tol)
CG(; maxiter=KrylovDefaults.maxiter, tol=KrylovDefaults.tol, verbosity=0)
Construct an instance of the conjugate gradient algorithm with specified parameters, which
can be passed to `linsolve` in order to iteratively solve a linear system with a positive
Expand All @@ -231,7 +231,7 @@ will search for the optimal `x` in a Krylov subspace of maximal size `maxiter`,
`norm(A*x - b) < tol`. Default verbosity level `verbosity` is zero, meaning that no output
will be printed.
See also: [`linsolve`](@ref), [`MINRES`](@ref), [`GMRES`](@ref), [`BiCG`](@ref),
See also: [`linsolve`](@ref), [`MINRES`](@ref), [`GMRES`](@ref), [`BiCG`](@ref), [`LSMR`](@ref),
[`BiCGStab`](@ref)
"""
struct CG{S<:Real} <: LinearSolver
Expand All @@ -247,8 +247,8 @@ function CG(;
end

"""
GMRES(; krylovdim = KrylovDefaults.krylovdim, maxiter = KrylovDefaults.maxiter,
tol = KrylovDefaults.tol, orth::Orthogonalizer = KrylovDefaults.orth)
GMRES(; krylovdim=KrylovDefaults.krylovdim, maxiter=KrylovDefaults.maxiter,
tol=KrylovDefaults.tol, orth::Orthogonalizer=KrylovDefaults.orth)
Construct an instance of the GMRES algorithm with specified parameters, which can be passed
to `linsolve` in order to iteratively solve a linear system. The `GMRES` method will search
Expand All @@ -262,7 +262,7 @@ to as the restart parameter, and `maxiter` is the number of outer iterations, i.
cycles. The total iteration count, i.e. the number of expansion steps, is roughly
`krylovdim` times the number of iterations.
See also: [`linsolve`](@ref), [`BiCG`](@ref), [`BiCGStab`](@ref), [`CG`](@ref),
See also: [`linsolve`](@ref), [`BiCG`](@ref), [`BiCGStab`](@ref), [`CG`](@ref), [`LSMR`](@ref),
[`MINRES`](@ref)
"""
struct GMRES{O<:Orthogonalizer,S<:Real} <: LinearSolver
Expand All @@ -283,7 +283,7 @@ end

# TODO
"""
MINRES(; maxiter = KrylovDefaults.maxiter, tol = KrylovDefaults.tol)
MINRES(; maxiter=KrylovDefaults.maxiter, tol=KrylovDefaults.tol)
!!! warning "Not implemented yet"
Expand All @@ -295,7 +295,7 @@ end
orthogonalizer `orth`. Default verbosity level `verbosity` is zero, meaning that no
output will be printed.
See also: [`linsolve`](@ref), [`CG`](@ref), [`GMRES`](@ref), [`BiCG`](@ref),
See also: [`linsolve`](@ref), [`CG`](@ref), [`GMRES`](@ref), [`BiCG`](@ref), [`LSMR`](@ref),
[`BiCGStab`](@ref)
"""
struct MINRES{S<:Real} <: LinearSolver
Expand All @@ -311,7 +311,7 @@ function MINRES(;
end

"""
BiCG(; maxiter = KrylovDefaults.maxiter, tol = KrylovDefaults.tol)
BiCG(; maxiter=KrylovDefaults.maxiter, tol=KrylovDefaults.tol)
!!! warning "Not implemented yet"
Expand All @@ -322,7 +322,7 @@ end
b) < tol`. Default verbosity level `verbosity` is zero, meaning that no output will be
printed.
See also: [`linsolve`](@ref), [`GMRES`](@ref), [`CG`](@ref), [`BiCGStab`](@ref),
See also: [`linsolve`](@ref), [`GMRES`](@ref), [`CG`](@ref), [`BiCGStab`](@ref), [`LSMR`](@ref),
[`MINRES`](@ref)
"""
struct BiCG{S<:Real} <: LinearSolver
Expand All @@ -338,15 +338,15 @@ function BiCG(;
end

"""
BiCGStab(; maxiter = KrylovDefaults.maxiter, tol = KrylovDefaults.tol)
BiCGStab(; maxiter=KrylovDefaults.maxiter, tol=KrylovDefaults.tol)
Construct an instance of the Biconjugate gradient algorithm with specified parameters,
which can be passed to `linsolve` in order to iteratively solve a linear system general
linear map. The `BiCGStab` method will search for the optimal `x` in a Krylov subspace
of maximal size `maxiter`, or stop when `norm(A*x - b) < tol`. Default verbosity level
`verbosity` is zero, meaning that no output will be printed.
See also: [`linsolve`](@ref), [`GMRES`](@ref), [`CG`](@ref), [`BiCG`](@ref),
See also: [`linsolve`](@ref), [`GMRES`](@ref), [`CG`](@ref), [`BiCG`](@ref), [`LSMR`](@ref),
[`MINRES`](@ref)
"""
struct BiCGStab{S<:Real} <: LinearSolver
Expand All @@ -361,6 +361,36 @@ function BiCGStab(;
return BiCGStab(maxiter, tol, verbosity)
end

# Solving least squares problems
abstract type LeastSquaresSolver <: KrylovAlgorithm end
"""
LSMR(; maxiter=KrylovDefaults.maxiter, tol=KrylovDefaults.tol, verbosity=0)
Represents the LSMR algorithm, which minimizes ``\\|Ax - b\\|^2 + \\|λx\\|^2`` in the Euclidean norm.
If multiple solutions exists the minimum norm solution is returned.
The method is based on the Golub-Kahan bidiagonalization process. It is
algebraically equivalent to applying MINRES to the normal equations
``(A^*A + λ^2I)x = A^*b``, but has better numerical properties,
especially if ``A`` is ill-conditioned.
The `LSMR` method will search for the optimal ``x`` in a Krylov subspace of maximal size
`maxiter`, or stop when ``norm(A'*(A*x - b) + λ^2 * x) < tol``. Default verbosity level
`verbosity` is zero, meaning that no output will be printed.
See also: [`lssolve`](@ref)
"""
struct LSMR{S<:Real} <: LeastSquaresSolver
maxiter::Int
tol::S
verbosity::Int
end
function LSMR(;
maxiter::Integer=KrylovDefaults.maxiter,
tol::Real=KrylovDefaults.tol,
verbosity::Int=0)
return LSMR(maxiter, tol, verbosity)
end

# Solving eigenvalue systems specifically
abstract type EigenSolver <: KrylovAlgorithm end

Expand Down
4 changes: 2 additions & 2 deletions src/eigsolve/arnoldi.jl
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ currently available.
The return value is always of the form `T, vecs, vals, info = schursolve(...)` with
- `T`: a `Matrix` containing the partial Schur decomposition of the linear map, i.e. it's
elements are given by `T[i,j] = dot(vecs[i], f(vecs[j]))`. It is of Schur form, i.e.
elements are given by `T[i,j] = inner(vecs[i], f(vecs[j]))`. It is of Schur form, i.e.
upper triangular in case of complex arithmetic, and block upper triangular (with at most
2x2 blocks) in case of real arithmetic.
- `vecs`: a `Vector` of corresponding Schur vectors, of the same length as `vals`. Note
Expand All @@ -67,7 +67,7 @@ The return value is always of the form `T, vecs, vals, info = schursolve(...)` w
objects that are typically similar to the starting guess `x₀`, up to a possibly
different `eltype`. When the linear map is a simple `AbstractMatrix`, `vecs` will be
`Vector{Vector{<:Number}}`. Schur vectors are by definition orthogonal, i.e.
`dot(vecs[i],vecs[j]) = I[i,j]`. Note that Schur vectors are real if the problem (i.e.
`inner(vecs[i],vecs[j]) = I[i,j]`. Note that Schur vectors are real if the problem (i.e.
the linear map and the initial guess) are real.
- `vals`: a `Vector` of eigenvalues, i.e. the diagonal elements of `T` in case of complex
arithmetic, or extracted from the diagonal blocks in case of real arithmetic. Note that
Expand Down
21 changes: 18 additions & 3 deletions src/linsolve/cg.jl
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,14 @@ function linsolve(operator, b, x₀, alg::CG, a₀::Real=0, a₁::Real=1; alg_rr
numiter = 0

# Check for early return
normr < tol && return (x, ConvergenceInfo(1, r, normr, numiter, numops))
if normr < tol
if alg.verbosity > 0
@info """CG linsolve converged without any iterations:
* norm of residual = $normr
* number of operations = 1"""
end
return (x, ConvergenceInfo(1, r, normr, numiter, numops))
end

# First iteration
ρ = normr^2
Expand All @@ -34,6 +41,14 @@ function linsolve(operator, b, x₀, alg::CG, a₀::Real=0, a₁::Real=1; alg_rr
β = ρ / ρold
numops += 1
numiter += 1
if normr < tol
if alg.verbosity > 0
@info """CG linsolve converged at iteration $numiter:
* norm of residual = $normr
* number of operations = $numops"""
end
return (x, ConvergenceInfo(1, r, normr, numiter, numops))
end
if alg.verbosity > 1
msg = "CG linsolve in iter $numiter: "
msg *= "normres = "
Expand Down Expand Up @@ -62,6 +77,8 @@ function linsolve(operator, b, x₀, alg::CG, a₀::Real=0, a₁::Real=1; alg_rr
ρ = normr^2
β = ρ / ρold
end
numops += 1
numiter += 1
if normr < tol
if alg.verbosity > 0
@info """CG linsolve converged at iteration $numiter:
Expand All @@ -70,8 +87,6 @@ function linsolve(operator, b, x₀, alg::CG, a₀::Real=0, a₁::Real=1; alg_rr
end
return (x, ConvergenceInfo(1, r, normr, numiter, numops))
end
numops += 1
numiter += 1
if alg.verbosity > 1
msg = "CG linsolve in iter $numiter: "
msg *= "normres = "
Expand Down
9 changes: 5 additions & 4 deletions src/linsolve/gmres.jl
Original file line number Diff line number Diff line change
Expand Up @@ -32,10 +32,11 @@ function linsolve(operator, b, x₀, alg::GMRES, a₀::Number=0, a₁::Number=1;
gs = Vector{Givens{T}}(undef, krylovdim)
R = fill(zero(T), (krylovdim, krylovdim))
numiter = 0
numops = 1 # operator has been applied once to determine T
numops = 1 # operator has been applied once to determine T and r

iter = ArnoldiIterator(operator, r, alg.orth)
fact = initialize(iter)
fact = initialize(iter; verbosity=alg.verbosity - 2)
sizehint!(fact, alg.krylovdim)
numops += 1 # start applies operator once

while numiter < maxiter # restart loop
Expand All @@ -56,7 +57,7 @@ function linsolve(operator, b, x₀, alg::GMRES, a₀::Number=0, a₁::Number=1;
end

while> tol && length(fact) < krylovdim) # inner arnoldi loop
fact = expand!(iter, fact)
fact = expand!(iter, fact; verbosity=alg.verbosity - 2)
numops += 1 # expand! applies the operator once
k = length(fact)
H = rayleighquotient(fact)
Expand Down Expand Up @@ -133,7 +134,7 @@ function linsolve(operator, b, x₀, alg::GMRES, a₀::Number=0, a₁::Number=1;

# Restart Arnoldi factorization with new r
iter = ArnoldiIterator(operator, r, alg.orth)
fact = initialize!(iter, fact)
fact = initialize!(iter, fact; verbosity=alg.verbosity - 2)
end

if alg.verbosity > 0
Expand Down
2 changes: 1 addition & 1 deletion src/linsolve/linsolve.jl
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ efficiently. Check the documentation for more information on the possible values
The final (expert) method, without default values and keyword arguments, is the one that is
finally called, and can also be used directly. Here, one specifies the algorithm explicitly.
Currently, only [`CG`](@ref), [`GMRES`](@ref) and [`BiCGStab`](@ref) are implemented, where
Currently, only [`CG`](@ref), [`GMRES`](@ref), [`BiCGStab`](@ref) and [`LSMR`](@ref) are implemented, where
`CG` is chosen if `isposdef == true` and `GMRES` is chosen otherwise. Note that in standard
`GMRES` terminology, our parameter `krylovdim` is referred to as the *restart* parameter,
and our `maxiter` parameter counts the number of outer iterations, i.e. restart cycles. In
Expand Down
Loading

0 comments on commit 478db80

Please sign in to comment.