Skip to content

Commit

Permalink
Add examples
Browse files Browse the repository at this point in the history
  • Loading branch information
tansongchen committed Feb 21, 2024
1 parent 1ca63ca commit 4f7b477
Show file tree
Hide file tree
Showing 3 changed files with 416 additions and 0 deletions.
264 changes: 264 additions & 0 deletions examples/halley.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,264 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Jacobian- and Hessian-free Halley's Method\n",
"\n",
"Say we have a system of $n$ equations with $n$ unknowns\n",
"\n",
"$$\n",
"f(x)=0\n",
"$$\n",
"\n",
"and $f\\in \\mathbb R^n\\to\\mathbb R^n$ is sufficiently smooth.\n",
"\n",
"Given a initial guess $x_0$, Halley's method specifies a series of points approximating the solution, where each iteration is\n",
"\n",
"$$\n",
"x^{(i+1)}=x^{(i)}+\\frac{a^{(i)}a^{(i)}}{a^{(i)}+b^{(i)}/2}\n",
"$$\n",
"\n",
"where the vector multiplication and division $ab, a/b$ is defined in Banach algebra, and the vectors $a^{(i)}, b^{(i)}$ are defined as\n",
"\n",
"$$\n",
"J(x^{(i)})a^{(i)} = -f(x^{(i)})\n",
"$$\n",
"\n",
"and\n",
"\n",
"$$\n",
"J(x^{(i)})b^{(i)} = H(x^{(i)})a^{(i)}a^{(i)}\n",
"$$\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The full Jacobian (a matrix) and full Hessian (a 3-tensor) representation can be avoided by using forward-mode automatic differentiation. It is well known that a forward evaluation on a dual number $(x, v)$ gives the Jacobian-vector product,\n",
"\n",
"$$\n",
"f(x,v)=(f(x),Jv)\n",
"$$\n",
"\n",
"and similarly a forward evaluation on a second order Taylor expansion gives the Hessian-vector-vector product,\n",
"\n",
"$$\n",
"f(x,v,0)=f(x,Jv,Hvv)\n",
"$$\n",
"\n",
"Below, we demonstrate this possibility with TaylorDiff.jl."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Jacobian-free Newton Krylov\n",
"\n",
"To get started we first get familiar with the JFNK:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"newton (generic function with 1 method)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# The Jacobi- and Hessian-free Halley method for solving nonlinear equations\n",
"\n",
"using TaylorDiff\n",
"using LinearAlgebra\n",
"using LinearSolve\n",
"\n",
"function newton(f, x0, p; tol=1e-10, maxiter=100)\n",
" x = x0\n",
" for i in 1:maxiter\n",
" fx = f(x, p)\n",
" error = norm(fx)\n",
" println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
" if error < tol\n",
" return x\n",
" end\n",
" get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
" operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
" problem = LinearProblem(operator, -fx)\n",
" sol = solve(problem, KrylovJL_GMRES())\n",
" x += sol.u\n",
" end\n",
" return x\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Jacobian- and Hessian-free Halley\n",
"\n",
"This naturally follows, only difference is replacing the rhs by Hessian-vector-vector product:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"halley (generic function with 1 method)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"function halley(f, x0, p; tol=1e-10, maxiter=100)\n",
" x = x0\n",
" for i in 1:maxiter\n",
" fx = f(x, p)\n",
" error = norm(fx)\n",
" println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
" if error < tol\n",
" return x\n",
" end\n",
" get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
" operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
" problem = LinearProblem(operator, -fx)\n",
" a = solve(problem, KrylovJL_GMRES()).u\n",
" Haa = derivative(x -> f(x, p), x, a, 2)\n",
" problem2 = LinearProblem(operator, Haa)\n",
" b = solve(problem2, KrylovJL_GMRES()).u\n",
" x += (a .* a) ./ (a .+ b ./ 2)\n",
" end\n",
" return x\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"f (generic function with 1 method)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Testing with simple examples:\n",
"\n",
"f(x, p) = x .* x - p"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 2: x = [1.5, 1.5], f(x) = [0.25, 0.25], error = 0.3535533905932738\n",
"Iteration 3: x = [1.4166666666666667, 1.4166666666666667], f(x) = [0.006944444444444642, 0.006944444444444642], error = 0.009820927516480105\n",
"Iteration 4: x = [1.4142156862745099, 1.4142156862745099], f(x) = [6.007304882871267e-6, 6.007304882871267e-6], error = 8.495612038666664e-6\n",
"Iteration 5: x = [1.4142135623746899, 1.4142135623746899], f(x) = [4.510614104447086e-12, 4.510614104447086e-12], error = 6.378971641140442e-12\n"
]
},
{
"data": {
"text/plain": [
"2-element Vector{Float64}:\n",
" 1.4142135623746899\n",
" 1.4142135623746899"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"newton(f, [1., 1.], [2., 2.])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 2: x = [1.4000000000000001, 1.4000000000000001], f(x) = [-0.03999999999999959, -0.03999999999999959], error = 0.05656854249492323\n",
"Iteration 3: x = [1.4142131979695431, 1.4142131979695431], f(x) = [-1.0306887576749801e-6, -1.0306887576749801e-6], error = 1.4576140196894333e-6\n",
"Iteration 4: x = [1.414213562373142, 1.414213562373142], f(x) = [1.3278267374516872e-13, 1.3278267374516872e-13], error = 1.877830580585795e-13\n"
]
},
{
"data": {
"text/plain": [
"2-element Vector{Float64}:\n",
" 1.414213562373142\n",
" 1.414213562373142"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"halley(f, [1., 1.], [2., 2.])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.10.1",
"language": "julia",
"name": "julia-1.10"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
46 changes: 46 additions & 0 deletions examples/halley.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# The Jacobi- and Hessian-free Halley method for solving nonlinear equations

using TaylorDiff
using LinearAlgebra
using LinearSolve

function newton(f, x0, p; tol=1e-10, maxiter=100)
x = x0
for i in 1:maxiter
fx = f(x, p)
error = norm(fx)
println("Iteration $i: x = $x, f(x) = $fx, error = $error")
if error < tol
return x
end
get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
operator = FunctionOperator(get_derivative, similar(x), similar(x))
problem = LinearProblem(operator, -fx)
sol = solve(problem, KrylovJL_GMRES())
x += sol.u
end
return x
end

function halley(f, x0, p; tol=1e-10, maxiter=100)
x = x0
for i in 1:maxiter
fx = f(x, p)
error = norm(fx)
println("Iteration $i: x = $x, f(x) = $fx, error = $error")
if error < tol
return x
end
get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
operator = FunctionOperator(get_derivative, similar(x), similar(x))
problem = LinearProblem(operator, -fx)
a = solve(problem, KrylovJL_GMRES()).u
Haa = derivative(x -> f(x, p), x, a, 2)
problem2 = LinearProblem(operator, Haa)
b = solve(problem2, KrylovJL_GMRES()).u
x += (a .* a) ./ (a .+ b ./ 2)
end
return x
end

f(x, p) = x .* x - p
Loading

0 comments on commit 4f7b477

Please sign in to comment.