Add examples

JuliaDiff · Feb 21, 2024 · 4f7b477 · 4f7b477
1 parent 1ca63ca
commit 4f7b477
Show file tree

Hide file tree

Showing 3 changed files with 416 additions and 0 deletions.
diff --git a/examples/halley.ipynb b/examples/halley.ipynb
@@ -0,0 +1,264 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# The Jacobian- and Hessian-free Halley's Method\n",
+    "\n",
+    "Say we have a system of $n$ equations with $n$ unknowns\n",
+    "\n",
+    "$$\n",
+    "f(x)=0\n",
+    "$$\n",
+    "\n",
+    "and $f\\in \\mathbb R^n\\to\\mathbb R^n$ is sufficiently smooth.\n",
+    "\n",
+    "Given a initial guess $x_0$, Halley's method specifies a series of points approximating the solution, where each iteration is\n",
+    "\n",
+    "$$\n",
+    "x^{(i+1)}=x^{(i)}+\\frac{a^{(i)}a^{(i)}}{a^{(i)}+b^{(i)}/2}\n",
+    "$$\n",
+    "\n",
+    "where the vector multiplication and division $ab, a/b$ is defined in Banach algebra, and the vectors $a^{(i)}, b^{(i)}$ are defined as\n",
+    "\n",
+    "$$\n",
+    "J(x^{(i)})a^{(i)} = -f(x^{(i)})\n",
+    "$$\n",
+    "\n",
+    "and\n",
+    "\n",
+    "$$\n",
+    "J(x^{(i)})b^{(i)} = H(x^{(i)})a^{(i)}a^{(i)}\n",
+    "$$\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The full Jacobian (a matrix) and full Hessian (a 3-tensor) representation can be avoided by using forward-mode automatic differentiation. It is well known that a forward evaluation on a dual number $(x, v)$ gives the Jacobian-vector product,\n",
+    "\n",
+    "$$\n",
+    "f(x,v)=(f(x),Jv)\n",
+    "$$\n",
+    "\n",
+    "and similarly a forward evaluation on a second order Taylor expansion gives the Hessian-vector-vector product,\n",
+    "\n",
+    "$$\n",
+    "f(x,v,0)=f(x,Jv,Hvv)\n",
+    "$$\n",
+    "\n",
+    "Below, we demonstrate this possibility with TaylorDiff.jl."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Jacobian-free Newton Krylov\n",
+    "\n",
+    "To get started we first get familiar with the JFNK:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "newton (generic function with 1 method)"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "# The Jacobi- and Hessian-free Halley method for solving nonlinear equations\n",
+    "\n",
+    "using TaylorDiff\n",
+    "using LinearAlgebra\n",
+    "using LinearSolve\n",
+    "\n",
+    "function newton(f, x0, p; tol=1e-10, maxiter=100)\n",
+    "    x = x0\n",
+    "    for i in 1:maxiter\n",
+    "        fx = f(x, p)\n",
+    "        error = norm(fx)\n",
+    "        println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
+    "        if error < tol\n",
+    "            return x\n",
+    "        end\n",
+    "        get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
+    "        operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
+    "        problem = LinearProblem(operator, -fx)\n",
+    "        sol = solve(problem, KrylovJL_GMRES())\n",
+    "        x += sol.u\n",
+    "    end\n",
+    "    return x\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Jacobian- and Hessian-free Halley\n",
+    "\n",
+    "This naturally follows, only difference is replacing the rhs by Hessian-vector-vector product:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "halley (generic function with 1 method)"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "function halley(f, x0, p; tol=1e-10, maxiter=100)\n",
+    "    x = x0\n",
+    "    for i in 1:maxiter\n",
+    "        fx = f(x, p)\n",
+    "        error = norm(fx)\n",
+    "        println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
+    "        if error < tol\n",
+    "            return x\n",
+    "        end\n",
+    "        get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
+    "        operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
+    "        problem = LinearProblem(operator, -fx)\n",
+    "        a = solve(problem, KrylovJL_GMRES()).u\n",
+    "        Haa = derivative(x -> f(x, p), x, a, 2)\n",
+    "        problem2 = LinearProblem(operator, Haa)\n",
+    "        b = solve(problem2, KrylovJL_GMRES()).u\n",
+    "        x += (a .* a) ./ (a .+ b ./ 2)\n",
+    "    end\n",
+    "    return x\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "f (generic function with 1 method)"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "# Testing with simple examples:\n",
+    "\n",
+    "f(x, p) = x .* x - p"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Iteration 2: x = [1.5, 1.5], f(x) = [0.25, 0.25], error = 0.3535533905932738\n",
+      "Iteration 3: x = [1.4166666666666667, 1.4166666666666667], f(x) = [0.006944444444444642, 0.006944444444444642], error = 0.009820927516480105\n",
+      "Iteration 4: x = [1.4142156862745099, 1.4142156862745099], f(x) = [6.007304882871267e-6, 6.007304882871267e-6], error = 8.495612038666664e-6\n",
+      "Iteration 5: x = [1.4142135623746899, 1.4142135623746899], f(x) = [4.510614104447086e-12, 4.510614104447086e-12], error = 6.378971641140442e-12\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "2-element Vector{Float64}:\n",
+       " 1.4142135623746899\n",
+       " 1.4142135623746899"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "newton(f, [1., 1.], [2., 2.])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Iteration 2: x = [1.4000000000000001, 1.4000000000000001], f(x) = [-0.03999999999999959, -0.03999999999999959], error = 0.05656854249492323\n",
+      "Iteration 3: x = [1.4142131979695431, 1.4142131979695431], f(x) = [-1.0306887576749801e-6, -1.0306887576749801e-6], error = 1.4576140196894333e-6\n",
+      "Iteration 4: x = [1.414213562373142, 1.414213562373142], f(x) = [1.3278267374516872e-13, 1.3278267374516872e-13], error = 1.877830580585795e-13\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "2-element Vector{Float64}:\n",
+       " 1.414213562373142\n",
+       " 1.414213562373142"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "halley(f, [1., 1.], [2., 2.])"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Julia 1.10.1",
+   "language": "julia",
+   "name": "julia-1.10"
+  },
+  "language_info": {
+   "file_extension": ".jl",
+   "mimetype": "application/julia",
+   "name": "julia",
+   "version": "1.10.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/examples/halley.jl b/examples/halley.jl
@@ -0,0 +1,46 @@
+# The Jacobi- and Hessian-free Halley method for solving nonlinear equations
+
+using TaylorDiff
+using LinearAlgebra
+using LinearSolve
+
+function newton(f, x0, p; tol=1e-10, maxiter=100)
+    x = x0
+    for i in 1:maxiter
+        fx = f(x, p)
+        error = norm(fx)
+        println("Iteration $i: x = $x, f(x) = $fx, error = $error")
+        if error < tol
+            return x
+        end
+        get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
+        operator = FunctionOperator(get_derivative, similar(x), similar(x))
+        problem = LinearProblem(operator, -fx)
+        sol = solve(problem, KrylovJL_GMRES())
+        x += sol.u
+    end
+    return x
+end
+
+function halley(f, x0, p; tol=1e-10, maxiter=100)
+    x = x0
+    for i in 1:maxiter
+        fx = f(x, p)
+        error = norm(fx)
+        println("Iteration $i: x = $x, f(x) = $fx, error = $error")
+        if error < tol
+            return x
+        end
+        get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
+        operator = FunctionOperator(get_derivative, similar(x), similar(x))
+        problem = LinearProblem(operator, -fx)
+        a = solve(problem, KrylovJL_GMRES()).u
+        Haa = derivative(x -> f(x, p), x, a, 2)
+        problem2 = LinearProblem(operator, Haa)
+        b = solve(problem2, KrylovJL_GMRES()).u
+        x += (a .* a) ./ (a .+ b ./ 2)
+    end
+    return x
+end
+
+f(x, p) = x .* x - p