Fix typos

adityam · Sep 25, 2023 · 5977142
1 parent 84138b2
commit 5977142
Showing 2 changed files with 28 additions and 11 deletions.
diff --git a/references.bib b/references.bib
@@ -2124,4 +2124,24 @@ @Book{Borkar2008
   doi       = {10.1007/978-93-86279-38-5},
 }
 
+@Book{Arthur1994,
+  title     = {Increasing Returns and Path Dependence in the Economy},
+  publisher = {University of Michigan Press},
+  year      = {1994},
+  author    = {Arthur, W. Brian},
+  doi       = {10.3998/mpub.10029},
+}
+
+@Article{Lai2003,
+  author    = {Tze Leung Lai},
+  title     = {Stochastic approximation: invited paper},
+  journal   = {The Annals of Statistics},
+  year      = {2003},
+  volume    = {31},
+  number    = {2},
+  month     = {apr},
+  doi       = {10.1214/aos/1051027873},
+  publisher = {Institute of Mathematical Statistics},
+}
+
 @Comment{jabref-meta: databaseType:bibtex;}
diff --git a/rl/stochastic-approximation.qmd b/rl/stochastic-approximation.qmd
@@ -5,7 +5,7 @@ keywords:
   - stochastic approximation
 ---
 
-Suppose $f \colon \reals^d \to \reals^d$ and it is desired to fina a solution $θ^*$ to the equation $f(θ) = 0$. There are many methods for determining the value of
+Suppose $f \colon \reals^d \to \reals^d$ and it is desired to find a solution $θ^*$ to the equation $f(θ) = 0$. There are many methods for determining the value of
 $θ$ by successive approximation where we start with an initial guess $θ_0$ and
 then recursively obtain a new value $θ_{t+1}$ as a function of the previously
 obtained $θ_0, \dots, θ_{k}$, the values $f(θ_1), \dots, f(θ_{t})$, and
@@ -48,7 +48,6 @@ $$
   θ_{t+1} = θ_t + \frac{1}{t+1}\bigl[ (p(θ_t) - θ_t ) + (w_{t+1} - p(θ_t)) \bigr]
 $$
 Define $ξ_{t+1} = w_{t+1} - p(θ_t)$. Note that $\{ξ_t\}_{t \ge 1}$ is Martingale difference sequence, i.e., $\EXP[ ξ_{t+1} \mid θ_0, w_{1:t} ] = 0$. Thus the above equation is of the form \\eqref{eq:SA} with $f(θ_t) = p(θ_t) - θ_t$. 
-
 :::
 
 The key idea behind stochastic approximation is that under appropriate
@@ -57,7 +56,7 @@ equilibrium point of the ODE
 $$ \begin{equation} \label{eq:ODE}
   \dot θ(t) = f(θ(t))
 \end{equation} $$
-with initial conditions $θ(0) = θ_0$. For instance, for @exm-urn-model, this means that under appropriate conditions, the discrete-time iterates $\{θ_t\}_{t \ge 0}$ converge to the solution of the ODE \\eqref{eq:ODE}.
+with initial conditions $θ(0) = θ_0$. For instance, for @exm-urn-model, this means that under appropriate conditions, the discrete-time iterates $\{θ_t\}_{t \ge 0}$ converge to the solution of the ODE \\eqref{eq:ODE}. In particular, they would converge to the equilibrium set $H = \{ θ : p(θ) = θ \}$. Suppose that $p(θ)$ is such that there exists a $θ_\circ$ such that $p(θ) > θ$ for $θ \in (θ_\circ, 1)$ and $p(θ) < θ$ for $θ \in (0, θ_\circ)$. Then, the set of equilibrium points are $H = \{0, θ_\circ, 1\}$. Out of these $\{0, 1\}$ are stable and $θ_\circ$ is unstable. The stochastic approximation theory shows that the iterations \eqref{eq:SA} will converge to either $0$ or $1$. Thus, along each sample path, the iterates $\{θ_t\}_{t \ge 1}$ will be 'locked into' one color which will dominate. 
 
 In this section, we summarize these conditions (without proofs). 
 
@@ -177,7 +176,7 @@ From \\eqref{eq:SA}, we get
 where $(a)$ uses (N1)
 ::: -->
 
-## Borkar-Meyn's result
+## Borkar-Meyn's result {#sec-borkar-meyn}
 
 The following is a restatement of the result of @Borkar2000.
 
@@ -327,7 +326,7 @@ Assumption (F1) and (F3) implies that
 $$
 \NORM{f(θ_t)}_2^2 = \NORM{f(θ_t) - f(θ^*)}_2^2 \le L^2 \NORM{θ_t - θ^*}_2^2.
 $$
-Subsituting in the above bound, we get:
+Substituting in the above bound, we get:
 \begin{equation}\label{eq:vidyasagar-1-pf-step-1}
 \EXP[V(θ_{t+1}) \mid \ALPHABET F_t] \le V(θ_t) 
 + α_t \dot V(θ_t)
@@ -391,13 +390,13 @@ $$
 \sum_{t \ge T} α_t \phi(\NORM{θ_t - θ^*}_2) \ge
 \sum_{t \ge T} α_t δ = ∞,
 $$
-due to (R2). But this contraducts \\eqref{eq:vidyasagar-1-pf-step-2}. Hence, there is no $ω \in Ω_1$ such that $ζ(ω) > 0$. Therefore, $ζ = 0$ almost surely, i.e., $V(θ_t) \to 0$ almost surely. Finally, it follows from \\eqref{eq:vidyasagar-cond-1} that $θ_t \to θ^*$ almost surely as $t \to ∞$. 
+due to (R2). But this contradicts \\eqref{eq:vidyasagar-1-pf-step-2}. Hence, there is no $ω \in Ω_1$ such that $ζ(ω) > 0$. Therefore, $ζ = 0$ almost surely, i.e., $V(θ_t) \to 0$ almost surely. Finally, it follows from \\eqref{eq:vidyasagar-cond-1} that $θ_t \to θ^*$ almost surely as $t \to ∞$. 
 :::
 
 
 @thm-vidyasagar-1 requires the existence of a suitable Lyapunov function that satisfies various conditions. Verifying whether or not such a function exists can be a bottleneck. 
 
-If can be shown (see Theorem 4 of @Vidyasagar2023) that the conditions on $V$ in @thm-vidyasagar-1 ensure that the equilibrium $θ^*$ of the ODE \\eqref{eq:ODE} is globally asymptotically stable. By strenghtening this assumption to global _exponential_ stability of $θ^*$ and adding a few other conditions, it is possible to establish a "converse" Lyapunov theorem that establishes the existence of such a $V$. This is done below.
+If can be shown (see Theorem 4 of @Vidyasagar2023) that the conditions on $V$ in @thm-vidyasagar-1 ensure that the equilibrium $θ^*$ of the ODE \\eqref{eq:ODE} is globally asymptotically stable. By strengthening this assumption to global _exponential_ stability of $θ^*$ and adding a few other conditions, it is possible to establish a "converse" Lyapunov theorem that establishes the existence of such a $V$. This is done below.
 
 :::{#thm-vidyasagar-2} 
 Suppose assumptions (F1'), (F2'), (F3) and (F4) hold. Then, there exists a twice differentiable function $V \colon \reals^d \to \reals_{\ge 0}$ such that $V$ and its derivative $\dot V \colon \reals^d \to \reals_{\ge 0}$ defined as $\dot V(θ) \coloneqq \langle \langle \GRAD V(θ), f(θ) \rangle$ together satisfy the following conditions: there exist positive constants $a$, $b$, $c$, and a finite constant $M$ such that for all $θ \in \reals^d$:
@@ -418,10 +417,8 @@ Suppose assumptions (F1'), (F2'), (F3), and (F4) as well as assumptions (N1) and
 
 ## Notes {-}
 
-The stochastic approximation algorithm was introduced by @Robbins1951. 
+The stochastic approximation algorithm was introduced by @Robbins1951. See @Lai2003 for a historical overview.
 
-@exm-urn-model is from @Borkar2008.
+@exm-urn-model is borrowed from @Borkar2008, who points out that it was proposed by @Arthur1994 to model the phenomenon of decreasing returns in economics.
 
 The material in this section is adapted from @Vidyasagar2023.
-
-
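
The lock-in behaviour described in the paragraph added to `rl/stochastic-approximation.qmd` can be checked numerically. The following is a minimal sketch and not part of the commit. It assumes a hypothetical choice $p(θ) = 3θ^2 - 2θ^3$, for which $p(θ) - θ = θ(2θ - 1)(1 - θ)$ and hence $θ_\circ = 1/2$. It also assumes a hypothetical offset `n0` in the step size (standing in for an initial number of balls in the urn) so that the first update does not immediately jump to $0$ or $1$.

```python
import random


def p(theta):
    # Hypothetical choice (an assumption, not from the source):
    # p(theta) - theta = theta * (2*theta - 1) * (1 - theta),
    # so p(theta) > theta on (1/2, 1) and p(theta) < theta on (0, 1/2).
    return 3 * theta**2 - 2 * theta**3


def simulate(theta0, steps, seed, n0=10):
    """Run theta_{t+1} = theta_t + (w_{t+1} - theta_t) / (n0 + t + 1).

    w_{t+1} is 1 with probability p(theta_t) and 0 otherwise.
    n0 is a hypothetical offset standing in for the initial ball count.
    """
    rng = random.Random(seed)
    theta = theta0
    for t in range(steps):
        w = 1.0 if rng.random() < p(theta) else 0.0
        theta += (w - theta) / (n0 + t + 1)
    return theta


if __name__ == "__main__":
    # Print the final fraction for a few independent sample paths.
    for seed in range(5):
        print(seed, round(simulate(0.5, 20000, seed), 3))
```

Under these assumptions, one would expect most sample paths started at $θ_0 = 1/2$ to drift away from the unstable point $θ_\circ$ and end close to either $0$ or $1$, with the limit varying from seed to seed.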