Fix typos
adityam committed Sep 25, 2023
1 parent 84138b2 commit 5977142
Showing 2 changed files with 28 additions and 11 deletions.
20 changes: 20 additions & 0 deletions references.bib
@@ -2124,4 +2124,24 @@ @Book{Borkar2008
doi = {10.1007/978-93-86279-38-5},
}

@Book{Arthur1994,
title = {Increasing Returns and Path Dependence in the Economy},
publisher = {University of Michigan Press},
year = {1994},
author = {Arthur, W. Brian},
doi = {10.3998/mpub.10029},
}

@Article{Lai2003,
author = {Tze Leung Lai},
title = {Stochastic approximation: invited paper},
journal = {The Annals of Statistics},
year = {2003},
volume = {31},
number = {2},
month = {apr},
doi = {10.1214/aos/1051027873},
publisher = {Institute of Mathematical Statistics},
}

@Comment{jabref-meta: databaseType:bibtex;}
19 changes: 8 additions & 11 deletions rl/stochastic-approximation.qmd
@@ -5,7 +5,7 @@ keywords:
- stochastic approximation
---

Suppose $f \colon \reals^d \to \reals^d$ and it is desired to fina a solution $θ^*$ to the equation $f(θ) = 0$. There are many methods for determining the value of
Suppose $f \colon \reals^d \to \reals^d$ and it is desired to find a solution $θ^*$ to the equation $f(θ) = 0$. There are many methods for determining the value of
$θ$ by successive approximation where we start with an initial guess $θ_0$ and
then recursively obtain a new value $θ_{t+1}$ as a function of the previously
obtained $θ_0, \dots, θ_{t}$, the values $f(θ_1), \dots, f(θ_{t})$, and
@@ -48,7 +48,6 @@ $$
θ_{t+1} = θ_t + \frac{1}{t+1}\bigl[ (p(θ_t) - θ_t ) + (w_{t+1} - p(θ_t)) \bigr]
$$
Define $ξ_{t+1} = w_{t+1} - p(θ_t)$. Note that $\{ξ_t\}_{t \ge 1}$ is a martingale difference sequence, i.e., $\EXP[ ξ_{t+1} \mid θ_0, w_{1:t} ] = 0$. Thus the above equation is of the form \\eqref{eq:SA} with $f(θ_t) = p(θ_t) - θ_t$.

:::
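
As a rough illustration, the recursion in the example can be simulated directly. The sketch below rests on assumptions that are not part of the text above: $w_{t+1}$ is taken to be a $\{0, 1\}$-valued draw with success probability $p(θ_t)$ (which is consistent with $\EXP[ ξ_{t+1} \mid θ_0, w_{1:t} ] = 0$), the function `p` is a hypothetical choice, and the step counter starts at a hypothetical `t0 > 0` so that the very first update does not overwrite `theta0`.

```python
import random


def p(theta: float) -> float:
    """Hypothetical draw probability (an assumption, not from the text)."""
    return theta**2 / (theta**2 + (1.0 - theta) ** 2)


def simulate_urn(theta0: float, t0: int, num_steps: int,
                 rng: random.Random) -> list[float]:
    """Iterate theta_{t+1} = theta_t + [f(theta_t) + xi_{t+1}] / (t + 1).

    Here f(theta) = p(theta) - theta is the drift and
    xi_{t+1} = w_{t+1} - p(theta_t) is the noise term.
    """
    theta = theta0
    path = [theta]
    for t in range(t0, t0 + num_steps):
        # Assumed noise model: w_{t+1} is Bernoulli with mean p(theta_t).
        w = 1.0 if rng.random() < p(theta) else 0.0
        drift = p(theta) - theta   # f(theta_t)
        noise = w - p(theta)       # xi_{t+1}, zero conditional mean
        theta = theta + (drift + noise) / (t + 1)
        path.append(theta)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    path = simulate_urn(theta0=0.5, t0=10, num_steps=5000, rng=rng)
    print(f"final iterate: {path[-1]:.4f}")
```

Splitting the update into `drift` and `noise` mirrors the decomposition in the example; the two terms sum to $w_{t+1} - θ_t$, so each update is a convex combination of $θ_t$ and $w_{t+1}$ and the iterates stay in $[0, 1]$.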

The key idea behind stochastic approximation is that under appropriate
@@ -57,7 +56,7 @@ equilibrium point of the ODE
$$ \begin{equation} \label{eq:ODE}
\dot θ(t) = f(θ(t))
\end{equation} $$
with initial conditions $θ(0) = θ_0$. For instance, for @exm-urn-model, this means that under appropriate conditions, the discrete-time iterates $\{θ_t\}_{t \ge 0}$ converge to the solution of the ODE \\eqref{eq:ODE}.
with initial conditions $θ(0) = θ_0$. For instance, for @exm-urn-model, this means that under appropriate conditions, the discrete-time iterates $\{θ_t\}_{t \ge 0}$ converge to the solution of the ODE \\eqref{eq:ODE}. In particular, they converge to the equilibrium set $H = \{ θ : p(θ) = θ \}$. Suppose that there exists a $θ_\circ$ such that $p(θ) > θ$ for $θ \in (θ_\circ, 1)$ and $p(θ) < θ$ for $θ \in (0, θ_\circ)$. Then, the set of equilibrium points is $H = \{0, θ_\circ, 1\}$. Of these, $0$ and $1$ are stable and $θ_\circ$ is unstable. Stochastic approximation theory shows that the iterations \\eqref{eq:SA} converge to either $0$ or $1$. Thus, along each sample path, the iterates $\{θ_t\}_{t \ge 1}$ are 'locked into' one color, which will dominate.
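
The stability of the equilibria of the ODE can be checked numerically. The following Python sketch integrates $\dot θ(t) = p(θ(t)) - θ(t)$ with a forward Euler scheme. The function `p` is a hypothetical choice with $θ_\circ = 1/2$, and the step size and number of steps are arbitrary; none of these come from the text.

```python
def p(theta: float) -> float:
    """Hypothetical p with p(theta) > theta on (1/2, 1) and p(theta) < theta on (0, 1/2)."""
    return theta**2 / (theta**2 + (1.0 - theta) ** 2)


def f(theta: float) -> float:
    """Right-hand side of the ODE for the urn model: f(theta) = p(theta) - theta."""
    return p(theta) - theta


def integrate_ode(theta0: float, step: float = 0.01, num_steps: int = 5000) -> float:
    """Forward Euler approximation of theta(step * num_steps), starting from theta0."""
    theta = theta0
    for _ in range(num_steps):
        theta += step * f(theta)
    return theta


if __name__ == "__main__":
    for theta0 in (0.4, 0.5, 0.6):
        print(theta0, integrate_ode(theta0))
```

Under these assumptions, a trajectory started just above $1/2$ moves towards $1$, one started just below $1/2$ moves towards $0$, and one started exactly at $1/2$ stays there. This matches the description of $0$ and $1$ as stable and $θ_\circ$ as unstable.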

In this section, we summarize these conditions (without proofs).

@@ -177,7 +176,7 @@ From \\eqref{eq:SA}, we get
where $(a)$ uses (N1)
::: -->

## Borkar-Meyn's result
## Borkar-Meyn's result {#sec-borkar-meyn}

The following is a restatement of the result of @Borkar2000.

Expand Down Expand Up @@ -327,7 +326,7 @@ Assumption (F1) and (F3) implies that
$$
\NORM{f(θ_t)}_2^2 = \NORM{f(θ_t) - f(θ^*)}_2^2 \le L^2 \NORM{θ_t - θ^*}_2^2.
$$
Subsituting in the above bound, we get:
Substituting in the above bound, we get:
\begin{equation}\label{eq:vidyasagar-1-pf-step-1}
\EXP[V(θ_{t+1}) \mid \ALPHABET F_t] \le V(θ_t)
+ α_t \dot V(θ_t)
@@ -391,13 +390,13 @@ $$
\sum_{t \ge T} α_t \phi(\NORM{θ_t - θ^*}_2) \ge
\sum_{t \ge T} α_t δ = ∞,
$$
due to (R2). But this contraducts \\eqref{eq:vidyasagar-1-pf-step-2}. Hence, there is no $ω \in Ω_1$ such that $ζ(ω) > 0$. Therefore, $ζ = 0$ almost surely, i.e., $V(θ_t) \to 0$ almost surely. Finally, it follows from \\eqref{eq:vidyasagar-cond-1} that $θ_t \to θ^*$ almost surely as $t \to ∞$.
due to (R2). But this contradicts \\eqref{eq:vidyasagar-1-pf-step-2}. Hence, there is no $ω \in Ω_1$ such that $ζ(ω) > 0$. Therefore, $ζ = 0$ almost surely, i.e., $V(θ_t) \to 0$ almost surely. Finally, it follows from \\eqref{eq:vidyasagar-cond-1} that $θ_t \to θ^*$ almost surely as $t \to ∞$.
:::


@thm-vidyasagar-1 requires the existence of a suitable Lyapunov function that satisfies various conditions. Verifying whether or not such a function exists can be a bottleneck.

It can be shown (see Theorem 4 of @Vidyasagar2023) that the conditions on $V$ in @thm-vidyasagar-1 ensure that the equilibrium $θ^*$ of the ODE \\eqref{eq:ODE} is globally asymptotically stable. By strenghtening this assumption to global _exponential_ stability of $θ^*$ and adding a few other conditions, it is possible to establish a "converse" Lyapunov theorem that establishes the existence of such a $V$. This is done below.
It can be shown (see Theorem 4 of @Vidyasagar2023) that the conditions on $V$ in @thm-vidyasagar-1 ensure that the equilibrium $θ^*$ of the ODE \\eqref{eq:ODE} is globally asymptotically stable. By strengthening this assumption to global _exponential_ stability of $θ^*$ and adding a few other conditions, it is possible to establish a "converse" Lyapunov theorem that establishes the existence of such a $V$. This is done below.

:::{#thm-vidyasagar-2}
Suppose assumptions (F1'), (F2'), (F3) and (F4) hold. Then, there exists a twice differentiable function $V \colon \reals^d \to \reals_{\ge 0}$ such that $V$ and its derivative $\dot V \colon \reals^d \to \reals_{\ge 0}$ defined as $\dot V(θ) \coloneqq \langle \GRAD V(θ), f(θ) \rangle$ together satisfy the following conditions: there exist positive constants $a$, $b$, $c$, and a finite constant $M$ such that for all $θ \in \reals^d$:
@@ -418,10 +417,8 @@ Suppose assumptions (F1'), (F2'), (F3), and (F4) as well as assumptions (N1) and

## Notes {-}

The stochastic approximation algorithm was introduced by @Robbins1951.
The stochastic approximation algorithm was introduced by @Robbins1951. See @Lai2003 for a historical overview.

@exm-urn-model is from @Borkar2008.
@exm-urn-model is borrowed from @Borkar2008, who points out that it was proposed by @Arthur1994 to model the phenomenon of increasing returns in economics.

The material in this section is adapted from @Vidyasagar2023.

