rl/stochastic-approximation.qmd
+48-6
@@ -288,7 +288,7 @@ $$
288
288
\forall θ \in \reals^d.
289
289
$$
290
290
291
-
b. Origin is asymptotically stable equilibrium of the ODE
291
+
b. The origin is a globally asymptotically stable equilibrium of the ODE
292
292
$$
293
293
\dot θ(t) = f_{∞}(θ(t)).
294
294
$$
@@ -337,6 +337,8 @@ Consider a continuous function $f \colon \reals_{\ge 0} \to \reals_{\ge 0}$.
337
337
$$
338
338
\inf_{ε \le r \le M} f(r) > 0.
339
339
$$
340
+
341
+
**Note** The notation of function class $\ALPHABET B$ clashes with that of the Bellman operator. I hope that the distinction will be clear from context.
340
342
:::
341
343
342
344
:::{#exm-class-K-vs-B}
@@ -378,6 +380,30 @@ Then,
378
380
Then, $θ_t \to θ^*$ almost surely as $t \to ∞$.
379
381
:::
380
382
383
+
:::{.callout-important}
384
+
#### Relationship to Lyapunov stability
385
+
386
+
Consider the ODE
387
+
$$
388
+
\dot \theta = f(\theta),
389
+
\quad \theta \in \reals^d.
390
+
$$
391
+
Consider a function $V \colon \reals^d \to \reals_{\ge 0}$ that is continuous and differentiable and let $\GRAD V$ denote the gradient of $V$. Then, the time-derivative of $V$ along the trajectories of the ODE is given by
$$
\frac{d}{dt} V(θ(t)) = \langle \GRAD V(θ(t)), \dot θ(t) \rangle = \langle \GRAD V(θ(t)), f(θ(t)) \rangle = \dot V(θ(t)),
$$
where the first equality follows from the chain rule. Thus, the conditions of @thm-vidyasagar-1 assert that there exists a Lyapunov function for the ODE (even though we do not use any ODE analysis!).
396
+
397
+
Note that the typical conditions of Lyapunov stability assert that if there exists a Lyapunov function $V \colon \reals^d \to \reals$ and functions $η_1, η_2 \in \ALPHABET K \ALPHABET R$, $\textcolor{red}{\phi \in \ALPHABET K}$ such that
$$
η_1(\| θ - θ^* \|) \le V(θ) \le η_2(\| θ - θ^* \|)
\quad\text{and}\quad
\dot V(θ) \le - \phi(\| θ - θ^* \|),
\quad \forall θ \in \reals^d,
$$
then $θ^*$ is a globally asymptotically stable equilibrium of the ODE $\dot θ = f(θ)$. It is shown in [@Vidyasagar2023, Theorem 4] that this condition can be weakened to $\phi \in \ALPHABET B$. Thus, the conditions of @thm-vidyasagar-1 imply (F3).
405
+
:::
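
The chain-rule identity for the time-derivative of $V$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the vector field $f(θ) = -θ$ on $\reals^2$, the candidate $V(θ) = \|θ\|^2$, and the step size `dt` are all assumptions chosen only for illustration. It compares $\langle \GRAD V(θ), f(θ) \rangle$ with a finite-difference estimate of $dV/dt$ along one Euler step of the ODE.

```python
# Illustrative sketch only: f, V, the test point, and dt are hypothetical choices.

def f(theta):
    """Assumed vector field f(theta) = -theta (origin is the equilibrium)."""
    return [-x for x in theta]

def V(theta):
    """Candidate Lyapunov function V(theta) = ||theta||^2."""
    return sum(x * x for x in theta)

def grad_V(theta):
    """Gradient of V, i.e. 2 * theta."""
    return [2.0 * x for x in theta]

def v_dot(theta):
    """Inner product <grad V(theta), f(theta)> from the chain rule."""
    return sum(g * fx for g, fx in zip(grad_V(theta), f(theta)))

theta = [1.0, -2.0]
dt = 1e-6

# One explicit Euler step along the ODE, then a finite-difference estimate of dV/dt.
theta_next = [x + dt * fx for x, fx in zip(theta, f(theta))]
finite_diff = (V(theta_next) - V(theta)) / dt
chain_rule = v_dot(theta)

print(chain_rule, finite_diff)
```

For this choice, $\dot V(θ) = -2\|θ\|^2$, so the two printed numbers should agree up to the discretization error of the Euler step.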
406
+
381
407
:::{.callout-tip}
382
408
#### Discussion of the conditions
383
409
@@ -387,7 +413,24 @@ It is worthwhile to compare the conditions of @thm-borkar-meyn and @thm-vidyasag
387
413
388
414
2. The assumptions on $\dot V$ in part 1 of @thm-vidyasagar-1 imply only that $θ^*$ is a _locally stable_ equilibrium of the ODE \\eqref{eq:ODE}. This is in contrast to the assumptions of @thm-borkar-meyn, which imply that $θ^*$ is _globally asymptotically stable_.
389
415
390
-
3. The assumptions in part 2 of @thm-vidyasagar-2 ensure that $θ^*$ is globally asymptotically stable equilibrium of the ODE \\eqref{eq:ODE}. Therefore, assumption (F1') is implicit in the second part of @thm-vidyasagar-1.
416
+
3. As an illustration, consider $f \colon \reals \to \reals$ given by
417
+
$$
418
+
f(θ) = \begin{cases}
419
+
-1 + \sin(θ + π/2), & θ \ge 0 \\
420
+
-f(-θ), & θ < 0.
421
+
\end{cases}
422
+
$$
423
+
The roots of $f(θ) = 0$ are all $θ \in \{ 2 πn : n \in \integers \}$. Suppose $θ^* = 0$ is the solution of interest. Since $f(θ) = 0$ has multiple solutions, $θ^* = 0$ cannot be globally asymptotically stable. So (F3) does not hold. More importantly, the limit function $f_{∞} ≡ 0$ because
424
+
$$
425
+
f_{∞}(θ) = \lim_{r \to ∞} \frac{f(r θ)}{r} = 0.
426
+
$$
427
+
So, the origin cannot be a globally asymptotically stable equilibrium of the ODE $\dot θ = f_{\infty}(θ)$, and therefore the results of @thm-borkar-meyn are not applicable. Nonetheless, it is easy to see that the first result of @thm-vidyasagar-1 is applicable.
428
+
429
+
In particular, consider the Lyapunov function $V(θ) = θ^2$. Then, $\dot V(θ) = 2 θ \cdot f(θ) \le 0$ (can verify by plotting). Therefore, _all_ assumptions of part 1 of @thm-vidyasagar-1 are satisfied. Consequently, whenever (R1) is satisfied, $\{θ_t\}_{t \ge 1}$ is almost surely bounded.
430
+
431
+
However, note that we cannot verify \eqref{eq:vidyasagar-cond-3}. Therefore, we cannot argue that $\theta_t \to \theta^*$ almost surely. This is not surprising. Since $f(θ) = 0$ has multiple solutions, the iterates may converge to any one of them, not to a specific one.
432
+
433
+
4. The assumptions in part 2 of @thm-vidyasagar-2 ensure that $θ^*$ is a globally asymptotically stable equilibrium of the ODE \\eqref{eq:ODE}. Therefore, assumption (F1') is implicit in the second part of @thm-vidyasagar-1.
391
434
392
435
:::
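
The claims in the scalar illustration above can be checked numerically rather than by plotting. The following is a rough sketch under stated assumptions: it takes $f(θ) = -1 + \sin(θ + π/2)$ for $θ \ge 0$ and uses the odd extension $f(θ) = -f(-θ)$ for $θ < 0$ (an assumption of this sketch; it is the extension for which $θ f(θ) \le 0$ holds on both sides of the origin). The grid, the tolerances, and the helper names are hypothetical.

```python
import math

def f(theta):
    """-1 + sin(theta + pi/2) = cos(theta) - 1 for theta >= 0, extended as an
    odd function for theta < 0 (assumption of this sketch)."""
    g = math.cos(abs(theta)) - 1.0
    return g if theta >= 0 else -g

def v_dot(theta):
    """Derivative of V(theta) = theta^2 along the ODE: V'(theta) * f(theta)."""
    return 2.0 * theta * f(theta)

# 1. The points 2*pi*n are roots of f, so theta* = 0 is not the only equilibrium.
roots_ok = all(abs(f(2.0 * math.pi * n)) < 1e-12 for n in range(-3, 4))

# 2. V decreases (weakly) along trajectories: v_dot <= 0 on a grid over [-20, 20].
grid = [k * 0.01 for k in range(-2000, 2001)]
max_v_dot = max(v_dot(t) for t in grid)

# 3. The scaled function f(r * theta) / r shrinks as r grows because |f| <= 2,
#    which is consistent with the limit function being identically zero.
scaled = [abs(f(r * 1.0)) / r for r in (1e2, 1e4, 1e6)]

print(roots_ok, max_v_dot, scaled)
```

A grid check of this kind is only evidence, not a proof; here the sign of $\dot V$ also follows directly from $\cos θ \le 1$.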
393
436
@@ -492,13 +535,12 @@ $$
492
535
due to (R2). But this contradicts \\eqref{eq:vidyasagar-1-pf-step-2}. Hence, there is no $ω \in Ω_1$ such that $ζ(ω) > 0$. Therefore, $ζ = 0$ almost surely, i.e., $V(θ_t) \to 0$ almost surely. Finally, it follows from \\eqref{eq:vidyasagar-cond-1} that $θ_t \to θ^*$ almost surely as $t \to ∞$.
493
536
:::
494
537
495
-
496
538
@thm-vidyasagar-1 requires the existence of a suitable Lyapunov function that satisfies various conditions. Verifying whether or not such a function exists can be a bottleneck.
497
539
498
-
If can be shown (see Theorem 4 of @Vidyasagar2023) that the conditions on $V$ in @thm-vidyasagar-1ensure that the equilibrium $θ^*$ of the ODE \\eqref{eq:ODE} is globally asymptotically stable. By strengthening this assumption to global _exponential_ stability of $θ^*$ and adding a few other conditions, it is possible to establish a "converse" Lyapunov theorem that establishes the existence of such a $V$. This is done below.
540
+
As argued above, the conditions of @thm-vidyasagar-1 imply (F3). If, instead of (F3), we assume the stronger condition (F3'), then it is possible to establish the following "converse" Lyapunov theorem, which guarantees the existence of such a Lyapunov function $V$.
499
541
500
542
:::{#thm-vidyasagar-2}
501
-
Suppose assumptions (F1'), (F2'), (F3) and (F4) hold. Then, there exists a twice differentiable function $V \colon \reals^d \to \reals_{\ge 0}$ such that $V$ and its derivative $\dot V \colon \reals^d \to \reals_{\ge 0}$ defined as $\dot V(θ) \coloneqq \langle \langle \GRAD V(θ), f(θ) \rangle$ together satisfy the following conditions: there exist positive constants $a$, $b$, $c$, and a finite constant $M$ such that for all $θ \in \reals^d$:
543
+
Suppose assumptions (F1'), (F2'), (F3') and (F4) hold. Then, there exists a twice differentiable function $V \colon \reals^d \to \reals_{\ge 0}$ such that $V$ and its derivative $\dot V \colon \reals^d \to \reals$ defined as $\dot V(θ) \coloneqq \langle \GRAD V(θ), f(θ) \rangle$ together satisfy the following conditions: there exist positive constants $a$, $b$, $c$, and a finite constant $M$ such that for all $θ \in \reals^d$: