Typos
adityam committed Jul 21, 2024
1 parent d5458f6 commit 1b34a35
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions approx-mdps/model-approximation.qmd
@@ -694,7 +694,7 @@ Similar to the above, we can also bound the difference between the optimal value
The proof argument is similar to the proof of @prp-value-error.
The first bound is obtained as follows:
\begin{align}
- \| V^{*} - \hat V^{*} \circ \|_
+ \| V^{*} - \hat V^{*} \circ φ \|_
&=
\| \BELLMAN^* V^* - (\hat {\BELLMAN}^* \hat V^*) \circ φ \|_
\notag \\
@@ -758,7 +758,7 @@ Recall that we can split the model error using triangle inequality as in \eqref{

The policy $\hat π^*$ is an $α$-optimal policy of $\ALPHABET M$ where
$$
- α := \| V^* - V^{\hat π^* \circ} \|_∞ \le
+ α := \| V^* - V^{\hat π^* \circ φ} \|_∞ \le
\frac{1}{1-γ} \bigl[ \MISMATCH^*_{φ} \hat V^* + \MISMATCH^{\hat
π^*}_{φ} \hat V^* \bigr].
$$