Commit bdbd506

Merge pull request #21 from amitfishy/quarto
risk sensitive mdps corrections
2 parents 7d4df46 + 37873d1 commit bdbd506

1 file changed: +8 -8 lines changed

risk-sensitive/risk-sensitive-mdps.qmd

@@ -113,16 +113,16 @@ $$ λ(x) = \begin{cases}
 Let $w \colon \ALPHABET X \to \reals$ be a bounded function and $ν \in
 Δ(\ALPHABET X)$. Then,
 $$
-\log \sum_{x \in \ALPHABET X} ν(x) \exp( w(x)) =
-\sup_{μ \in Δ(\ALPHABET X)} \Bigl\{
-\sum_{x \in \ALPHABET X} μ(x) w(x) -
-I(μ \| ν)
+\log \sum_{x \in \ALPHABET X} ν(x) \exp( θw(x)) =
+\inf_{μ \in Δ(\ALPHABET X)} \Bigl\{
+\sum_{x \in \ALPHABET X} μ(x) w(x) +
+\frac{1}{θ} I(μ \| ν)
 \Bigr\},
 $$
-where the supremum is attained at the unique probability measure $μ^*$
+where the infimum is attained at the unique probability measure $μ^*$
 given by
 $$
-μ^*(x) = \frac{e^{θv(x)}}{\int e^{θv(x)}ν(x) dx} ν(x).
+μ^*(x) = \frac{e^{θw(x)}}{\sum_{x \in \ALPHABET X} e^{θw(x)}ν(x)} ν(x).
 $$
 :::

@@ -138,9 +138,9 @@ See, for example, @Follmer2010.
 Using @lem-legendre-mutual-info, the dynamic program of \\eqref{eq:avg} can be
 written as
 $$ \begin{equation}
-J + v(x) = \min_{u \in \ALPHABET U} \sup_{μ \in Δ(\ALPHABET X)}
+J + v(x) = \min_{u \in \ALPHABET U} \inf_{μ \in Δ(\ALPHABET X)}
 \Bigl\{
-c(x,u) + \sum_{y \in \ALPHABET X} μ(y) v(y) - \frac{1}{θ}
+c(x,u) + \sum_{y \in \ALPHABET X} μ(y) v(y) + \frac{1}{θ}
 I(μ \| P(\cdot | x, u) )
 \Bigr\}.
 \end{equation} $$
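
The variational formula touched by this diff can be sanity-checked numerically on a small finite alphabet. The sketch below is not part of the commit. It assumes the textbook (Gibbs / Donsker-Varadhan) form of the identity with θ > 0, namely (1/θ) log Σ ν(x) exp(θ w(x)) = sup over μ of { Σ μ(x) w(x) − (1/θ) I(μ ‖ ν) }. The sign and scaling conventions in the changed lines may differ from this form, so it should be read as a check on the tilted measure μ^* only, not as a verification of the committed equations. The values of `nu`, `w`, and `theta`, and all function names, are hypothetical and chosen only for illustration.

```python
import math
import random


def kl(mu, nu):
    """Relative entropy I(mu || nu) on a finite set, with 0 log 0 = 0."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0)


def tilted(nu, w, theta):
    """Tilted measure mu*(x) proportional to exp(theta * w(x)) * nu(x)."""
    weights = [n * math.exp(theta * wx) for n, wx in zip(nu, w)]
    z = sum(weights)
    return [wt / z for wt in weights]


def objective(mu, nu, w, theta):
    """sum_x mu(x) w(x) - (1/theta) I(mu || nu)  (assumed textbook form)."""
    return sum(m * wx for m, wx in zip(mu, w)) - kl(mu, nu) / theta


def scaled_log_mgf(nu, w, theta):
    """(1/theta) log sum_x nu(x) exp(theta * w(x))."""
    total = sum(n * math.exp(theta * wx) for n, wx in zip(nu, w))
    return math.log(total) / theta


def random_simplex(rng, n):
    """Random point in the interior of the probability simplex."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    s = sum(raw)
    return [r / s for r in raw]


# Hypothetical illustration data (not from the commit).
rng = random.Random(0)
nu = [0.2, 0.5, 0.3]
w = [1.0, -0.5, 2.0]
theta = 0.7  # assumed positive

lhs = scaled_log_mgf(nu, w, theta)
mu_star = tilted(nu, w, theta)
rhs_at_star = objective(mu_star, nu, w, theta)

# No randomly sampled mu should beat the tilted measure.
best_random = max(
    objective(random_simplex(rng, len(nu)), nu, w, theta) for _ in range(2000)
)

print("lhs:", lhs)
print("objective at mu*:", rhs_at_star)
print("best random objective:", best_random)
```

Under these assumptions, the objective evaluated at the tilted measure matches the left-hand side up to floating-point error, and random candidates for μ stay below it.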
