
Commit 71e2303

torfjeld and mhauru authored Jun 11, 2024

Alternative tutorial on implementing samplers (#468)

* replaced the external sampler docs with a more straight-forward tutorial on implementing samplers
* fixed reference to the impl samplers tutorial
* moved docs on implementing samplers
* revert changes to docs-16-using-turing-external-samplers
* make the `getparams` overload explicit
* renamed NormalLogDensity to IsotropicNormalModel as per @mhauru suggestion
* replace underline with emph
* Apply suggestions from code review
* fix formatting of summary
* removed type-piracy and added note on why we're using this example of MALA

Co-authored-by: Markus Hauru <mhauru@turing.ac.uk>
1 parent 160e254 commit 71e2303

File tree

4 files changed: +2719 -0 lines changed


‎_quarto.yml

+2
@@ -106,6 +106,8 @@ website:
   contents:
     - tutorials/docs-04-for-developers-abstractmcmc-turing/index.qmd
     - tutorials/docs-07-for-developers-variational-inference/index.qmd
+    - text: "Implementing Samplers"
+      href: tutorials/docs-17-implementing-samplers/index.qmd

 page-footer:

‎tutorials/docs-17-implementing-samplers/Manifest.toml

+2,213
Large diffs are not rendered by default.
‎tutorials/docs-17-implementing-samplers/Project.toml

+12
@@ -0,0 +1,12 @@
[deps]
ADTypes = "47edcb42-4c32-4615-8424-f2b9edc5f35b"
AbstractMCMC = "80f14c24-f653-4e6a-9b94-39d6b0f70001"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
LogDensityProblemsAD = "996a588d-648d-4e1f-a8f0-a84b347e47b1"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
ReverseDiff = "37e2e3b7-166d-5795-8a7a-e32c996b4267"
StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd"
Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"
‎tutorials/docs-17-implementing-samplers/index.qmd

+492
@@ -0,0 +1,492 @@
---
title: Implementing samplers
engine: julia
julia:
    exeflags: ["--project=@.", "-t 4"]
---

```{julia}
#| echo: false
#| output: false
using Pkg;
Pkg.instantiate();
```

In this tutorial, we'll go step-by-step through how to implement a "simple" sampler in [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl) in such a way that it can easily be applied to Turing.jl models.

In particular, we're going to implement a version of the **Metropolis-adjusted Langevin algorithm (MALA)**.

Note that we will implement this sampler in the [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl) framework, completely "ignoring" Turing.jl until the very end of the tutorial, at which point we'll use a single line of code to make the resulting sampler available to Turing.jl. This is to really drive home the point that one can implement samplers in a way that is accessible to all of Turing.jl's users without having to use Turing.jl yourself.

## Quick overview of MALA

We can view MALA as a single step of the leapfrog integrator with resampling of the momentum $p$ at every step.[^2] To make that statement a bit more concrete, we first define the *extended* target $\bar{\gamma}(x, p)$ as

\begin{equation*}
\log \bar{\gamma}(x, p) \propto \log \gamma(x) + \log \gamma_{\mathcal{N}(0, M)}(p)
\end{equation*}

where $\gamma_{\mathcal{N}(0, M)}$ denotes the density for a zero-centered Gaussian with covariance matrix $M$.
We then consider targeting this joint distribution over both $x$ and $p$ as follows.
First we define the map

\begin{equation*}
\begin{split}
L_{\epsilon}: \quad & \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d \times \mathbb{R}^d \\
& (x, p) \mapsto (\tilde{x}, \tilde{p}) := L_{\epsilon}(x, p)
\end{split}
\end{equation*}

as

\begin{equation*}
\begin{split}
p_{1 / 2} &:= p + \frac{\epsilon}{2} \nabla \log \gamma(x) \\
\tilde{x} &:= x + \epsilon M^{-1} p_{1 / 2} \\
p_1 &:= p_{1 / 2} + \frac{\epsilon}{2} \nabla \log \gamma(\tilde{x}) \\
\tilde{p} &:= - p_1
\end{split}
\end{equation*}

This might be familiar to some readers as a single step of the leapfrog integrator.
We then define the MALA kernel as follows: given the current iterate $x_i$, we sample the next iterate $x_{i + 1}$ as

\begin{equation*}
\begin{split}
p &\sim \mathcal{N}(0, M) \\
(\tilde{x}, \tilde{p}) &:= L_{\epsilon}(x_i, p) \\
\alpha &:= \min \left\{ 1, \frac{\bar{\gamma}(\tilde{x}, \tilde{p})}{\bar{\gamma}(x_i, p)} \right\} \\
x_{i + 1} &:=
\begin{cases}
\tilde{x} \quad & \text{ with prob. } \alpha \\
x_i \quad & \text{ with prob. } 1 - \alpha
\end{cases}
\end{split}
\end{equation*}

i.e. we accept the proposal $\tilde{x}$ with probability $\alpha$ and reject it, thus sticking with our current iterate, with probability $1 - \alpha$.
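
To make the algorithm above concrete before introducing any package machinery, here is a minimal, self-contained sketch of a single MALA transition with $M = I$. This block is purely illustrative and is not part of the tutorial's implementation below; `logdensity_fn` and `gradient_fn` are placeholder names.

```julia
# A single MALA transition with M = I, written directly from the equations above.
# `logdensity_fn` and `gradient_fn` are placeholder names used only in this sketch.
function mala_transition(logdensity_fn, gradient_fn, x, ϵ)
    # Resample the momentum p ~ N(0, I).
    p = randn(length(x))
    # One leapfrog step (the map L_ϵ above).
    p_half = p + (ϵ / 2) .* gradient_fn(x)
    x_new = x + ϵ .* p_half
    p_new = -(p_half + (ϵ / 2) .* gradient_fn(x_new))
    # Extended target: log γ̄(x, p) = log γ(x) - pᵀp / 2, up to an additive constant.
    log_gamma_bar(x, p) = logdensity_fn(x) - sum(abs2, p) / 2
    # Metropolis acceptance step.
    logα = log_gamma_bar(x_new, p_new) - log_gamma_bar(x, p)
    return log(rand()) < logα ? x_new : x
end

# Example usage on a standard Gaussian target.
mala_transition(x -> -sum(abs2, x) / 2, x -> -x, zeros(2), 0.5)
```
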

## What we need from a model: [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl)

There are a few things we need from the "target" / "model" / density that we want to sample from:

1. We need access to log-density *evaluations* $\log \gamma(x)$ so we can compute the acceptance ratio involving $\log \bar{\gamma}(x, p)$.
2. We need access to log-density *gradients* $\nabla \log \gamma(x)$ so we can compute the leapfrog steps $L_{\epsilon}(x, p)$.
3. We also need access to the "size" of the model so we can determine the size of $M$.

Luckily for us, there is a package called [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) which provides an interface for *exactly* this!

To demonstrate how one can implement the "[LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface"[^1] we will use a simple Gaussian model as an example:

```{julia}
using LogDensityProblems: LogDensityProblems;

# Let's define some type that represents the model.
struct IsotropicNormalModel{M<:AbstractVector{<:Real}}
    "mean of the isotropic Gaussian"
    mean::M
end

# Specifies what input length the model expects.
LogDensityProblems.dimension(model::IsotropicNormalModel) = length(model.mean)
# Implementation of the log-density evaluation of the model.
function LogDensityProblems.logdensity(model::IsotropicNormalModel, x::AbstractVector{<:Real})
    return -sum(abs2, x .- model.mean) / 2
end
```

This gives us all of the properties we want for our MALA sampler, with the exception of the computation of the *gradient* $\nabla \log \gamma(x)$. For that there is the method `LogDensityProblems.logdensity_and_gradient`, which should return a 2-tuple where the first entry is the evaluation of the log-density $\log \gamma(x)$ and the second entry is the gradient $\nabla \log \gamma(x)$.

There are two ways to "implement" this method: (a) we implement it by hand, which is feasible in the case of our `IsotropicNormalModel`, or (b) we defer the implementation to an automatic differentiation backend.

To implement it by hand we can simply do

```{julia}
# Tell LogDensityProblems.jl that first-order information, i.e. the gradient, is available.
LogDensityProblems.capabilities(model::IsotropicNormalModel) = LogDensityProblems.LogDensityOrder{1}()

# Implement `logdensity_and_gradient`.
function LogDensityProblems.logdensity_and_gradient(model::IsotropicNormalModel, x)
    logγ_x = LogDensityProblems.logdensity(model, x)
    # Gradient of -‖x - mean‖² / 2 with respect to `x`.
    ∇logγ_x = -(x .- model.mean)
    return logγ_x, ∇logγ_x
end
```

Let's just try it out:

```{julia}
# Instantiate the problem.
model = IsotropicNormalModel([-5., 0., 5.])
# Create some example input that we can test on.
x_example = randn(LogDensityProblems.dimension(model))
# Evaluate!
LogDensityProblems.logdensity(model, x_example)
```
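
As a quick sanity check of our own (not part of the original tutorial flow), we can compare the hand-derived gradient against a central finite-difference approximation; the helper `finite_difference_gradient` below is introduced purely for this illustration.

```julia
# Rough finite-difference check of the hand-derived gradient (illustrative only).
function finite_difference_gradient(f, x; h=1e-5)
    return map(eachindex(x)) do i
        e = zeros(length(x)); e[i] = h
        (f(x + e) - f(x - e)) / (2h)
    end
end

_, ∇logγ_x = LogDensityProblems.logdensity_and_gradient(model, x_example)
∇fd = finite_difference_gradient(x -> LogDensityProblems.logdensity(model, x), x_example)
maximum(abs, ∇logγ_x - ∇fd)  # should be close to zero
```
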

To defer it to an automatic differentiation backend, we can do

```{julia}
# Tell LogDensityProblems.jl we only have access to 0-th order information.
LogDensityProblems.capabilities(model::IsotropicNormalModel) = LogDensityProblems.LogDensityOrder{0}()

# Use `LogDensityProblemsAD`'s `ADgradient` in combination with some AD backend to implement `logdensity_and_gradient`.
using LogDensityProblemsAD, ADTypes, ForwardDiff
model_with_grad = ADgradient(AutoForwardDiff(), model)
LogDensityProblems.logdensity(model_with_grad, x_example)
```
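
To see that the wrapped model really does provide gradients through the same interface, we can query it directly; this extra check, and the comparison against the closed-form gradient, are our own additions.

```julia
# The AD-wrapped model provides both the log-density and its gradient
# through the same LogDensityProblems.jl interface.
logγ_x, ∇logγ_x = LogDensityProblems.logdensity_and_gradient(model_with_grad, x_example)
# For this model we can compare against the closed-form gradient -(x - mean).
∇logγ_x ≈ -(x_example .- model.mean)  # should be `true`
```
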

We'll continue with the second approach in this tutorial, since this is typically what one does in practice: there are better hobbies to spend time on than deriving gradients by hand.

At this point, one might wonder how we're going to tie this back to Turing.jl in the end. Effectively, when working with inference methods that only require log-density evaluations and / or higher-order information of the log-density, Turing.jl converts the user-provided `Model` into an object implementing the above methods for [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl). As a result, most samplers provided by Turing.jl are actually implemented to work with [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl), enabling their use both *within* and *outside* of Turing.jl! Moreover, similar conversions exist for Stan through BridgeStan and StanLogDensityProblems.jl, which means that a sampler supporting the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface can easily be used with both Turing.jl *and* Stan models (in addition to user-provided models, such as our `IsotropicNormalModel` above)!
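
We won't touch Turing.jl until the very end of the tutorial, but for the curious, here is a rough sketch of the kind of object Turing.jl produces behind the scenes. The model `gauss_demo` and the exact `LogDensityFunction` call are our own illustration of an assumed DynamicPPL API, which may differ between versions; nothing below is needed for the rest of the tutorial.

```julia
# Sketch only (assumed API; details may differ across DynamicPPL versions):
# roughly the LogDensityProblems.jl-compatible object Turing.jl hands to an external sampler.
using Turing  # DynamicPPL is Turing.jl's modelling backend

@model gauss_demo() = y ~ Normal(0, 1)
ldf = Turing.DynamicPPL.LogDensityFunction(gauss_demo())

LogDensityProblems.dimension(ldf)          # a single (unconstrained) parameter
LogDensityProblems.logdensity(ldf, [0.5])  # log joint density at y = 0.5
```
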

Anyway, let's move on to actually implementing the sampler.

## Implementing MALA in [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl)

Now that we've established that a model implementing the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface provides us with all the information we need from $\log \gamma(x)$, we can address the question: given an object that implements the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface, how can we define a sampler for it?

We're going to do this by making our sampler a sub-type of `AbstractMCMC.AbstractSampler` in addition to implementing a few methods from [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl). Why? Because it gets us *a lot* of functionality for free, as we will see later.

Moreover, [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl) provides a very natural interface for MCMC algorithms.

First, we'll define our `MALA` type

```{julia}
using AbstractMCMC

struct MALA{T,A} <: AbstractMCMC.AbstractSampler
    "stepsize used in the leapfrog step"
    ϵ_init::T
    "covariance matrix used for the momentum"
    M_init::A
end
```

Notice how we've added the suffix `_init` to both the stepsize and the covariance matrix. We've done this because an `AbstractMCMC.AbstractSampler` should be *immutable*. Of course there might be many scenarios where we want to allow something like the stepsize and / or the covariance matrix to vary between iterations; for example, during the burn-in / adaptation phase of the sampling process we might want to adjust the parameters using statistics computed from these initial iterations. But information which can change between iterations *should not go in the sampler itself*! Instead, it should go in the sampler *state*.

The sampler state should at the very least contain all the necessary information to perform the next MCMC iteration, but usually contains further information, e.g. quantities and statistics useful for evaluating whether the sampler has converged.

We will use the following sampler state for our `MALA` sampler:

```{julia}
struct MALAState{A<:AbstractVector{<:Real}}
    "current position"
    x::A
end
```

This might seem overly redundant: we're defining a type `MALAState` and it only contains a simple vector of reals.
In this particular case we indeed could have dropped this and simply used an `AbstractVector{<:Real}` as our sampler state, but typically, as we will see later, one wants to include other quantities in the sampler state.
For example, if we also wanted to adapt the parameters of our `MALA`, e.g. alter the stepsize depending on acceptance rates, we should also put `ϵ` in the state (see the sketch below); but for now we'll keep things simple.
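
To make that concrete, here is what such an extended state *could* look like; `AdaptiveMALAState` is a hypothetical type introduced only for illustration and is not used in the rest of the tutorial.

```julia
# Hypothetical example of a richer sampler state (not used below): mutable quantities
# such as the stepsize and acceptance statistics belong in the state, not the sampler.
struct AdaptiveMALAState{A<:AbstractVector{<:Real},T<:Real}
    "current position"
    x::A
    "current stepsize"
    ϵ::T
    "number of proposals accepted so far"
    n_accepted::Int
    "total number of iterations so far"
    n_total::Int
end
```
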

Moreover, we also want a _sample_ type, which is a type meant for "public consumption", i.e. the end-user. This is generally going to contain a subset of the information present in the state. But in such a simple scenario as this, we similarly only have an `AbstractVector{<:Real}`:

```{julia}
struct MALASample{A<:AbstractVector{<:Real}}
    "current position"
    x::A
end
```

We currently have three things:

1. An `AbstractMCMC.AbstractSampler` implementation called `MALA`.
2. A state `MALAState` for our sampler `MALA`.
3. A sample `MALASample` for our sampler `MALA`.

That means that we're ready to implement the only thing that really matters: `AbstractMCMC.step`.

`AbstractMCMC.step` defines the MCMC iteration of our `MALA` given the current `MALAState`. Specifically, the signature of the function is as follows:

```{julia}
#| eval: false
function AbstractMCMC.step(
    # The RNG to ensure reproducibility.
    rng::Random.AbstractRNG,
    # The model that defines our target.
    model::AbstractMCMC.AbstractModel,
    # The sampler for which we're taking a `step`.
    sampler::AbstractMCMC.AbstractSampler,
    # The current sampler `state`.
    state;
    # Additional keyword arguments that we may or may not need.
    kwargs...
)
```

Moreover, there is a specific `AbstractMCMC.AbstractModel` which is used to indicate that the provided model implements the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface: `AbstractMCMC.LogDensityModel`.

Since, as we discussed earlier, in our case we're indeed going to work with types that support the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface, we'll define `AbstractMCMC.step` for such an `AbstractMCMC.LogDensityModel`.

Note that `AbstractMCMC.LogDensityModel` has no other purpose; it has a single field called `logdensity`, and it does nothing else. But wrapping the model in `AbstractMCMC.LogDensityModel` allows samplers that want to work with [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) to define their `AbstractMCMC.step` on this type without running into method ambiguities.
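
To see just how thin this wrapper is, we can construct one directly (a small check of our own, not part of the original flow):

```julia
# `LogDensityModel` is just a thin wrapper around anything implementing
# the LogDensityProblems.jl interface.
wrapped_model = AbstractMCMC.LogDensityModel(model)
wrapped_model.logdensity === model  # the wrapper simply stores our model
```
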

All in all, that means that the signature for our `AbstractMCMC.step` is going to be the following:

```{julia}
#| eval: false
function AbstractMCMC.step(
    rng::Random.AbstractRNG,
    # `LogDensityModel` so we know we're working with a LogDensityProblems.jl model.
    model::AbstractMCMC.LogDensityModel,
    # Our sampler.
    sampler::MALA,
    # Our sampler state.
    state::MALAState;
    kwargs...
)
```

Great! Now let's actually implement the full `AbstractMCMC.step` for our `MALA`.

Let's remind ourselves what we're going to do:

1. Sample a new momentum $p$.
2. Compute the log-density of the extended target $\log \bar{\gamma}(x, p)$.
3. Take a single leapfrog step $(\tilde{x}, \tilde{p}) = L_{\epsilon}(x, p)$.
4. Accept or reject the proposed $(\tilde{x}, \tilde{p})$.

All in all, this results in the following:

```{julia}
using Random: Random
using Distributions # so we get the `MvNormal`

function AbstractMCMC.step(
    rng::Random.AbstractRNG,
    model_wrapper::AbstractMCMC.LogDensityModel,
    sampler::MALA,
    state::MALAState;
    kwargs...
)
    # Extract the wrapped model which implements LogDensityProblems.jl.
    model = model_wrapper.logdensity
    # Let's just extract the sampler parameters to make our lives easier.
    ϵ = sampler.ϵ_init
    M = sampler.M_init
    # Extract the current parameters.
    x = state.x
    # Sample the momentum.
    p_dist = MvNormal(zeros(LogDensityProblems.dimension(model)), M)
    p = rand(rng, p_dist)
    # Propose using a single leapfrog step.
    x̃, p̃ = leapfrog_step(model, x, p, ϵ, M)
    # Accept or reject proposal.
    logp = LogDensityProblems.logdensity(model, x) + logpdf(p_dist, p)
    logp̃ = LogDensityProblems.logdensity(model, x̃) + logpdf(p_dist, p̃)
    logα = logp̃ - logp
    state_new = if log(rand(rng)) < logα
        # Accept.
        MALAState(x̃)
    else
        # Reject.
        MALAState(x)
    end
    # Return the "sample" and the sampler state.
    return MALASample(state_new.x), state_new
end
```

Fairly straight-forward.

Of course, we haven't defined the `leapfrog_step` method yet, so let's do that:

```{julia}
function leapfrog_step(model, x, p, ϵ, M)
    # Update momentum `p` using "position" `x`.
    ∇logγ_x = last(LogDensityProblems.logdensity_and_gradient(model, x))
    p1 = p + (ϵ / 2) .* ∇logγ_x
    # Update the "position" `x` using momentum `p1`.
    x̃ = x + ϵ .* (M \ p1)
    # Update momentum `p1` using position `x̃`.
    ∇logγ_x̃ = last(LogDensityProblems.logdensity_and_gradient(model, x̃))
    p2 = p1 + (ϵ / 2) .* ∇logγ_x̃
    # Flip momentum `p2`.
    p̃ = -p2
    return x̃, p̃
end
```
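
One reassuring property of this update, which we can check numerically, is that the leapfrog step followed by the momentum flip is an involution: applying `leapfrog_step` twice returns the point we started from, up to floating-point error. This check is our own addition, not part of the original tutorial.

```julia
# Check that the leapfrog step (with momentum flip) is an involution: applying it
# twice should return the original (x, p) up to floating-point error.
using LinearAlgebra: I

x0 = randn(LogDensityProblems.dimension(model))
p0 = randn(LogDensityProblems.dimension(model))
x1, p1 = leapfrog_step(model, x0, p0, 0.1, I)
x2, p2 = leapfrog_step(model, x1, p1, 0.1, I)
(x2 ≈ x0) && (p2 ≈ p0)  # should be `true`
```
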

With all of this, we're technically ready to sample!

```{julia}
using Random, LinearAlgebra

rng = Random.default_rng()
sampler = MALA(1, I)
state = MALAState(zeros(LogDensityProblems.dimension(model)))

x_next, state_next = AbstractMCMC.step(
    rng,
    AbstractMCMC.LogDensityModel(model),
    sampler,
    state
)
```

Great, it works!
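
Before moving on, it's worth seeing what repeated application of `step` looks like by hand; the loop below is an illustrative sketch of exactly what `sample` will automate for us in a moment.

```julia
# Manually chaining `step` calls to build a short trajectory; this is essentially
# what `AbstractMCMC.sample` automates for us below.
let state = state
    for i in 1:5
        s, state = AbstractMCMC.step(
            rng, AbstractMCMC.LogDensityModel(model), sampler, state
        )
        println("iteration $i: x = ", s.x)
    end
end
```
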

As promised, implementing `AbstractMCMC.step` gets us quite a lot of functionality for free: we can now simply call `sample` to perform standard MCMC sampling:

```{julia}
# Perform 10_000 iterations with our `MALA` sampler.
samples = sample(model_with_grad, sampler, 10_000; initial_state=state, progress=false)
# Concatenate into a matrix.
samples_matrix = stack(sample -> sample.x, samples)
```

```{julia}
# Compute the marginal means and standard deviations.
hcat(mean(samples_matrix; dims=2), std(samples_matrix; dims=2))
```

Let's visualize the samples:

```{julia}
using StatsPlots
plot(transpose(samples_matrix[:, 1:10:end]), alpha=0.5, legend=false)
```

Look at that! Things are working; amazin'.

We can also exploit [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl)'s parallel sampling capabilities:

```{julia}
# Run 4 separate chains for 10_000 iterations, using threads to parallelize.
num_chains = 4
samples = sample(
    model_with_grad,
    sampler,
    MCMCThreads(),
    10_000,
    num_chains;
    # Note we need to provide an initial state for every chain.
    initial_state=fill(state, num_chains),
    progress=false
)
samples_array = stack(map(Base.Fix1(stack, sample -> sample.x), samples))
```
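
As a quick convergence sanity check of our own (not in the original tutorial), we can compare the per-chain marginal means; with four independent chains targeting the same Gaussian, they should all be close to the true mean `[-5, 0, 5]`.

```julia
# `samples_array` has shape (dimension, iterations, chains); compare the
# per-chain means, which should all be close to the true mean [-5, 0, 5].
dropdims(mean(samples_array; dims=2); dims=2)
```
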

But having to provide an `initial_state` to the `AbstractMCMC.sample` call (and friends) to get started is a bit annoying. We can avoid this by also defining an `AbstractMCMC.step` *without* the `state` argument:

```{julia}
function AbstractMCMC.step(
    rng::Random.AbstractRNG,
    model_wrapper::AbstractMCMC.LogDensityModel,
    ::MALA;
    # NOTE: No state provided!
    kwargs...
)
    model = model_wrapper.logdensity
    # Let's just create the initial state by sampling using a Gaussian.
    x = randn(rng, LogDensityProblems.dimension(model))

    return MALASample(x), MALAState(x)
end
```

Equipped with this, we no longer need to provide the `initial_state` everywhere:

```{julia}
samples = sample(model_with_grad, sampler, 10_000; progress=false)
samples_matrix = stack(sample -> sample.x, samples)
hcat(mean(samples_matrix; dims=2), std(samples_matrix; dims=2))
```

## Using our sampler with Turing.jl

As we promised, all of this hassle of implementing our `MALA` sampler in a way that uses [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) and [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl) gets us something more than *just* an "automatic" implementation of `AbstractMCMC.sample`.

It also enables use with Turing.jl through `externalsampler`, but we need to do one final thing first: we need to tell Turing.jl how to extract a vector of parameters from the "sample" returned in our implementation of `AbstractMCMC.step`. In our case, the "sample" is a `MALASample`, so we just need the following line:

```{julia}
# Load Turing.jl.
using Turing

# Overload the `getparams` method for our "sample" type, which is just a vector.
Turing.Inference.getparams(::Turing.Model, sample::MALASample) = sample.x
```

And with that, we're good to go!

```{julia}
# Our previous model defined as a Turing.jl model.
@model mvnormal_model() = x ~ MvNormal([-5., 0., 5.], I)
# Instantiate our model.
turing_model = mvnormal_model()
# Call `sample`, but now we're passing in a Turing.jl `model` and wrapping
# our `MALA` sampler in `externalsampler` to tell Turing.jl that the sampler
# expects something that implements the LogDensityProblems.jl interface.
chain = sample(turing_model, externalsampler(sampler), 10_000; progress=false)
```

Pretty neat, eh?

### Models with constrained parameters

One thing we've sort of glossed over in all of the above is that MALA, at least how we've implemented it, requires $x$ to live in $\mathbb{R}^d$ for some $d > 0$. If some of the parameters were in fact constrained, e.g. we were working with a `Beta` distribution which has support on the interval $(0, 1)$, *not* on $\mathbb{R}^d$, we could easily end up outside of the valid range $(0, 1)$.

```{julia}
@model beta_model() = x ~ Beta(3, 3)
turing_model = beta_model()
chain = sample(turing_model, externalsampler(sampler), 10_000; progress=false)
```

Yep, that still works, but only because Turing.jl actually *transforms* the `turing_model` from constrained to unconstrained, so that the `sampler` provided to `externalsampler` is actually always working in unconstrained space! This is not always desirable, so we can turn this off:

```{julia}
chain = sample(turing_model, externalsampler(sampler; unconstrained=false), 10_000; progress=false)
```

The fun thing is that this still sort of works, because the log-density is `-Inf` outside of the support

```{julia}
logpdf(Beta(3, 3), 10.0)
```

and so the samples that fall outside of the range are always rejected. But do notice how much worse all the diagnostics are, e.g. `ess_tail` is very poor compared to when we use `unconstrained=true`. Moreover, in more complex cases this won't just result in a "nice" `-Inf` log-density value, but instead will error:

```{julia}
@model function demo()
    σ² ~ truncated(Normal(), lower=0)
    # If we end up with negative values for `σ²`, the `Normal` will error.
    x ~ Normal(0, σ²)
end
sample(demo(), externalsampler(sampler; unconstrained=false), 10_000; progress=false)
```

As expected, we run into a `DomainError` at some point, while if we set `unconstrained=true`, letting Turing.jl transform the model to an unconstrained form behind the scenes, everything works as expected:

```{julia}
sample(demo(), externalsampler(sampler; unconstrained=true), 10_000; progress=false)
```

Neat!
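
For intuition about what Turing.jl is doing behind the scenes here (it uses Bijectors.jl for these transformations): a constrained parameter is mapped to an unconstrained space, the sampler works there, and the density is corrected with the log-absolute-determinant of the Jacobian. The hand-rolled sketch below illustrates the idea for a single parameter on $(0, 1)$ via the log-odds transform; it is ours, purely illustrative, and not part of the tutorial's sampler.

```julia
# Illustrative only: a hand-rolled constrained -> unconstrained transform for (0, 1),
# i.e. the log-odds / logit map, as used for e.g. Beta-distributed parameters.
to_unconstrained(x) = log(x) - log(1 - x)  # (0, 1) -> ℝ
to_constrained(y) = 1 / (1 + exp(-y))      # ℝ -> (0, 1)

# Log-density of y = logit(x) when x ~ Beta(3, 3), including the Jacobian correction
# log |dx/dy| = log(x) + log(1 - x).
function logdensity_unconstrained(y)
    x = to_constrained(y)
    return logpdf(Beta(3, 3), x) + log(x) + log(1 - x)
end

logdensity_unconstrained(0.0)  # finite for every real y; no support issues remain
```
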

Similarly, the automatic differentiation backend to use can be specified through the `adtype` keyword argument. For example, if we want to use [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl) instead of the default [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl):

```{julia}
using ReverseDiff: ReverseDiff
# Specify that we want to use `AutoReverseDiff`.
sample(
    demo(),
    externalsampler(sampler; unconstrained=true, adtype=AutoReverseDiff()),
    10_000;
    progress=false
)
```

Double-neat.

## Summary

At this point it's worth reminding ourselves of what we did and also *why* we did it:

1. We define our models using the [LogDensityProblems.jl](https://github.com/tpapp/LogDensityProblems.jl) interface because it makes the sampler agnostic to how the underlying model is implemented.
2. We implement our sampler in the [AbstractMCMC.jl](https://github.com/TuringLang/AbstractMCMC.jl) interface, which just means that our sampler is a subtype of `AbstractMCMC.AbstractSampler` and we implement the MCMC transition in `AbstractMCMC.step`.
3. Points 1 and 2 make it so our sampler can be used with a wide range of model implementations, among them models implemented in both Turing.jl and Stan. This gives you, the inference implementer, a large collection of models to test your inference method on, in addition to allowing users of Turing.jl and Stan to try out your inference method with minimal effort.

[^1]: There is no such thing as a proper interface in Julia (at least not officially), and so we use the word "interface" here to mean a few minimal methods that need to be implemented by any type that we treat as a target model.

[^2]: We're going with the leapfrog formulation because in a future version of this tutorial we'll add a section extending this simple "baseline" MALA sampler to more complex versions. See [issue #479](https://github.com/TuringLang/docs/issues/479) for progress on this.
