add Turing Example to docs (#86)
itsdfish authored Aug 14, 2023
1 parent 677b7b2 commit 8de85be
Showing 2 changed files with 120 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -13,6 +13,7 @@ makedocs(;
),
pages=[
"Home" => "index.md",
"Using with Turing" => "turing.md",
],
strict=true,
checkdocs=:exports,
119 changes: 119 additions & 0 deletions docs/src/turing.md
@@ -0,0 +1,119 @@
# Turing Example

This example demonstrates how to correctly compute PSIS LOO for a model developed with [Turing.jl](https://turinglang.org/stable/). Below, we show two ways to correctly specify the model in Turing. The key requirement is to specify the model so that a pointwise log density is computed for each observation.

To keep things simple, we will use a Gaussian model in each example. Suppose observations ``Y = \{y_1, y_2, \dots, y_n\}`` come from a Gaussian distribution with an unknown parameter ``\mu`` and known parameter ``\sigma=1``. The model can be stated as follows:

``\mu \sim \mathrm{Normal}(0, 1)``

``y_i \sim \mathrm{Normal}(\mu, 1), \quad i = 1, \dots, n``
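
Because the observations are conditionally independent given ``\mu``, the likelihood factorizes into one term per observation. The block below writes out that factorization together with the standard leave-one-out target that PSIS LOO approximates. The notation ``y_{-i}`` (all observations except ``y_i``) is introduced here for illustration and does not appear elsewhere in this example.

```math
p(Y \mid \mu) = \prod_{i=1}^{n} \mathrm{Normal}(y_i \mid \mu, 1),
\qquad
\mathrm{elpd}_{\mathrm{loo}} = \sum_{i=1}^{n} \log p(y_i \mid y_{-i})
```

Each term of the sum needs the log density of a single observation, which is why the model must expose the ``n`` factors separately rather than only their product.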

## For Loop Method

One way to specify a model to correctly compute PSIS LOO is to iterate over the observations using a for loop, as follows:
```julia
using Turing
using ParetoSmooth
using Distributions
using Random

Random.seed!(5)

@model function model(data)
    μ ~ Normal()
    for i in 1:length(data)
        data[i] ~ Normal(μ, 1)
    end
end

data = rand(Normal(0, 1), 100)

chain = sample(model(data), NUTS(), 1000)
psis_loo(model(data), chain)
```
The output below correctly indicates PSIS LOO was computed with 100 data points.

```julia
[ Info: No source provided for samples; variables are assumed to be from a Markov Chain. If the samples are independent, specify this with keyword argument `source=:other`.
Results of PSIS-LOO-CV with 1000 Monte Carlo samples and 100 data points. Total Monte Carlo SE of 0.064.
┌───────────┬─────────┬──────────┬───────┬─────────┐
│           │   total │ se_total │  mean │ se_mean │
├───────────┼─────────┼──────────┼───────┼─────────┤
│   cv_elpd │ -158.82 │     9.24 │ -1.59 │    0.09 │
│ naive_lpd │ -157.43 │     9.05 │ -1.57 │    0.09 │
│     p_eff │    1.39 │     0.19 │  0.01 │    0.00 │
└───────────┴─────────┴──────────┴───────┴─────────┘
```
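
As a quick sanity check on this table, the reported values are consistent with two simple relationships. We assume here, rather than take from the ParetoSmooth documentation, that `p_eff` is the difference between `naive_lpd` and `cv_elpd`, and that the `mean` column is the `total` column divided by the number of data points (100):

```math
p_{\mathrm{eff}} = \text{naive\_lpd} - \text{cv\_elpd} = -157.43 - (-158.82) = 1.39,
\qquad
\frac{\text{cv\_elpd}}{n} = \frac{-158.82}{100} \approx -1.59
```

An effective number of parameters close to 1 is what we would expect for a model with a single free parameter ``\mu``.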
## Dot Vectorization Method
The other method uses dot vectorization in the sampling statement: `.~`. Adapting the model above accordingly, we have:
```julia
using Turing
using ParetoSmooth
using Distributions
using Random

Random.seed!(5)

@model function model(data)
    μ ~ Normal()
    data .~ Normal(μ, 1)
end

data = rand(Normal(0, 1), 100)

chain = sample(model(data), NUTS(), 1000)
psis_loo(model(data), chain)
```
As before, the output correctly indicates PSIS LOO was computed with 100 observations.
```julia
[ Info: No source provided for samples; variables are assumed to be from a Markov Chain. If the samples are independent, specify this with keyword argument `source=:other`.
Results of PSIS-LOO-CV with 1000 Monte Carlo samples and 100 data points. Total Monte Carlo SE of 0.053.
┌───────────┬─────────┬──────────┬───────┬─────────┐
│           │   total │ se_total │  mean │ se_mean │
├───────────┼─────────┼──────────┼───────┼─────────┤
│   cv_elpd │ -158.71 │     9.23 │ -1.59 │    0.09 │
│ naive_lpd │ -157.44 │     9.06 │ -1.57 │    0.09 │
│     p_eff │    1.27 │     0.18 │  0.01 │    0.00 │
└───────────┴─────────┴──────────┴───────┴─────────┘
```
## Incorrect Model Specification
Although the model below is valid, it will not produce the correct results for PSIS LOO because it computes a single log likelihood for the data rather than one for each observation. Note the lack of `.` in the sampling statement.
```julia
using Turing
using ParetoSmooth
using Distributions
using Random

Random.seed!(5)

@model function model(data)
    μ ~ Normal()
    data ~ Normal(μ, 1)
end

data = rand(Normal(0, 1), 100)

chain = sample(model(data), NUTS(), 1000)
psis_loo(model(data), chain)
```
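In this case, the whole data set is treated as a single data point, so the standard errors cannot be computed (they are reported as `NaN`) and a warning about high Pareto k values is emitted: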
```julia
[ Info: No source provided for samples; variables are assumed to be from a Markov Chain. If the samples are independent, specify this with keyword argument `source=:other`.
┌ Warning: Some Pareto k values are high (>.7), indicating PSIS has failed to approximate the true distribution.
└ @ ParetoSmooth ~/.julia/packages/ParetoSmooth/AJM3j/src/InternalHelpers.jl:46
Results of PSIS-LOO-CV with 1000 Monte Carlo samples and 1 data points. Total Monte Carlo SE of 0.15.
┌───────────┬─────────┬──────────┬─────────┬─────────┐
│           │   total │ se_total │    mean │ se_mean │
├───────────┼─────────┼──────────┼─────────┼─────────┤
│   cv_elpd │ -158.57 │      NaN │ -158.57 │     NaN │
│ naive_lpd │ -157.91 │      NaN │ -157.91 │     NaN │
│     p_eff │    0.66 │      NaN │    0.66 │     NaN │
└───────────┴─────────┴──────────┴─────────┴─────────┘
```
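
The difference between the correct and incorrect specifications can be illustrated without Turing at all. The following is a minimal sketch in Base Julia, not a description of how ParetoSmooth or Turing work internally: `normal_logpdf` is a hypothetical helper, and `μ_draws` is a made-up stand-in for posterior draws rather than output from a real chain. It shows that the pointwise form keeps one log density per observation per draw, while the joint form keeps only their sum for each draw.

```julia
using Random

Random.seed!(5)

# Hypothetical helper: log density of Normal(μ, 1) at y.
# Equivalent to logpdf(Normal(μ, 1), y) from Distributions.
normal_logpdf(y, μ) = -0.5 * log(2π) - 0.5 * (y - μ)^2

data = randn(100)             # stand-in for the 100 observations
μ_draws = 0.1 .* randn(1000)  # assumed stand-in for 1000 posterior draws of μ

# Pointwise form (for loop or `.~`): one log density per observation per draw.
pointwise = [normal_logpdf(y, μ) for y in data, μ in μ_draws]

# Joint form (`data ~ Normal(μ, 1)`): one log density per draw,
# because the per-observation terms have been summed away.
joint = vec(sum(pointwise; dims=1))

size(pointwise)  # (100, 1000)
length(joint)    # 1000
```

Leave-one-out cross-validation needs the rows of `pointwise` to hold out one observation at a time. Once only `joint` is available, there is a single "observation" left to hold out, which matches the `1 data points` line in the output above.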
