Fixes to Bayes SDE notebook #449
Conversation
```diff
-data[:, i] ~ MvNormal(predicted[i], σ^2 * I)
+for j in 1:length(predicted)
+    for i in 1:length(predicted[j])
+        data[:, i] ~ MvNormal(predicted[j][i], σ^2 * I)
```
Consider adding a bit of explanation on the reasoning behind this likelihood function, e.g. following discussions in TuringLang/Turing.jl#2216
Thanks, @gjgress -- this looks like a pseudo-marginal HMC algorithm, although the stochasticity in the likelihood function might interact with the simulated Hamiltonian dynamics of HMC. It seems like a correct MCMC algorithm, but it is not entirely clear that this has been done in the literature.
It's my understanding that pseudo-marginal HMC, as introduced in Alenlöv et al., 2016, is a bit more nuanced than what is happening here. Even if you have an unbiased estimate of the likelihood, that doesn't mean your gradient estimates are unbiased, and so your dynamics are incorrect. The approach taken in the PM-HMC paper is quite a bit more complicated. To avoid having a confusing or misleading example in the docs, it might be wiser to fit this model using MH, as in the PM-MH framework, which is a valid "exact approximate" algorithm. That said, you could stick with the current approach, but I think you'd need to make it clear that this is an approximate method.
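To make the PM-MH suggestion concrete, here is a minimal Python sketch of the pseudo-marginal Metropolis-Hastings idea (not Turing code; a toy Gaussian latent variable stands in for simulated SDE trajectories, and all names are hypothetical). The key property is that the likelihood is only ever *estimated* by Monte Carlo, yet the chain still targets the exact posterior, provided the stored estimate for the current state is reused rather than recomputed:

```python
import math
import random

def loglike_hat(theta, n_particles=64, rng=random):
    # Unbiased Monte Carlo estimate of the likelihood of one observation
    # y = 1.2 under a toy model y ~ Normal(theta + eps, 1), eps ~ Normal(0, 1)
    # (the latent eps plays the role of a simulated SDE path).
    y = 1.2
    ws = []
    for _ in range(n_particles):
        eps = rng.gauss(0.0, 1.0)
        mu = theta + eps
        ws.append(math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2 * math.pi))
    return math.log(sum(ws) / n_particles)

def pm_mh(n_steps=2000, step=0.5, seed=0):
    # Pseudo-marginal MH with a flat prior on theta: accept/reject using
    # noisy log-likelihood estimates in place of the exact log-likelihood.
    rng = random.Random(seed)
    theta = 0.0
    ll = loglike_hat(theta, rng=rng)
    samples = []
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)
        ll_prop = loglike_hat(prop, rng=rng)
        # Crucially, reuse the stored estimate for the current state;
        # re-estimating it each iteration would break exactness.
        if math.log(rng.random()) < ll_prop - ll:
            theta, ll = prop, ll_prop
        samples.append(theta)
    return samples
```

With the marginal likelihood here being Normal(theta, sqrt(2)) at y = 1.2, the chain should concentrate around theta near 1.2. The same recipe would apply to the SDE model by replacing `loglike_hat` with an average over simulated trajectories.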
The discussion on pseudo-marginal samplers or sequential Monte Carlo methods is not something I'm qualified to comment on, nor implement myself. From the viewpoint of creating a tutorial, I would say that using either of those tools is outside the scope of this tutorial, and ought to be its own tutorial (if it's even within the scope of Turing as a whole -- I could not find an implementation of either of these methods using Turing during my search). Of course, if it can be done it would be fantastic. Based on what others have said, it seems that except under particular conditions, it is generally bad form to use the method here, and so using this method without at least mentioning its approximate nature may be misleading to less experienced users trying to use Turing for their SDE problems. In any case, there isn't more for me to add to this tutorial, as I am not knowledgeable enough on this topic to add that commentary to the tutorial.
Many thanks @gjgress for investigating the issue and the fixes to the tutorial. I agree that we seem to have touched on a research-grade problem, which is great, but of course beyond the scope of a tutorial.
For clarity, I am not sure this is incorrect either: the noise in the gradient might not matter as long as the MH acceptance step is correct, but of course, the resulting HMC might be suboptimal in terms of efficiency.
Updated retcode success check to the current version used by SciMLBase. Introduced new noisy observations that are better suited for the problem, and corrected the model to calculate the likelihood based on multiple trajectories rather than a single trajectory.
#446
TuringLang/Turing.jl#2216