Make AbstractMCMC.step function handle rng as part of state #116

Open
yebai opened this issue Jan 16, 2023 · 5 comments

@yebai
Member

yebai commented Jan 16, 2023

For continuing MCMC sampling from a previous stopping point, we need to store the rng as part of the sampling state.

https://github.com/TuringLang/AdvancedMH.jl/blob/e1741179e2505da57945d47b7b1debbf3f0e848b/src/mh-core.jl#L83

https://github.com/TuringLang/AdvancedMH.jl/blob/e1741179e2505da57945d47b7b1debbf3f0e848b/src/mh-core.jl#L90

@devmotion
Member

Isn't that a more general issue/question that is not specific for AdvancedMH? Also with other samplers you have to save the RNG if you want to continue with exactly the same stream of random numbers. But one can easily do that by passing an explicit RNG object and storing it separately when stopping sampling, I think? I would have assumed as well that in many cases it does not matter if one continues with a different RNG or differently seeded RNG, as long as the two streams of random numbers are not correlated and e.g. the seeds are sampled randomly.
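
A minimal sketch of the approach described above, using only Julia's `Random` standard library. The `draw_steps` helper is a hypothetical stand-in for a sampler, not part of AdvancedMH or AbstractMCMC; it only shows that a copied RNG continues the same stream, while a freshly seeded one gives an independent stream.

```julia
using Random

# Hypothetical toy "sampler": each step draws one standard-normal value from the given RNG.
draw_steps(rng::AbstractRNG, n::Integer) = [randn(rng) for _ in 1:n]

rng = Xoshiro(2023)              # explicit RNG owned by the caller
first_part = draw_steps(rng, 5)  # sampling before the stopping point

saved_rng = copy(rng)            # snapshot of the RNG at the stopping point

# Continuing from the snapshot reproduces what the original RNG produces next.
resumed = draw_steps(saved_rng, 5)
uninterrupted = draw_steps(rng, 5)
@assert resumed == uninterrupted

# Alternative: continue with a randomly seeded RNG, giving a separate stream.
fresh = draw_steps(Xoshiro(rand(RandomDevice(), UInt64)), 5)
```

The snapshot is taken with `copy` because the sampler mutates the RNG in place; keeping only a reference would not preserve the stopping point.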

@yebai
Member Author

yebai commented Jan 17, 2023

> Isn't that a more general issue/question that is not specific for AdvancedMH?

Yes, this is a more general issue ideally solved by AbstractMCMC.

> But one can easily do that by passing an explicit RNG object and storing it separately when stopping sampling, I think?
> I would have assumed as well that in many cases it does not matter if one continues with a different RNG or differently seeded RNG, as long as the two streams of random numbers are not correlated and e.g. the seeds are sampled randomly.

That indeed works for most cases. Consider a special case where we want to transfer the sampling process between machines. For example, we run a model for 10 minutes, save the states to disk and wait for the user to perform some convergence checks (or other actions). Later the user might decide to continue the sampling process for another 10 minutes. Under such circumstances, we don't know in advance how many MCMC steps will run within the given time constraints. So the rng has to be returned and stored with the sampling state, and a StableRNG has to be used, to guarantee full reproducibility. One natural place for "checkpointing" these rng states is the AbstractMCMC.step function, I think.
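
The time-budget scenario can be sketched as follows. This is an illustration under stated assumptions, not the AbstractMCMC API: `toy_step` and `run_for!` are hypothetical helpers, and the standard-library `Xoshiro` stands in for the `StableRNG` mentioned above (StableRNGs.jl is a separate package that would be swapped in when the stream must stay identical across Julia versions).

```julia
using Random, Serialization

# Hypothetical toy random-walk Metropolis step targeting a standard normal.
# It advances `rng` in place and returns the next position.
function toy_step(rng::AbstractRNG, x::Float64)
    proposal = x + 0.5 * randn(rng)
    log_ratio = (x^2 - proposal^2) / 2
    return log(rand(rng)) < log_ratio ? proposal : x
end

# Run until the time budget is used up; the number of steps is not known in advance.
function run_for!(rng::AbstractRNG, x::Float64, seconds::Real)
    samples = Float64[]
    deadline = time() + seconds
    while time() < deadline
        x = toy_step(rng, x)
        push!(samples, x)
    end
    return samples, x
end

rng = Xoshiro(1)
samples, x = run_for!(rng, 0.0, 0.05)

# Checkpoint: the sampler state and the RNG are written to disk together.
path = tempname()
serialize(path, (x = x, rng = rng))

# Later, possibly in another session: restore both and continue.
ckpt = deserialize(path)
restored_next = toy_step(ckpt.rng, ckpt.x)

# The restored run takes the same next step as the original run would.
@assert restored_next == toy_step(rng, x)
```

Because the step count depends on wall-clock time, the RNG cannot be reconstructed from the seed and a known number of draws; it has to be saved as it is at the stopping point. Note that the `Serialization` format may change between Julia releases, so a checkpoint moved between machines would need matching Julia versions or a more portable format.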

@devmotion
Member

I think that use case is also related to #109.

The annoying part about handling it in step is that every sampler package has to adjust for it. I wonder if it would be sufficient to handle it in bundle_samples and pass it there, together with e.g. the final state. This seems sufficient if one uses sample. And if one uses the iterator or transducer, one handles states, RNG etc. manually anyway, so saving the RNG should be trivial?

@yebai
Member Author

yebai commented Jan 17, 2023

> I wonder if it would be sufficient to handle it in bundle_samples and pass it there, together with e.g. the final state.

That could work well. We can treat these rng states as meta information and store them in chains, possibly together with other sampler states (e.g. HMC preconditioning matrix, leapfrog step size).
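
One way to picture "rng state as meta information" is the sketch below. The `ResumableChain` type and `bundle_with_meta` function are hypothetical names invented for illustration; they are not the `bundle_samples` signature of AbstractMCMC nor the chain type of any existing package.

```julia
using Random

# Hypothetical container: the samples plus the metadata needed to resume sampling.
struct ResumableChain{T,M<:NamedTuple}
    samples::Vector{T}
    meta::M
end

# Hypothetical stand-in for a bundling step that receives the final RNG and
# sampler settings (here only a step size) alongside the samples.
function bundle_with_meta(samples::Vector, rng::AbstractRNG; step_size::Real)
    # Copy the RNG so later draws from the live RNG do not change the stored one.
    return ResumableChain(samples, (rng = copy(rng), step_size = step_size))
end

rng = Xoshiro(7)
samples = [randn(rng) for _ in 1:100]
chain = bundle_with_meta(samples, rng; step_size = 0.1)

# Resuming from the stored RNG yields the same next draw as the live RNG.
@assert randn(copy(chain.meta.rng)) == randn(copy(rng))
```

Keeping the metadata in a `NamedTuple` leaves room for other sampler state, such as a preconditioning matrix, without changing the container type.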

@yebai transferred this issue from TuringLang/AdvancedMH.jl on Jan 17, 2023
@yebai
Member Author

yebai commented Jul 14, 2023

Related TuringLang/AdvancedHMC.jl#314

Cc @JaimeRZP: maybe we can switch to a non-mutating rng, then update AbstractMCMC.step to return the new rng state.
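
A rough sketch of what a non-mutating step could look like. `pure_step` and its return tuple are assumptions made for illustration; this is not the current `AbstractMCMC.step` signature.

```julia
using Random

# Hypothetical non-mutating step: the caller's RNG is left untouched and the
# advanced RNG is returned next to the new state.
function pure_step(rng::AbstractRNG, x::Float64)
    next_rng = copy(rng)          # work on a copy so `rng` is not advanced
    x_new = x + randn(next_rng)
    return x_new, next_rng
end

rng0 = Xoshiro(314)
x1, rng1 = pure_step(rng0, 0.0)
x2, rng2 = pure_step(rng1, x1)

# Replaying from any stored (state, rng) pair gives the same result.
x1_again, _ = pure_step(rng0, 0.0)
@assert x1_again == x1
```

With this shape, every returned `(state, rng)` pair is a valid checkpoint, at the cost of one RNG copy per step.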
