Improving Gradient-guided Nested Sampling for Posterior Inference

https://arxiv.org/abs/2312.03911

to implement. A Markov chain Monte Carlo (MCMC) algorithm is used to sample from the reward distribution. Backward sampling from the resulting terminal states is then used to generate off-policy trajectories.
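The two steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's setup: a small discrete grid whose forward actions increment one coordinate, a toy two-mode reward, a plain Metropolis sampler standing in for the MCMC algorithm, and a uniform backward policy. The names `reward`, `metropolis_step`, `sample_terminal_states`, and `backward_trajectory` are hypothetical.

```python
import math
import random

H = 8  # grid side length (hypothetical toy value)
D = 2  # number of dimensions (hypothetical toy value)


def reward(x):
    """Toy unnormalized reward with two modes (an assumption, not the paper's)."""
    a = sum((xi - 1) ** 2 for xi in x)
    b = sum((xi - (H - 2)) ** 2 for xi in x)
    return math.exp(-a / 2.0) + math.exp(-b / 2.0) + 1e-3


def metropolis_step(x, rng):
    """One Metropolis step with a symmetric +/-1 proposal on one coordinate."""
    i = rng.randrange(D)
    y = list(x)
    y[i] += rng.choice((-1, 1))
    if not 0 <= y[i] < H:
        return x  # proposals that leave the grid are rejected
    y = tuple(y)
    if rng.random() < min(1.0, reward(y) / reward(x)):
        return y
    return x


def sample_terminal_states(n, burn_in=200, thin=5, seed=0):
    """Draw n approximate samples from the reward distribution via MCMC."""
    rng = random.Random(seed)
    x = tuple(rng.randrange(H) for _ in range(D))
    for _ in range(burn_in):
        x = metropolis_step(x, rng)
    samples = []
    for _ in range(n):
        for _ in range(thin):
            x = metropolis_step(x, rng)
        samples.append(x)
    return samples


def backward_trajectory(x, rng):
    """Sample a trajectory backward from terminal state x to the origin.

    Uniform backward policy: decrement a randomly chosen nonzero coordinate.
    The result is returned in forward order (origin first, x last).
    """
    traj = [x]
    while any(x):
        i = rng.choice([k for k, xi in enumerate(x) if xi > 0])
        x = x[:i] + (x[i] - 1,) + x[i + 1:]
        traj.append(x)
    traj.reverse()
    return traj


if __name__ == "__main__":
    rng = random.Random(1)
    for terminal in sample_terminal_states(3):
        print(backward_trajectory(terminal, rng))
```

Each returned trajectory starts at the origin and ends at an MCMC sample, so it could serve as off-policy training data for a forward policy; how such trajectories enter the training objective is not covered by this sketch.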