Summary
Link
Model-based reinforcement learning for biological sequence design
Author/Institution
Christof Angermueller, David Dohan, David Belanger, Ramya Deshpande, Kevin Murphy, Lucy Colwell
What is this
Ref: Algorithm 1: DyNA PPO
Comparison with previous research. What are the novelties/good points?
Key points
Our method updates the policy’s parameters using sequences x generated by the current policy πθ(x), but evaluated using a learned surrogate f'(x), instead of the true, but unknown, oracle reward function f(x).
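A minimal sketch of this inner model-based loop, written only for illustration: a tiny per-position categorical policy stands in for the paper's autoregressive policy network, a plain REINFORCE-style update stands in for PPO, and surrogate_reward is a hypothetical stand-in for the learned model f'(x); the toy reward, constants, and all function names are assumptions, not the authors' implementation.

```python
import numpy as np

# Toy problem size (assumed for illustration only).
ALPHABET_SIZE = 4   # e.g. four DNA bases
SEQ_LEN = 8

rng = np.random.default_rng(0)

# pi_theta: independent per-position categorical logits -- a deliberately tiny
# stand-in for the autoregressive policy network used in the paper.
theta = np.zeros((SEQ_LEN, ALPHABET_SIZE))


def softmax(logits):
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


def sample_sequences(theta, n):
    """Sample n sequences x ~ pi_theta(x)."""
    probs = softmax(theta)
    return np.array([[rng.choice(ALPHABET_SIZE, p=probs[t]) for t in range(SEQ_LEN)]
                     for _ in range(n)])


def surrogate_reward(x):
    """Hypothetical stand-in for the learned surrogate f'(x).

    Here it is just a fixed toy score; in the paper it is a model fit on
    the sequences already labelled by the true oracle f(x)."""
    return float((x == 0).sum())


def policy_gradient_step(theta, xs, rewards, lr=0.05):
    """One REINFORCE-style update from surrogate rewards (the paper uses PPO)."""
    probs = softmax(theta)
    baseline = rewards.mean()
    grad = np.zeros_like(theta)
    for x, r in zip(xs, rewards):
        for t, a in enumerate(x):
            # grad_theta log softmax(theta[t])[a] = one_hot(a) - probs[t]
            grad[t] += (r - baseline) * (np.eye(ALPHABET_SIZE)[a] - probs[t])
    return theta + lr * grad / len(xs)


# Inner model-based loop: roll out the current policy, score the rollouts with
# the surrogate f'(x), and update theta -- no calls to the true oracle here.
for _ in range(50):
    xs = sample_sequences(theta, n=32)
    rewards = np.array([surrogate_reward(x) for x in xs])
    theta = policy_gradient_step(theta, xs, rewards)

print("mean surrogate reward after training:",
      np.mean([surrogate_reward(x) for x in sample_sequences(theta, 32)]))
```

In the paper itself the surrogate is refit on the oracle-labelled sequences collected in each round, and the policy update is PPO rather than plain REINFORCE (see Algorithm 1: DyNA PPO).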
How did the authors prove the effectiveness of the proposal?
Any discussions?
What should I read next?