Parameter-Free Optimization Algorithms #81
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master      #81      +/-   ##
==========================================
- Coverage   96.09%   95.76%   -0.33%
==========================================
  Files          11       13       +2
  Lines         205      260      +55
==========================================
+ Hits          197      249      +52
- Misses          8       11       +3
```

☔ View full report in Codecov by Sentry.
This PR is now ready. @yebai @mhauru @sunxd3 If anybody could take a look, it would be great! Running the formatter unfortunately messed up the diff a little bit; apologies for this in advance. The main breaking change in this PR is that …
This is great! I learned a lot by reviewing.
Went through the algorithms in the paper; the implementations look correct.
A couple of really minor comments.
This PR adds some recent parameter-free optimization algorithms. These algorithms should provide close-to-optimal performance without any tuning. The hope is to choose one of these algorithms as the default strategy so that Turing users don't need to perform any tuning when using `AdvancedVI`. In particular, the COCOB algorithm has been reported to be very effective for particle variational inference algorithms; a sketch of its update rule is given below.

The PR also adds parameter-averaging strategies. Some of the parameter-free algorithms report their best performance when combined with parameter averaging (also sketched below). This paper also previously suggested that VI is best combined with parameter averaging (albeit determining when to start averaging through the $\widehat{R}$ measure).
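For concreteness, here is a minimal coordinate-wise sketch of the COCOB-Backprop update (Orabona & Tommasini, 2017). It is an illustration of the coin-betting idea, not the implementation added by this PR; the type name, the `step!` signature, and the small floor on `L` are assumptions made for this sketch.

```julia
# Sketch of the COCOB-Backprop update (Orabona & Tommasini, 2017).
# Names and structure are illustrative; they do not reflect this PR's API.
mutable struct COCOB
    α::Float64               # caps the betting fraction (the paper uses 100)
    w1::Vector{Float64}      # initial iterate, the betting reference point
    L::Vector{Float64}       # running maximum of |gradient| per coordinate
    G::Vector{Float64}       # running sum of |gradient| per coordinate
    reward::Vector{Float64}  # accumulated "winnings" per coordinate
    θ::Vector{Float64}       # running sum of negative gradients
end

COCOB(w1::Vector{Float64}; α=100.0) = COCOB(
    α, copy(w1),
    fill(1e-8, length(w1)),  # tiny floor on L to avoid division by zero
    zeros(length(w1)), zeros(length(w1)), zeros(length(w1)),
)

function step!(o::COCOB, w::Vector{Float64}, g::Vector{Float64})
    for i in eachindex(w)
        o.L[i]      = max(o.L[i], abs(g[i]))
        o.G[i]     += abs(g[i])
        o.reward[i] = max(o.reward[i] - (w[i] - o.w1[i]) * g[i], 0.0)
        o.θ[i]     -= g[i]
        # Bet a fraction of the current wealth (L + reward) in the
        # direction of the accumulated negative gradients.
        w[i] = o.w1[i] +
            o.θ[i] / (o.L[i] * max(o.G[i] + o.L[i], o.α * o.L[i])) *
            (o.L[i] + o.reward[i])
    end
    return w
end
```

Note there is no step-size parameter anywhere: the effective step size adapts purely from the observed gradient magnitudes, which is what makes this family attractive as a no-tuning default.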
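And here is a sketch of how parameter averaging can be layered on top of any optimizer, written as plain Polyak–Ruppert iterate averaging with a configurable start iteration. Again, the names and interface are assumptions for illustration; the PR's actual averaging strategies may differ.

```julia
# Sketch of Polyak–Ruppert iterate averaging with a configurable start
# iteration; illustrative only, not this PR's averaging interface.
mutable struct PolyakAveraging
    start::Int            # iteration at which averaging begins
    n::Int                # number of iterates averaged so far
    avg::Vector{Float64}  # running average of the iterates
end

PolyakAveraging(dim::Int; start=1) = PolyakAveraging(start, 0, zeros(dim))

function update!(a::PolyakAveraging, t::Int, w::Vector{Float64})
    t < a.start && return w              # before the start, pass through
    a.n += 1
    a.avg .+= (w .- a.avg) ./ a.n        # incremental mean of the iterates
    return a.avg
end
```

Deferring the start of averaging is what the $\widehat{R}$-based approach automates: averaging only pays off once the iterates have reached the stationary phase.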
Once the PosteriorDB project is done, we can probably run some large-scale experiments to determine which one is best.