refactor interface for projections/proximal operators #147
Conversation
Benchmark Results
| Benchmark suite | Current: 635ea4e | Previous: 9887bb4 | Ratio |
|---|---|---|---|
| normal/RepGradELBO + STL/meanfield/Zygote | 15002944353 ns | 14589178867 ns | 1.03 |
| normal/RepGradELBO + STL/meanfield/ForwardDiff | 3213209140 ns | 3168929241 ns | 1.01 |
| normal/RepGradELBO + STL/meanfield/ReverseDiff | 3201755852 ns | 3202361222 ns | 1.00 |
| normal/RepGradELBO + STL/fullrank/Zygote | 14902278767 ns | 14433568159 ns | 1.03 |
| normal/RepGradELBO + STL/fullrank/ForwardDiff | 3604116961 ns | 3452371390 ns | 1.04 |
| normal/RepGradELBO + STL/fullrank/ReverseDiff | 5831854767 ns | 5762008942 ns | 1.01 |
| normal/RepGradELBO/meanfield/Zygote | 7098681152 ns | 6877924349 ns | 1.03 |
| normal/RepGradELBO/meanfield/ForwardDiff | 2361672495 ns | 2305614010.5 ns | 1.02 |
| normal/RepGradELBO/meanfield/ReverseDiff | 1459158489 ns | 1433074275 ns | 1.02 |
| normal/RepGradELBO/fullrank/Zygote | 7131701632 ns | 6826219707 ns | 1.04 |
| normal/RepGradELBO/fullrank/ForwardDiff | 2542847767 ns | 2553813983 ns | 1.00 |
| normal/RepGradELBO/fullrank/ReverseDiff | 2679946317 ns | 2543813427 ns | 1.05 |
| normal + bijector/RepGradELBO + STL/meanfield/Zygote | 23446836790 ns | 22908413254 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/meanfield/ForwardDiff | 10436881038 ns | 10230701628 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/meanfield/ReverseDiff | 5149189755 ns | 5103157400 ns | 1.01 |
| normal + bijector/RepGradELBO + STL/fullrank/Zygote | 23488381369 ns | 22668345453 ns | 1.04 |
| normal + bijector/RepGradELBO + STL/fullrank/ForwardDiff | 10953526681 ns | 10781493177 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/fullrank/ReverseDiff | 8405261867 ns | 8288136356 ns | 1.01 |
| normal + bijector/RepGradELBO/meanfield/Zygote | 14855248592 ns | 14406529461 ns | 1.03 |
| normal + bijector/RepGradELBO/meanfield/ForwardDiff | 9322626292 ns | 9026101677 ns | 1.03 |
| normal + bijector/RepGradELBO/meanfield/ReverseDiff | 3143892806 ns | 3137765457 ns | 1.00 |
| normal + bijector/RepGradELBO/fullrank/Zygote | 14925584523 ns | 14347124133 ns | 1.04 |
| normal + bijector/RepGradELBO/fullrank/ForwardDiff | 9458708256 ns | 9935024364 ns | 0.95 |
| normal + bijector/RepGradELBO/fullrank/ReverseDiff | 4589456910 ns | 4576538483 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
@yebai I'll mark the v0.3 release (at last!) after this PR
Could we have a test where the eigenvalues drift too low, so we can check both that it fails when not using `ProjectScale` and that it then succeeds when using `ProjectScale`? Just to very concretely see the effect, and to see that the first case fails in the expected (rather than some other, unexpected) way.
Except for the above request, I'm happy with the software engineering. I would prefer it, though, if someone else who has views on the design choices here gave a second, approving opinion. I have little idea of what users need and want from their interfaces here, e.g. whether the name `ProjectScale` is intuitive for users, or whether there should be a warning if someone tries to optimise a `LocationScale` without using `ProjectScale`.
Wrapping an optimiser inside `ProjectScale(...)` feels slightly strange to me. Using `ProjectScale` might be appropriate for a specific paper, but the terminology is not (yet) widely accepted. I think we could introduce an additional keyword argument to pass this information instead of overloading the optimiser argument for too many purposes.
Thank you both for chiming in! @yebai I was thinking of this as being similar in functionality to operations like gradient clipping. How about I change the name to `ClipScale`?
@Red-Portal Your proposal looks good!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
- Coverage   93.54%   91.76%   -1.79%
==========================================
  Files          12       13       +1
  Lines         372      352      -20
==========================================
- Hits          348      323      -25
- Misses         24       29       +5
```

☔ View full report in Codecov by Sentry.
@yebai Needed to change the API a little bit (see the summary at the top comment), do you agree with it?
Thanks @Red-Portal -- looks very good. I left two minor comments. Otherwise, this is ready to go!
Thanks @Red-Portal!
This PR refactors how post-hoc modifications are applied to the iterates after performing a gradient descent step. For instance, before, updating the parameters of `LocationScale` always silently applied a projection step. Now, each such modification needs to be made into its own `OptimisationRule`, which makes things more modular and explicit.

More concretely, this PR changes the following:
- The `LocationScale` distribution is no longer projected by default.
- A new keyword argument, `operator`, is added to `optimize`.
- The `operator` object is applied to the parameters after each gradient descent step.
- A new operator, `ClipScale`, is added, which clips the diagonal of the scale matrix to be strictly positive.

For example:
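A minimal sketch, assuming a `LogDensityProblems`-style target, the `MeanFieldGaussian` constructor, and the keyword names shown below; the exact `optimize` signature may differ from this sketch:

```julia
using AdvancedVI, Optimisers, LinearAlgebra
using ADTypes, ForwardDiff, LogDensityProblems

# Toy target: a standard 2-D Gaussian in LogDensityProblems form.
struct Target end
LogDensityProblems.logdensity(::Target, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::Target) = 2
LogDensityProblems.capabilities(::Type{Target}) = LogDensityProblems.LogDensityOrder{0}()

# Mean-field location-scale initialisation (constructor name assumed).
q0 = MeanFieldGaussian(zeros(2), Diagonal(ones(2)))

# The projection is no longer implicit: request it via the `operator` keyword.
result = optimize(
    Target(), RepGradELBO(10), q0, 5_000;
    adtype    = AutoForwardDiff(),
    optimizer = Optimisers.Adam(1e-3),
    operator  = ClipScale(),   # clip the scale diagonal so it stays strictly positive
)
```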
For gradient descent, the operator is applied as

$$\lambda_{t+1} = \mathrm{operator}\left(\lambda_t - \gamma_t \, g_t\right),$$

where $\lambda_t$ are the variational parameters, $g_t$ is the gradient estimator, and $\gamma_t$ is the stepsize.
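Schematically, in plain Julia (a sketch of the update rule above, not the package's internal code; the clipping threshold `ϵ` is an illustrative choice):

```julia
# One update: a plain gradient descent step, then the post-hoc operator.
# `clip_scale` stands in for what ClipScale does on the scale diagonal.
clip_scale(location, scale_diag; ϵ=1e-5) = (location, max.(scale_diag, ϵ))

function descent_step(location, scale_diag, g_loc, g_scale, γ)
    # λ_t - γ_t g_t: gradient step on both parameter blocks
    loc_new   = location   .- γ .* g_loc
    scale_new = scale_diag .- γ .* g_scale
    # operator(⋅): keep the scale diagonal strictly positive
    return clip_scale(loc_new, scale_new)
end

# A step that would push the scale diagonal negative gets clipped back:
loc, scale = descent_step(zeros(2), [0.1, 0.2], [0.0, 0.0], [1.0, 1.0], 0.5)
# scale == [1.0e-5, 1.0e-5]
```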