
Introduce Aqua.jl-based checks #379
Merged · 14 commits into master · Apr 16, 2024

Conversation

kellertuer
Member

I added tests with Aqua.jl.

While we have a few ambiguity warnings that I have ignored for now, they appear in different variants of the high-level solvers and should usually not be a problem.
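
For reference, such a test set roughly looks like the following – a sketch; the exact options and the exclusion list are not part of this description:

using Aqua, Manopt, Test

@testset "Aqua.jl quality assurance" begin
    Aqua.test_all(
        Manopt;
        # options for the ambiguity check are passed through to
        # Aqua.test_ambiguities; the ignored solver methods would be listed here
        ambiguities=(broken=false, exclude=[]),
    )
end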

But there is one case of unbound type parameters that I could not resolve. Maybe @mateuszbaran you can take a look? The problem with

@noinline function StoreStateAction(
    M::AbstractManifold;
    store_fields::Vector{Symbol}=Symbol[],
    store_points::Union{Type{TPS},Vector{Symbol}}=Tuple{},
    store_vectors::Union{Type{TTS},Vector{Symbol}}=Tuple{},
    p_init=rand(M),
    X_init=zero_vector(M, p_init),
    once=true,
) where {TPS<:Tuple,TTS<:Tuple}
    if store_points isa Vector{Symbol}
        TPS_tuple = tuple(store_points...)
    else
        TPS_tuple = Tuple(TPS.parameters)
    end
    if store_vectors isa Vector{Symbol}
        TTS_tuple = tuple(store_vectors...)
    else
        TTS_tuple = Tuple(TTS.parameters)
    end
    point_values = NamedTuple{TPS_tuple}(map(_ -> p_init, TPS_tuple))
    vector_values = NamedTuple{TTS_tuple}(map(_ -> X_init, TTS_tuple))
    return StoreStateAction(store_fields, point_values, vector_values, once; M=M)
end

is that, in case it is a vector of symbols, the two parameters TPS and TTS are declared but not used; I tried to localise them by putting them into the union (the trick that otherwise helped), but then the code uses TPS and TTS, which is not possible if they are local to the union. Is there any other good way to access the parameters you need?

I will keep this open for a while and maybe work a bit on the ambiguities, though resolving them might require a breaking change as well. At least for the last step size I will try to resolve that.

@kellertuer
Member Author

I think I found something. This was fun and I removed a few ambiguities already. The remaining ones I might think about, but they are a bit tricky when we allow both solver(M, f, grad_f, rest...; kwargs...) as well as solver(M, objective, rest...; kwargs...),
especially when rest also contains the two forms of sub solvers we currently support (sub-problem/state and a closed-form variant as a function).


codecov bot commented Apr 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.76%. Comparing base (f546183) to head (37af2ae).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #379   +/-   ##
=======================================
  Coverage   99.76%   99.76%           
=======================================
  Files          74       73    -1     
  Lines        7241     7243    +2     
=======================================
+ Hits         7224     7226    +2     
  Misses         17       17           


@mateuszbaran
Member

But there is one case of unbound type parameters that I could not resolve. Maybe @mateuszbaran you can take a look? The problem with

I will rewrite it

This was fun and I removed a few ambiguities already. The remaining ones I might think about, but they are a bit tricky when we allow both solver(M, f, grad_f, rest...; kwargs...) as well as solver(M, objective, rest...; kwargs...)

This may be a bit hard to resolve.

@kellertuer
Member Author

You would also usually not mix them up, because that would basically require providing an objective, a problem, and another problem or such, so I am even fine with keeping the high-level interfaces on ignore.
It is also just a few solvers that can be called in different formats: difference of convex has a few formats that “conflict” in an ambiguity you would never hit in practice.

@kellertuer
Member Author

I will rewrite it

Oh, a very nice rewrite, dispatching on that again :) Well done.
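
Roughly, the idea is to move the type-parameter access into a small helper and dispatch on the argument type – a sketch of the pattern, not the actual commit:

# hypothetical helper (not the actual Manopt code): the type parameter only
# appears in the method that uses it, so it is never unbound; the constructor
# then no longer needs the where {TPS<:Tuple,TTS<:Tuple} clause and can
# annotate the keywords as Union{Type{<:Tuple},Vector{Symbol}} instead
_symbol_tuple(keys::Vector{Symbol}) = tuple(keys...)
_symbol_tuple(::Type{T}) where {T<:Tuple} = Tuple(T.parameters)

# inside StoreStateAction one would then simply write
# TPS_tuple = _symbol_tuple(store_points)
# TTS_tuple = _symbol_tuple(store_vectors)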

@kellertuer
Member Author

I checked both the get_last_stepsize and the higher-level ones.
The last step size would need a bit of a (breaking) rework, which I could do before the next breaking release.

For the high-level ones, I neither feel that it would be easy to do nor that the ambiguity is a problem. One would basically have to really mix two of the documented forms and provide two sub-problem solvers, one in closed form, one as a problem, and not provide a state.
The reason I do not want to narrow down those parameters in type (which could also resolve the ambiguity) is that I do not want to restrict f or grad_f to, for example, Function.

@kellertuer force-pushed the kellertuer/adding-aqua branch from 005db33 to beb2c50 on April 13, 2024 11:11
@mateuszbaran
Member

mateuszbaran commented Apr 13, 2024

particle_swarm should actually be fixed IMO; see this call:

        M = Circle()
        f(::Circle, p) = p * 2 * p
        swarm = [-π / 4, 0, π / 4]
        s = particle_swarm(M, ManifoldCostObjective(f), swarm)
        @test s ≈ 0.0

I will take a look at other ones too.

EDIT:
stochastic_gradient_descent and difference_of_convex_proximal_point have a similar issue with the Number type.

@mateuszbaran
Member

truncated_conjugate_gradient_descent ambiguity should be harmless, and get_last_stepsize should probably be redesigned to fix those ambiguities.

@kellertuer
Member Author

Ah, I see the point with particle swarm; the main tricky thing really seems to be that we aim to cover the single-number case as well, but it should hopefully be fine with just one extra case each.

For the last stepsize I agree; I would consider that function more internal, though. Currently its signatures are a bit all over the place and should be minimised / unified; I can check whether I see a way without breaking (too much).

@kellertuer
Member Author

I fixed the last stepsize. There were a few too many interim dispatches that caused trouble, so I removed an unnecessary interim one that was also only explicitly called in a test but not used anywhere else. I documented the remaining ones more carefully and added an entry to the changelog about this.

@kellertuer
Member Author

I've taken a look at particle_swarm. The main problem is the two cases

  • particle_swarm(M, f, swarm) versus
  • particle_swarm(M, mco, swarm)

where for the first we have a method that also works if the swarm is a vector of numbers. For the second we do not, which leads to an ambiguity. The problem here is that, while mco does carry a cost inside and we could use get_cost_function to “extract” it and make it suitable for numbers, we can not “exchange” it in mco; at least I have no clue how to do that if the mco is, for example, “wrapped” in a cache.

Long story short, I can only resolve this by writing a particle_swarm(M, mco, swarm::AbstractVector{T}) where {T<:Number} that errors. The only alternative would be to remove the variant that works on numbers, but that would even break the tests we have on that.
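
As a self-contained toy illustration of the conflict and the erroring method (placeholder names, not the actual Manopt signatures):

# toy types and methods, only to show the shape of the ambiguity
abstract type ToyObjective end

toy_pso(M, f, swarm::AbstractVector{<:Number}) = :number_variant  # plain cost, swarm of numbers allowed
toy_pso(M, mco::ToyObjective, swarm) = :objective_variant         # wrapped objective, any swarm

# toy_pso(M, obj, [1.0, 2.0]) matches both methods and neither is more
# specific, so Julia reports an ambiguity; an explicit erroring method
# resolves it:
function toy_pso(M, mco::ToyObjective, swarm::AbstractVector{<:Number})
    return error("A swarm of numbers together with an objective is not supported.")
end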

@kellertuer
Member Author

difference_of_convex_proximal_point has exactly the same problem, namely that the only thing mandatory is a single function, just that there it is grad_h (instead of f), but the effect is completely the same as for PSO.

@mateuszbaran
Member

I see, that's indeed rather unfixable. I'd suggest adding an error hint like here:
https://github.com/JuliaManifolds/Manifolds.jl/blob/4cbf30f44654f0be462193fdea7a164a3d746264/src/Manifolds.jl#L581-L588 and keeping it an ambiguity.
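
For reference, such hints are registered with Base.Experimental.register_error_hint, usually in the package's __init__; a rough sketch with assumed argument checks, not the linked Manifolds.jl code:

# rough sketch; the hint is only printed in addition to the MethodError itself
function __init__()
    Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, kwargs
        if exc.f === particle_swarm && length(argtypes) >= 3 && argtypes[3] <: AbstractVector{<:Number}
            print(
                io,
                "\nCalling particle_swarm with an objective and a swarm of numbers is not supported; ",
                "pass the cost function directly or use a swarm of (one-element) arrays.",
            )
        end
    end
end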

@kellertuer
Member Author

truncated_conjugate_gradient_descent does not only cover a few deprecated cases, but also one where it is hard to distinguish the (optional) Hessian and the optional initial value (which unfortunately is both a point p and the tangent vector X), such that X and the Hessian get into conflict.
That leads to 3 ambiguities (the other 3 are resolved by a breaking change, removing the deprecated ones).

I am not sure what to do about these, since we usually do not type p or X, and while the Hessian is currently “recognized” by being a function, we already discussed that we maybe do not want that either.
One could use functors that subtype Function, but that is maybe also restrictive. But then maybe the general question is:

If both a function and the initial point are optional, how do we best distinguish them? In the best of cases I would prefer to keep both positional, since that is “nice to use”, but that is probably hard to make unambiguous.
So an alternative (but breaking) change could be to move all optional functions (like the Hessian here) into (very uniformly named) keyword arguments.
(This would not resolve the two problems above, but well.)

@kellertuer
Member Author

I see, that's indeed rather unfixable. I'd suggest adding an error hint like here:
https://github.com/JuliaManifolds/Manifolds.jl/blob/4cbf30f44654f0be462193fdea7a164a3d746264/src/Manifolds.jl#L581-L588 and keeping it an ambiguity.

I have no clue how to write such a hint, but something like “doing PSO with an objective and a swarm consisting of numbers is not supported” – one could of course also just start such an algorithm, assuming that a user providing an objective knows what they are doing and it will be the right one already, so one would just generate the “array swarm”.

@kellertuer
Member Author

Ah, one thing that might help (but maybe that is also a bit silly interface-wise) is to, for example, make the initial point always a tuple, so one would call

difference_of_convex_proximal_point(M, grad_h, (p0,))

which would remove the ambiguity with the Hessian, though not with the f-vs-objective (or grad_h-vs-objective) one.
I would prefer keeping p0 as the (always last) positional argument though, I think – because we dispatch on that for number types at least; otherwise one could consider moving p0 to an initial_value keyword or so (but that would again be quite breaking as well).

@mateuszbaran
Member

I have no clue how to write such a hint, but something like “doing PSO with an objective and a swarm consisting of numbers is not supported” – one could of course also just start such an algorithm, assuming that a user providing an objective knows what they are doing and it will be the right one already, so one would just generate the “array swarm”.

I will write that hint then.

truncated_conjugate_gradient_descent does not only cover a few deprecated cases, but also one where it is hard to distinguish the (optional) Hessian and the optional initial value (which unfortunately is both a point p and the tangent vector X), such that X and the Hessian get into conflict.

OK, could you mention it in a comment in the test file for Aqua?

If both a function and the initial point are optional, how do we best distinguish them? In the best of cases I would prefer to keep both positional, since that is “nice to use”, but that is probably hard to make unambiguous.
So an alternative (but breaking) change could be to move all optional functions (like the Hessian here) into (very uniformly named) keyword arguments.

I would suggest making the Hessian a keyword argument and keeping the rest positional. 7 different call signatures in a docstring is a bit much, and it doesn't have the "no Hessian, no p" variant there, which should probably also work?
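
A hypothetical sketch of that design with toy names, not the current Manopt API:

# toy placeholder for a default approximation, e.g. finite differences
struct ToyApproxHessian end
(::ToyApproxHessian)(M, p, X) = X  # placeholder action; a real default would approximate Hess f(p)[X]

# Hessian as a keyword with a default, while p (and X) stay positional,
# so the “is this positional argument a Hessian or a point?” overlap disappears
function tcg_sketch(M, f, grad_f, p, X; Hess_f=ToyApproxHessian(), kwargs...)
    return Hess_f(M, p, X)
end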

@kellertuer
Member Author

I would suggest making the Hessian a keyword argument and keeping the rest positional. 7 different call signatures in a docstring is a bit much, and it doesn't have the "no Hessian, no p" variant there, which should probably also work?

That we should check carefully and then introduce for all solvers that need a Hessian but are fine with an approximation (or maybe even more generally: all solvers that require objective parts that are optional / can be “filled” with a good default).

@mateuszbaran
Member

Ah, one thing that might help (but maybe that is also a bit silly interface-wise) is to, for example, make the initial point always a tuple, so one would call

difference_of_convex_proximal_point(M, grad_h, (p0,))

which would remove the ambiguity with the Hessian, though not with the f-vs-objective (or grad_h-vs-objective) one. I would prefer keeping p0 as the (always last) positional argument though, I think – because we dispatch on that for number types at least; otherwise one could consider moving p0 to an initial_value keyword or so (but that would again be quite breaking as well).

Making the initial point a tuple looks awful to me, and since it is, I think, the most important optional parameter of every solver, I would try to avoid making it a keyword argument.

That we should check carefully and then introduce for all solvers that need a Hessian but are fine with an approximation (or maybe even more generally: all solvers that require objective parts that are optional / can be “filled” with a good default).

Sure, that sounds like a good idea.

@kellertuer
Member Author

Ah, to be precise, p for TCG can never be optional, since it specifies the tangent space; the iterate is X, which is optional. I am not sure a random tangent space as a domain is something reasonable ;)

But I do agree, 7 signatures might be a few too many.

@mateuszbaran
Member

Ah, to be precise, p for TCG can never be optional, since it specifies the tangent space; the iterate is X, which is optional. I am not sure a random tangent space as a domain is something reasonable ;)

There is such a signature there (the third one):

truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f; kwargs...)

@kellertuer
Member Author

Making the initial point a tuple looks awful to me, and since it is, I think, the most important optional parameter of every solver, I would try to avoid making it a keyword argument.

I fully agree that that would be quite ugly, and I feel the idea “p=rand(M) is always the last positional argument” (for TCG: a random tangent vector ;)) is a very good design idea.

Then we should check, for the next breaking release, to make things like an optional Hessian a keyword argument. Some solvers already have f or similar as keywords, where they can be filled with reasonable defaults or are only necessary for additional debug opportunities.

@kellertuer
Member Author

I first thought this might just be a reasonable check to add; it turns out it even improves stability / reliability by rethinking a few choices that led to the ambiguities.

@kellertuer
Member Author

Is it OK if we do the rework of the interface (and its unification) in a new PR? I can also open an issue (and add it to the 1.0 roadmap) that sketches how to unify the high-level interfaces and how they accept functions.

@mateuszbaran (Member) left a comment


Yes, sure, that's fine.

@kellertuer merged commit 462f8f0 into master on Apr 16, 2024
15 checks passed
@kellertuer deleted the kellertuer/adding-aqua branch on May 4, 2024 17:30