Quasi-Newton and Conjugate Gradient on Stiefel/Unitary/Orthogonal Manifolds #349
-
Hello! I tested some configurations on the Brockett optimization problem (a diagonalization problem which is also featured in your github tests (test_quasi_Newton.jl)) and ran into some problems. In general, I don't really get the CG algorithm going. Polak-Ribiere doesn't work at all, Fletcher Reeves works sometimes, depending on the initial conditions and size of the problem. At times, I also have problems with the log function (leading to "matrix contains Infs/NaNs" errors), but this is not always reproducible. For the real Stiefel manifold, some log/exp tests fail when using test_manifold for larger matrices! (10x10 for example) Maybe the error tolerance has to be set larger here, I did not go too much into detail. Most importantly, which retractions, vector transports and stepsize algorithms would you recommend for these kind of problems? Ideally, I would like to work on the Unitary Matrices manifold with the exponential map, parallel transport and a polynomial linesearch, but I would need to include that to the module I guess. Do you think it would bring a performance boost, or are other methods similar in accuracy and performance? This is the code I am performing my tests with at the moment (it is less complicated than it looks at first glance):
Maybe you can help me with this, thank you in advance, |
Beta Was this translation helpful? Give feedback.
Replies: 4 comments 68 replies
-
Hi, I can check your method closer the next few days, but with so many lines of code, that really will take a bit of time, reading and following up on so much code – even if you say it is simpler than it looks – at first glance it is more like a wall-of-code. A first few comments
This is more a matter of Compared to that, the work on
This is a bit hard to help, if the problem is a bit vague in “could not get it to work”
Yes that is probably a tolerance issue
The retraction / vector transport depends a lot on the problem you have, the manifold size you habe and such. But for unitary both exp ( https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/generalunitary.html#Base.exp-Tuple{Manifolds.GeneralUnitaryMatrices,%20Any,%20Any} ) and log (https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/generalunitary.html#Base.log-Tuple{Manifolds.GeneralUnitaryMatrices,%20Any,%20Any}, maybe with a typo in the documentation it seems) are available, if they are, maybe try them first over retractions. The step size depends a lot on the problem at hand and the algorithm you want to use. For example „a polynomial linesearch“ is currently probably not available but feel free to contribute that. Since I have note yet worked with that I can not say anything about possible performance boost (of things that I did not yet use). |
Beta Was this translation helpful? Give feedback.
-
Hi, thanks for your quick response and your effort in general!
That would be amazing, thank you! Sorry for the mess, it's the options for all three manifolds (and random initialization) that blow up the code a bit. Here is a shorter version thats more understandable, I hope. For complete error messages, I'll come back to you, I will be on holidays next week and need to trace down the issues more systematically! The problem is basically the same as "Brocket" in "test/solvers/test_quasi_Newton.jl", but simpler because I have only square matrices. (I tried three manifolds, unitary, orthogonal and Stiefel as their generalization though). A Hermitian matrix H shall be diagonalized by optimizing the Brockett functional, using a diagonal matrix N with distinct entries. In the end, the important things are the two blocks U1 and U2, where the optimization takes place.
Wow thats really cool! I understand, I just wanted to point it out, I thought maybe it's an error that nobody is aware of.
That's true, I was a bit lost here and just tried some coefficients with several retractions and vector transport methods, but most did not converge or only very slowly. In the end, my goal is, to get a (limited memory) BFGS algorithm that can outperform our existing CG algorithm. I did program a L-BFGS already (following https://www.math.fsu.edu/~aluffi/archive/paper488.pdf), but it didn't perform as expected. In particular, more memory lead to worse performance, which is why I wanted to check for consistency and maybe also get a comparison to the "full" BFGS. Thanks a lot for your help! |
Beta Was this translation helpful? Give feedback.
-
Hi!
@kellertuer I think one problem here is that this line: Manopt.jl/src/solvers/quasi_Newton.jl Line 658 in ebd4833 How did you get the
I would recommend trying the new Hager-Zhang linesearch: https://github.com/mateuszbaran/ImprovedHagerZhangLinesearch.jl . It performs much better than WolfePowell in my experiments. There are still a few things I need to tweak there and add an installation instruction but I will try to do it in the next couple of days. Regarding retractions and VTs, it's very much problem-dependent. I'm currently working on an automatic tuner that chooses them: https://github.com/JuliaManifolds/Manopt.jl/pull/339/files#diff-ee1b3e2c8f8de306b636f704ead86f0db8063ab3af536fdf14239c7d85c9ca52 . If you are interested, I can help you try that on your problem. It uses a Bayesian hyperparameter optimization with early stopping, using Python's Optuna library (Julia doesn't have anything close in functionality).
L-BFGS is quite hard to get right but I would expect it to work better than CG. It's definitely a good idea to investigate. |
Beta Was this translation helpful? Give feedback.
-
I just saw that in the beginning you wrote
Are these still missing? If so, could you maybe point to the closed form you spoke about so we can add them to Manifolds.jl :) ? |
Beta Was this translation helpful? Give feedback.
Thanks, this apparently solved the issue with the algorithm freezing! (Before I was using WolfePowellBinaryLinesearch)