Thanks a lot for the great repo! Clean implementation of iLQR and easy to understand, it's really useful.
I have a question though: would it be possible to constrain the values of the control input u returned by the optimization procedure? That is, we want the optimal control trajectory, but with every control value kept inside given bounds. I'm happy to implement it myself if you know where and how it should be done, but I haven't been able to figure it out yet.
Let me know if you have any idea how to do that!
The cartpole and pendulum notebooks in examples/ actually constrain the action space by passing the control through a squashing function, tanh(u), before it reaches the dynamics, and then run iLQR on the unconstrained u. I believe something similar could work for constraining the state. Unfortunately, this is neither idiomatic nor ideal, because the gradient of tanh vanishes as the control saturates, but that's the best we can do within the limitations of iLQR.
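To make the squashing idea concrete, here is a minimal sketch in plain NumPy. The names `squash` and `squash_grad` are hypothetical and are not taken from the repo; the notebooks may implement this differently. It rescales tanh(u) so the applied control lies between `u_min` and `u_max`, and shows why the gradient vanishes near the bounds.

```python
import numpy as np

def squash(u, u_min, u_max):
    """Map an unbounded control u smoothly into [u_min, u_max] using tanh."""
    half_range = (u_max - u_min) / 2.0
    midpoint = (u_max + u_min) / 2.0
    return half_range * np.tanh(u) + midpoint

def squash_grad(u, u_min, u_max):
    """Derivative of squash with respect to u; it shrinks toward 0 as |u| grows."""
    half_range = (u_max - u_min) / 2.0
    return half_range * (1.0 - np.tanh(u) ** 2)

# The optimizer works on u_raw; the dynamics would receive u_applied.
u_raw = np.array([-10.0, 0.0, 0.5, 10.0])
u_applied = squash(u_raw, -2.0, 2.0)
u_slope = squash_grad(u_raw, -2.0, 2.0)
```

In this sketch the slope at `u_raw = 0` is the full half-range, while at `u_raw = ±10` it is close to zero, which is the vanishing-gradient issue mentioned above: once a control saturates, the optimizer gets almost no signal to move it back.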
Another option is to implement something like BoxQP that can deal with this more explicitly. I would gladly accept an implementation of that. It would probably fit well as another implementation of BaseController separate from the iLQR controller.
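For anyone picking this up, the core subproblem is a quadratic program with box constraints, solved for the control update at each step of the backward pass. The sketch below is not BoxQP itself: it swaps in plain projected gradient descent, which is simpler but slower, just to show the shape of the problem. The function name `solve_box_qp` and its parameters are hypothetical, and H is assumed to be symmetric positive definite.

```python
import numpy as np

def solve_box_qp(H, q, lower, upper, iters=500, tol=1e-10):
    """Minimize 0.5 * x^T H x + q^T x subject to lower <= x <= upper.

    Projected gradient descent; assumes H is symmetric positive definite.
    """
    # A step of 1 / (largest eigenvalue of H) guarantees descent.
    step = 1.0 / np.linalg.eigvalsh(H).max()
    # Start from the feasible point closest to the origin.
    x = np.clip(np.zeros_like(q), lower, upper)
    for _ in range(iters):
        grad = H @ x + q
        # Gradient step, then project back onto the box.
        x_new = np.clip(x - step * grad, lower, upper)
        if np.max(np.abs(x_new - x)) < tol:
            x = x_new
            break
        x = x_new
    return x

H = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([-4.0, 0.0])
lower = np.array([-1.0, -1.0])
upper = np.array([1.0, 1.0])
x_opt = solve_box_qp(H, q, lower, upper)  # approximately [1.0, -0.5]
```

In an iLQR setting one would presumably take H and q from the control Hessian and gradient of the Q-function, and set the bounds to the control limits minus the current nominal control, so that the updated control stays feasible. That mapping is an assumption here, not something the repo currently provides.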
Thanks for your reply!
I read the paper, and their solution does indeed seem better than squashing or clamping. However, I don't have time to implement it right now. I might try later, but for now I'll just use squashing as you do in the cartpole example. Hopefully it will be enough for my applications! Otherwise I'll let you know.