Loess 0.6 fails (never completes) while Loess 0.5.4 completes in a few seconds #73
I've taken a closer look at this. There seem to be two things going on here. The first is that the old version used a different rule for deciding on splits when building the KD tree, one that differed from the rule in the Loess papers. The old approach just used the median even when it wasn't unique, whereas the papers search for the index where a change happens. It took some work to figure out exactly how they did it, so I wrote a comment. However, this is a linear search for where the x value changes, and there are a lot of ties in your data. Specifically, it searches forward and backwards for
and it has to do a bit of sorting for each iteration. The second issue is that the partial sort seems to be slower than it should be. If I add a function barrier then it's much faster, so it seems that the closure is causing some overhead. Even after that change, though, building the tree for your dataset is prohibitively slow. However, I tried the |
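To illustrate why ties make a change-point search expensive, here is a minimal sketch in Python (not the package's Julia code). It assumes the x values are already sorted. The names `median_split` and `tie_aware_split` are hypothetical, and the outward scan order is an assumption for illustration, not necessarily what the Loess papers or the package do.

```python
def median_split(xs):
    """Old-style rule (as described above): split at the middle position,
    even if the value there is tied with its neighbours."""
    return len(xs) // 2


def tie_aware_split(xs):
    """Scan outward from the middle of sorted `xs` for the nearest index i
    where xs[i - 1] != xs[i], i.e. where the x value changes.

    Returns (index, steps). `steps` counts the comparisons made, and
    `index` is None when every value is identical.
    """
    n = len(xs)
    mid = n // 2
    steps = 0
    for offset in range(n):
        # At offset 0 there is a single candidate; afterwards look
        # backwards first, then forwards.
        candidates = (mid,) if offset == 0 else (mid - offset, mid + offset)
        for i in candidates:
            if 0 < i < n:
                steps += 1
                if xs[i - 1] != xs[i]:
                    return i, steps
    return None, steps


if __name__ == "__main__":
    distinct = list(range(1002))
    tied = [0] + [1] * 1000 + [2]  # one long run of tied values
    print("distinct:", tie_aware_split(distinct))
    print("tied:    ", tie_aware_split(tied))
```

With distinct values the scan stops after one comparison, while a long run of ties forces it to walk roughly half the run before it finds a change. If a scan like this is repeated at every split on heavily tied data, the cost adds up quickly.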
Data for MWE (compressed because GitHub doesn't like .arrow as attachments): loess.arrow.zip
cc @dmbates