Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

B-splines and incomplete grid #72

Open
sdrdis opened this issue Oct 25, 2016 · 5 comments
Open

B-splines and incomplete grid #72

sdrdis opened this issue Oct 25, 2016 · 5 comments

Comments

@sdrdis
Copy link
Contributor

sdrdis commented Oct 25, 2016

Hi,

Any indication on when B-spline will be able to work with incomplete grids ? Does an implementation / hack exists for 2D grids ?

Thanks,

@bgrimstad
Copy link
Owner

bgrimstad commented Oct 26, 2016

Hello,

Thank you for showing your interest in SPLINTER.

Actually, we do have two different approaches that work with scattered data (incomplete grids). Both implementations are experimental.

The first approach requires an understanding of how higher-dimensional splines are built from the knot vectors. Put simply: In 1-D, a B-spline is built from one knot vector which defines a grid on the real line. In 2-D, there are two knot vectors which forms a rectangular grid. Generally, in N-D, the knot vectors form a hyper-rectangular grid, so to speak. These grids are regular by construction and the number of grid points grows exponentially as the number of dimensions increase. This limits the conventional (tensor product) B-spline to 5-6 dimensions for all practical purposes.

Now, after the grid and B-spline degree has been decided upon, the remaining step in building a B-spline is to find coefficients so that we get a good match with the data. This involves solving a linear system of equations with the coefficients being the unknowns. The knot vectors decide the number of B-spline coefficients. They also decide the "area of effect" of each coefficient (read about the local support property of splines). Usually, it is a good idea to have at least as many data points as coefficients and to use regularization to prevent overfitting.

If you have scattered points, how would you select the knot vectors? There are some requirements or guidelines:

  1. all data points should lie within the grid (otherwise they will not be fitted to as the B-spline is zero outside of the grid)
  2. the number of grid points should not be much larger than the number of data points
  3. if there is a cluster of data points you may want a grid with high resolution in that area to capture the details in the data

If you put some effort into thinking about this problem, you will quickly find that selecting the grid, when the data points are scattered, is a very difficult problem. For example, the trade-off between 2. and 3. is not trivial.

If you construct a B-spline with the setting BSpline::KnotSpacing::EXPERIMENTAL, the builder will attempt to select good knot vectors from the data points (which may be scattered). The approach combines some logic with a sliding window approach to adapt the resolution of the grid to the data. We cannot guarantee that this works well for all data set (there are some cases in which the current implementation will give poor results). If you want to try this, I strongly suggest that you combine this setting with regularization and that you inspect the generated knot vectors.

The second approach is to use Gradient Boosting with splines. We have a light-weight implementation of this in Python. This algorithm will create a generalized additive model of splines, which is likely to give you better performance than the first approach I sketched above.

It depends on which problem you want to solve and your requirements, but I would generally recommend that you use the second approach (gradient boosting).

Let us know which path you choose to travel down.

Best regards,
Bjarne

@sdrdis
Copy link
Contributor Author

sdrdis commented Nov 1, 2016

Thank you very much for your complete answer, and sorry for my late answer. I will download the Development branch and try the second approach, and let you know if it works.

@bgrimstad
Copy link
Owner

Please do. You are the first tester outside of the development team so we would appreciate any feedback.

@sjulier
Copy link

sjulier commented Jan 14, 2018

We are interested in interpolating with scattered data. Has there been any development / feedback since these posts? I see the documentation still lists using scattered data as experimental and not advised.

@bgrimstad
Copy link
Owner

Hi @sjulier,

The development on interpolation with scattered data has not been prioritized since I wrote the comment above. As of now, there is a Python example in the bspline-interface branch showing Option 1. I have attached a figure below, showing the result of running the script. We will hopefully be able to prepare the release of v. 4, which enables this feature, in the near future.

May I ask for what purpose you would want to interpolate a B-spline to scattered data? I would disadvise using this method for predictive modeling as it likely will overfit the data. Regularization will help, but there are other methods that are better suited for this use case.

bspline_scattered_data_interpolation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants