# Smoothing Spline Formulation¶

## Definition¶

The package implements cubic smooting spline algorithm proposed by Carl de Boor in his book “A Practical Guide to Splines” [1].

The smoothing spline \(f\) minimizes

where the first term is *error measure* and the second term is *roughness measure*.
Error measure weights \(w_j\) are equal to 1 by default.
\(D^2f\) denotes the second derivative of the function \(f\).

- The smoothing parameter \(p\) should be in range \([0, 1]\) where bounds are:
0: The smoothing spline is the least-squares straight line fit to the data

1: The natural cubic spline interpolant

The smoothing parameter p can be computed automatically based on the given data sites \(x\).

Note

The calculation of the smoothing spline requires the solution of a linear system whose coefficient matrix
has the form \(pA + (1 - p)B\), with the matrices \(A\) and \(B\) depending on the
data sites \(X\). The automatically computed smoothing parameter makes `p*trace(A) equal (1 - p)*trace(B)`

.

## Implementation¶

**csaps** is implemented as a pure (without C-extensions) Python modified port of MATLAB CSAPS function
that is an implementation of Fortran routine SMOOTH from PGS
(originally written by Carl de Boor). The implementation based on linear algebra routines and uses NumPy and sparse
matrices from SciPy.

## Differences from smoothing splines in SciPy¶

scipy.interpolate package contains the following classes:

These splines can be computed as \(k\)-ordered (0-5) spline and its smoothing parameter \(s\) specifies the number of knots by specifying a smoothing condition. Also it is only univariate and rect bivariate (2D grid) splines. The algrorithm cannot be used for vectorized computing splines for multivariate and nd-grid cases.

Also the performance of these splines depends on the data size and smoothing parameter `s`

because
the number of knots will be iterative increased until the smoothing condition is satisfied:

```
sum((w[i] * (y[i]-spl(x[i])))**2, axis=0) <= s
```

Unlike these splines the performance of **csaps** algorithm only depends on the data size and the data dimension.

**csaps** spline is cubic only and it has natural boundary condition type. The computation algorithm
is vectorized to compute splines for multivariate/gridded data. The smoothing parameter \(p\) determines
the weighted sum of terms and limited by the range \([0, 1]\). This is more convenient in practice
to control smoothing.

It is an example plot of comparison `csaps`

and `scipy.UnivariateSpline`

(k=3) with defaults (auto smoothing):

Footnotes