We often plot some data on a graph and wonder if there might be a simple mathematical function to describe what we see:
In many cases, a polynomial approximation would be particularly useful. That is, we want to describe the line in the form:
y(x)=a0+a1x+a2x2+…+aNxN(1)y(x)=a0+a1x+a2x2+…+aNxN(1)
for some maximum power N (called the order of the polynomial). There is an unbelievably simple trick for doing this, which for the above graph tells us that the data is well approximated by the following 3rd order polynomial:
y(x)≈−33.37+76.69x−10.11x2+0.34x3(2)y(x)≈−33.37+76.69x−10.11x2+0.34x3(2)
We can confirm the accuracy of this approximation by plotting it with the original data:
I will now describe how to do this. I will give specific details (and source code) for a Matlab implementation, but will also describe the theory so you can implement the method in any way you choose.
The Method
This method is so simple, it can be done using just 2 lines of Matlab code:
The first line simply creates a (L+1 x N+1) matrix, C, whose (n,p)th element is np. The second line forms the Moore-Penrose pseudoinverse of C and multiplies this with the input data vector, y.
Why does this work? Well, if we insert x = 0, 1, 2, ..., L into Equation (1), then we are saying that the data values will be approximated by:
\[y[0] \; = \; a_0 \\
y[1] \; = \; a_0 \; + \; a_1 \; + \; ... \; + \; a_N \\
y[2] \; = \; (2^0)a_0 \; + \; (2^1)a_1 \; + \; ... \; + \; (2^N)a_N \\
\; \; \; \; \; \; \; \vdots \\
y[L] \; = \; (L^0)a_0 \; + \; (L^1)a_1 \; + \; ... \; + \; (L^N)a_N\\
\]
y[1] \; = \; a_0 \; + \; a_1 \; + \; ... \; + \; a_N \\
y[2] \; = \; (2^0)a_0 \; + \; (2^1)a_1 \; + \; ... \; + \; (2^N)a_N \\
\; \; \; \; \; \; \; \vdots \\
y[L] \; = \; (L^0)a_0 \; + \; (L^1)a_1 \; + \; ... \; + \; (L^N)a_N\\
\]
We can rewrite this as a matrix equation:
\[
\left[
\begin{array}{c}
y[0] \\
y[1] \\
y[2] \\
\vdots \\
y[L]
\end{array}
\right] \; = \; \left[
\begin{array}{ccccc}
1 & 0^{1} & 0^{2} & \cdots & 0^{N} \\
1^{0} & 1^{1} & 1^{2} & \cdots & 1^{N} \\
2^{0} & 2^{1} & 2^{2} & \cdots & 2^{N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
L^{0} & L^{1} & L^{2} & \cdots & L^{N}
\end{array}
\right] \left[
\begin{array}{c}
a_{0} \\
a_{1} \\
a_{2} \\
\vdots \\
a_{N}
\end{array}
\right]
\]
\left[
\begin{array}{c}
y[0] \\
y[1] \\
y[2] \\
\vdots \\
y[L]
\end{array}
\right] \; = \; \left[
\begin{array}{ccccc}
1 & 0^{1} & 0^{2} & \cdots & 0^{N} \\
1^{0} & 1^{1} & 1^{2} & \cdots & 1^{N} \\
2^{0} & 2^{1} & 2^{2} & \cdots & 2^{N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
L^{0} & L^{1} & L^{2} & \cdots & L^{N}
\end{array}
\right] \left[
\begin{array}{c}
a_{0} \\
a_{1} \\
a_{2} \\
\vdots \\
a_{N}
\end{array}
\right]
\]
or, defining a variable for each of the bracketed terms, simply:
y–=Ca––(3)y_=Ca_(3)
So, all we have to do now is find the vector a which best fits this equation (allowing us to then simply plug a0, a1, a2, ..., aN back into Equation (1)). I say "best fits" because an exact solution may not exist. For L>N, this is an overdetermined system of equations, so we may have to settle for an approximate solution. The most popular approximate solution I know of is the "minimum norm" solution, which is given by:
a––approx=C+y–a_approx=C+y_
where C+ denotes the Moore-Penrose pseudoinverse of C. And... that's all there is to it! We can check our approximation by substituting aapprox back into Equation (3):
y–approx=Ca––approxy_approx=Ca_approx
and plotting this against y, as I did above.
Final Comments
There are two issues that should be mentioned:
- Overfitting
- Numerical accuracy
Numerical rounding errors can become an issue, particularly for larger N. Again, stick to lower N where possible. I have also heard that a QR decomposition of C will often yield a slightly more numerically-accurate solution for a. In this case, the Matlab code for computing aapprox would be:
This solution can then be refined as follows:
Code Matlab M - [expand] 1 2 3 r = y - C*a; err = R\(R'\(C'*r)); a = a + err;
(and this refinement can be repeated iteratively).
Matlab Code
Full Matlab code:
View attachment poly_approx.zip
Simplified code with minimal comments:
Code Matlab M - [expand] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 % Maximum order of polynomial N = 3; % Generate some input data L = 24; % maximum x value (starting from 0) x = (0:L).'; y = (1/3)*x.^3 - 10*x.^2 + 75*x - 33; y = y + 10*randn(L+1,1); % add some random noise to make it less obvious % Calculate a (approx) C = bsxfun([USER=311]@power[/USER], (0:L).', (0:N)); a = pinv(C)*y; % Check our approximation of y y_approx = C*a; % Plot results plot(x,y); grid on; hold on; plot(x,y_approx,'--r'); xlabel('x'); ylabel('y'); legend('Input data', 'Polynomial Approximation');