

9.3 Curve fitting and Regression
Regression analysis is the statistical term for curve fitting. We produce a curve that best fits some observed data points. Using regression, we can make predictions as to the behavior of some property in the future.
Polynomial curve fitting can be performed for any degree, and Matlab offers two simple functions for this purpose: polyfit and polyval.
Given a data set of sine values, we attempt to find a polynomial of some degree that fits the data:
x = 0:0.1:3;
y = sin(2*x);
plot(x, y);



Figure 9.5


Using the function polyfit, we pass parameters for the range of data (x), the actual values (y), and the degree of the polynomial to which the data is to be fit. For degree 3, the call p = polyfit(x, y, 3) returns:
[ 0.7548 -3.5303 3.7832 -0.1701 ]


Figure 9.6


polyfit returns a vector with (DEGREE + 1) elements, corresponding to the polynomial coefficients, starting with the highest degree.
In the above example with 3rd degree:
p(1) = coefficient of x^{3}
p(2) = coefficient of x^{2}
p(3) = coefficient of x^{1} = x
p(4) = coefficient of x^{0} = 1
i.e.:
y = 0.7548 * x^{3} - 3.5303 * x^{2} + 3.7832 * x - 0.1701
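The mapping from coefficient vector to polynomial can be checked with a short sketch. It repeats the cubic fit on the sine data from this section and expands the polynomial by hand at a single point; the sample point x0 and the variable name yManual are illustrative choices, not part of the original example.

```matlab
% Fit a cubic to the sampled sine data used in this section.
x = 0:0.1:3;
y = sin(2*x);
p = polyfit(x, y, 3);      % p holds degree + 1 = 4 coefficients

% Expand the polynomial by hand at one point (x0 is an arbitrary choice).
x0 = 1.5;
yManual = p(1)*x0^3 + p(2)*x0^2 + p(3)*x0 + p(4);

fprintf('coefficients: %s\n', mat2str(p, 4));
fprintf('fit at x0 = %.1f: %.4f, sin(2*x0) = %.4f\n', x0, yManual, sin(2*x0));
```

The printed coefficients should be close to the vector shown above, highest degree first.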
Using polyval, the vector of coefficients can be evaluated for any data range of x. polyval takes as input parameters a vector of coefficients and the data range over which a corresponding y vector is to be formed.
For example, generating the new y values:


Figure 9.7
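The evaluation step can be sketched as follows. The loop is only there to make explicit what polyval computes, written here as Horner's scheme; the name yHorner is illustrative, and the loop is an equivalent formulation rather than a statement about how Matlab implements polyval internally.

```matlab
x = 0:0.1:3;
y = sin(2*x);
p = polyfit(x, y, 3);

% polyval evaluates the polynomial at every element of x.
yFit = polyval(p, x);

% The same evaluation written out with Horner's scheme, for comparison.
yHorner = zeros(size(x));
for k = 1:numel(p)
    yHorner = yHorner .* x + p(k);
end

% Largest difference between the two results (rounding error only).
maxDiff = max(abs(yFit - yHorner));
fprintf('max difference between polyval and manual loop: %g\n', maxDiff);
```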


Plot both the original and the fitted data over the same range:
yFit = polyval(p, x);
hold on;
plot(x, y, 'b');       % original data in blue
plot(x, yFit, 'r');    % fitted curve in red
hold off;
legend('Original', 'Degree 3');
title('Polynomial Curve fit');



Figure 9.8


Generate y data for a new data range and plot both:
xNew = 2:0.1:5;
yFit = polyval(p, xNew);
hold on;
plot(x, y, 'b');          % original data on 0..3
plot(xNew, yFit, 'r');    % fitted polynomial on 2..5
hold off;
legend('Original', 'Degree 3');
title('Polynomial Curve fit');



Figure 9.9


Of course, the approximated curve is not a sine curve. It fits only in the range over which it was fitted, as a comparison with the actual sine values over the new range shows:
xNew = 2:0.1:5;
ySine = sin(2*xNew);
hold on;
plot(xNew, ySine, 'b');   % actual sine values on 2..5
plot(xNew, yFit, 'r');    % fitted polynomial on 2..5
hold off;
legend('Original', 'Degree 3');
title('Polynomial Curve fit');



Figure 9.10
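The difference between the two ranges can also be put into numbers. The sketch below measures the largest absolute error of the cubic inside the fitted range (up to x = 3) and outside it; the names errInside and errOutside are illustrative.

```matlab
x = 0:0.1:3;
p = polyfit(x, sin(2*x), 3);

% Evaluate on the new range and split it at the end of the fitted range.
xNew = 2:0.1:5;
inside = xNew <= 3;

errInside  = max(abs(polyval(p, xNew(inside))  - sin(2*xNew(inside))));
errOutside = max(abs(polyval(p, xNew(~inside)) - sin(2*xNew(~inside))));

fprintf('max error inside fitted range:  %.4f\n', errInside);
fprintf('max error outside fitted range: %.4f\n', errOutside);
```

The error outside the fitted range is far larger, because the cubic term dominates as x grows while the sine stays between -1 and 1.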


We attempt the same approach for the gasoline prices for the East Coast:
x = 1:size(data,1);
y = data(:,1)';           % first column of data as a row vector
p = polyfit(x, y, 3);
yFit = polyval(p, x);
hold on;
plot(x, y, 'b');
plot(x, yFit, 'r');
hold off;
legend('Original', 'Degree 3 prediction');
title('Polynomial Curve fit for Gas Prices in East Coast Region');



Figure 9.11


For a closer fit, we try degree 10:
x = 1:size(data,1);
y = data(:,1)';
p = polyfit(x, y, 10);    % degree 10 instead of 3
yFit = polyval(p, x);
hold on;
plot(x, y, 'b');
plot(x, yFit, 'r');
hold off;
legend('Original', 'Degree 10 prediction');
title('Polynomial Curve fit for Gas Prices in East Coast Region');



Figure 9.12
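At degree 10, with x running from 1 to the number of samples, the powers of x become very large and polyfit may warn that the polynomial is badly conditioned. One possible remedy is to request the additional outputs S and mu, which makes polyfit center and scale x before fitting. The sketch below assumes a synthetic series as a stand-in, since the data matrix is not reproduced here; whether the warning appears for the real prices depends on the length of that data.

```matlab
% Hypothetical stand-in for one column of the gas price data (not the real series).
x = 1:200;
y = 2 + 0.01*x + 0.3*sin(x/15);

% Requesting mu makes polyfit center and scale x: mu = [mean(x); std(x)].
[p, S, mu] = polyfit(x, y, 10);

% The same mu must be passed to polyval so that x is transformed identically.
yFit = polyval(p, x, [], mu);

maxErr = max(abs(yFit - y));
fprintf('max fit error with centering and scaling: %g\n', maxErr);
```

Note that the coefficients in p then refer to the scaled variable (x - mu(1)) / mu(2), not to x itself.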


Testing it out on a future date range reveals the sad but obvious fact that we'll be paying for gas out of our eyes:
xNew = 1:(size(data,1) + 50);   % extend the range by 50 samples
yNew = polyval(p, xNew);
hold on;
plot(x, y, 'b');
plot(xNew, yNew, 'r');
hold off;
legend('Original', 'Degree 10 prediction for future values');
title('Polynomial Curve fit for Gas Prices in East Coast Region');



Figure 9.13
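One way to judge how far such a forecast can be trusted is to hold back the last part of the series, fit on the rest, and compare the prediction with the values that were held back. The sketch below does this on a synthetic stand-in series; the series, the split point nTrain, and the variable names are all assumptions made for illustration, not results for the actual gas prices.

```matlab
% Hypothetical stand-in series; replace with the real price column.
x = 1:200;
y = 2 + 0.01*x + 0.3*sin(x/15);

% Fit on the first 150 points only and keep the last 50 for checking.
nTrain = 150;
xTrain = x(1:nTrain);       yTrain = y(1:nTrain);
xTest  = x(nTrain+1:end);   yTest  = y(nTrain+1:end);

% Centered and scaled degree 10 fit on the training part.
[p, S, mu] = polyfit(xTrain, yTrain, 10);

% Largest error on the data used for fitting and on the held-back data.
errTrain = max(abs(polyval(p, xTrain, [], mu) - yTrain));
errTest  = max(abs(polyval(p, xTest,  [], mu) - yTest));

fprintf('max error on fitted part:    %g\n', errTrain);
fprintf('max error on held-back part: %g\n', errTest);
```

If the error on the held-back part is much larger than on the fitted part, the same should be expected for values beyond the end of the data.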



