Essays/Linear Regression
Linear regression is a statistical method of modeling the relationship between a dependent variable $Y$ and independent variables $X$ by estimating the coefficients of the linear form
$$Y = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_n X_n$$
where each term $X_i$ is a certain expression in the original independent variables ($x_1, \dots, x_m$). For example, it could be that $X_i = x_1^2$ or $X_i = x_1 x_2$.
Least Squares Method
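In the least squares method, the coefficients of the linear regression are chosen to minimize the sum of squared deviations between the observations and their estimates:
$$\sum_{i=1}^{N} \left(Y_i - \hat{Y}_i\right)^2 \to \min$$
where $Y_i$ is the $i$-th observation and $\hat{Y}_i$ is the value the linear form gives for it.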
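In matrix notation the minimization has a closed-form solution, which is a standard result. The sketch below introduces its own symbols, none of which come from the essay: $A$ is the matrix with one row per observation and one column per term, $y$ is the vector of observations, and $a$ is the vector of coefficients.

```latex
% Sum of squared deviations written as a squared norm
S(a) = \sum_{i=1}^{N} \Bigl( y_i - \sum_{j} A_{ij}\, a_j \Bigr)^2 = \lVert y - A a \rVert^2

% Setting the gradient to zero gives the normal equations
\frac{\partial S}{\partial a} = -2 A^{\mathsf T} (y - A a) = 0
\quad\Longrightarrow\quad
A^{\mathsf T} A \, a = A^{\mathsf T} y

% If the columns of A are linearly independent, the solution is unique
a = (A^{\mathsf T} A)^{-1} A^{\mathsf T} y
```

In J, dyadic `%.` (matrix divide) returns this least squares solution directly when the right argument has more rows than columns.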
Surface Fit Example
As an example, we will take the bi-quadratic form
$$y = 1 + 0.2\,x_1^2 + 0.3\,x_2 - 0.4\,x_2^2$$
then add a small amount of noise to simulate observed data, and try to reconstruct the coefficients using the least squares method.
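Surface plots of the original form, the noisy data, and the least squares estimate, each followed by the command that produces it (the names are defined in the session that follows):
inline:lsq_form.png
   'surface' plot X1;X2;FORM
inline:lsq_data.png
   'surface' plot X1;X2;DATA
inline:lsq_estm.png
   'surface' plot X1;X2;COEF mp XMAT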
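   load 'plot'
   mp =: +/ . *                      NB. matrix product
   'X1 X2' =: |: ,"0/~ i:8           NB. 17-by-17 grid over _8..8 in both variables
   $XMAT =: 1 , X1 , (X1^2) , X2 , (X1*X2) ,: (X2^2)   NB. the six terms
6 17 17
   FORM =: 1 0 0.2 0.3 0 _0.4 mp XMAT                  NB. exact surface
   FORM -: 1 + (0.2*X1^2) + (0.3*X2) + (_0.4*X2^2)
1
   NOISE =: 4 * _0.5 + ($X1) ?.@$ 0  NB. uniform noise in the interval _2 to 2, fixed seed
   $DATA =: FORM + NOISE
17 17
   COEF =: (,DATA) %. |:,"2 XMAT     NB. least squares fit by matrix divide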
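Now we can compare the obtained coefficients with those of the original formula. The first row of the result is the estimate from the noisy DATA; the second row is the same fit applied to the noise-free FORM, which recovers the original coefficients to the four decimals shown.
   0j4": COEF ,: (,FORM) %. |:,"2 XMAT
1.0011 _0.0144 0.2005 0.3104 0.0024 _0.4013
1.0000  0.0000 0.2000 0.3000 0.0000 _0.4000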
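As a cross-check, the same coefficients can be obtained from the normal equations and compared with the result of dyadic `%.`. The following is a minimal sketch, not part of the original essay: the names `A`, `OBS` and `NORM` are introduced here for illustration, and the setup lines repeat the essay's definitions so the fragment runs on its own.

```j
NB. Setup repeated from the essay so that this fragment is self-contained
mp =: +/ . *
'X1 X2' =: |: ,"0/~ i:8
XMAT =: 1 , X1 , (X1^2) , X2 , (X1*X2) ,: (X2^2)
FORM =: 1 0 0.2 0.3 0 _0.4 mp XMAT
DATA =: FORM + 4 * _0.5 + ($X1) ?.@$ 0

A    =: |: ,"2 XMAT        NB. (hypothetical name) design matrix, 289 rows by 6 columns
OBS  =: , DATA             NB. (hypothetical name) observations as a vector
COEF =: OBS %. A           NB. the essay's approach: matrix divide

NB. Normal equations: invert (A^T A) and apply it to (A^T OBS)
NORM =: (%. (|: A) mp A) mp (|: A) mp OBS

>./ | COEF - NORM          NB. largest absolute difference between the two solutions
```

The two results should agree up to floating-point rounding. Forming and inverting the product of the transposed matrix with itself is generally less accurate numerically than `%.` for ill-conditioned problems, so the explicit form is shown only to illustrate what is being computed.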
Additional regression analysis is provided in the 'stats' package.
   load 'stats'
   (|:}.,"2 XMAT) regression ,DATA
Var.      Coeff.       S.E.         t
 0       1.00105    0.12654      7.91
 1      _0.01444    0.01375     _1.05
 2       0.20052    0.00316     63.55
 3       0.31036    0.01375     22.56
 4       0.00241    0.00281      0.86
 5      _0.40131    0.00316   _127.17

Source       D.F.          S.S.          M.S.         F
Regression     5    27192.76720    5438.55344   4144.49
Error        283      371.36300       1.31224
Total        288    27564.13020

S.E. of estimate         1.14553
Corr. coeff. squared     0.98653
The squared correlation coefficient of 0.98653 shows a high degree of agreement between the observations and their estimates.
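The summary figures in the report can also be reproduced by hand from the residuals, which makes their definitions explicit. This is a sketch under the usual textbook definitions, not a description of how the 'stats' package computes them; the names `A`, `OBS`, `EST`, `SSE`, `SST`, `R2` and `SEE` are introduced here for illustration, and the setup lines repeat the essay's definitions.

```j
NB. Setup repeated from the essay so that this fragment is self-contained
mp =: +/ . *
'X1 X2' =: |: ,"0/~ i:8
XMAT =: 1 , X1 , (X1^2) , X2 , (X1*X2) ,: (X2^2)
FORM =: 1 0 0.2 0.3 0 _0.4 mp XMAT
DATA =: FORM + 4 * _0.5 + ($X1) ?.@$ 0

A    =: |: ,"2 XMAT                 NB. design matrix, one row per observation
OBS  =: , DATA                      NB. observations as a vector
COEF =: OBS %. A                    NB. least squares coefficients
EST  =: A mp COEF                   NB. fitted values

SSE  =: +/ *: OBS - EST             NB. error sum of squares
SST  =: +/ *: OBS - (+/ % #) OBS    NB. total sum of squares about the mean
R2   =: 1 - SSE % SST               NB. squared correlation coefficient
SEE  =: %: SSE % (#OBS) - #COEF     NB. standard error of estimate, 289 - 6 = 283 d.f.

R2 , SEE
```

In the essay's session, `SSE` and `SST` correspond to the S.S. column of the Error and Total rows, and `R2` and `SEE` to the last two lines of the report.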
See Also
- [wiki:WikiPedia:Linear_regression Linear regression], [wiki:WikiPedia:Least_squares Least squares], [wiki:WikiPedia:Curve_fitting Curve fitting], [wiki:WikiPedia:Interpolation Interpolation], Wikipedia
- Introduction to Statistics: The Nonparametric Way, pp 251-272