Linear Least Squares

Linear least squares is a mathematical optimization technique to find an approximate solution for a system of linear equations that has no exact solution. This usually happens if the number of equations (m) is bigger than the number of variables (n). (See also linear regression.)

Definition

In mathematical terms, we want to find a solution for the "equation"
A\mathbf{x} \approx \mathbf{b},
where A is an m-by-n matrix (with m > n) and x and b are n- respectively m-dimensional column vectors. More precisely, we want to minimize the Euclidean norm squared of the residual Axb, that is, the quantity
\|A\mathbf{x}-\mathbf{b}\|^2 = \left(A\mathbf{x}_1-\mathbf{b}_1\right)^2
+\left(A\mathbf{x}_2-\mathbf{b}_2\right)^2 +\dots+ \left(A\mathbf{x}_n-\mathbf{b}_n\right)^2, where Axi denotes the i-th component of the vector Ax. Hence the name "least squares". It turns out that a vector x that minimizes this expression also solves the normal equation
A^T \! A \mathbf{x} = A^T \mathbf{b},
where AT stands for the transpose of A. Note that this corresponds to a system of linear equations. The matrix ATA on the left-hand side is a square matrix, which is invertible if A has full column rank (that is, if the rank of A is n). In that case, the solution of the system of linear equations is unique and given by
\mathbf{x} = (A^T\!A)^{-1} A^T \mathbf{b}.
The matrix (A^TA)^{-1}A^T is called the pseudo inverse of A. We cannot use the true matrix inverse of A (that is, A^{-1}), because it does not exist as A is not a square matrix (mn).

Computation

The normal equation can be solved like any other equation system, yet an efficient and numerically stable method can be obtained by first computing the QR decomposition of the matrix A. Then, with A = QR, where Q is an orthogonal matrix and R is an upper triangular matrix, the normal equation simplifies to
Rx = QTb.

Applications

The linear least squares method can be used to find an affine function RnR that best fits a given set of data (see general least squares method). (It is widely and erroneously thought that the word linear in the term linear regression refers to the linear or affine nature of the fitted function, but see that article.) We write the linear function we try to find as a 1-by-n matrix xT (so x is actually a column vector, see also linear transformation). The set of data consists of m (n+1)-tuples (x1, ..., xn, y). Those can be written into an m-by-n matrix A and a vector b, where every tuple corresponds to a row of A, the y becoming the corresponding entry in b. Then,
Axb
yields the function x we seek.

Example

Consider the points (0, 3), (2, 3), (4, 4), (−1, 2). We seek a solution of the form ax + b = y, that is,
\begin{pmatrix}x & 1 \end{pmatrix}\begin{pmatrix} a \\ b\end{pmatrix} = y
We can form then the matrix A:
A=\begin{pmatrix}
0 & 1 \\ 2 & 1 \\ 4 & 1 \\ -1 & 1 \\ \end{pmatrix}
A^T=\begin{pmatrix}
0 & 2 & 4 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
A^TA=\begin{pmatrix}
21 & 5 \\ 5 & 4 \end{pmatrix} and the vector b
\mathbf{b} = \begin{pmatrix}
3 \\ 3 \\ 4 \\ 2 \end{pmatrix} and then
A^T\mathbf{b}=\begin{pmatrix}
20 \\ 12 \end{pmatrix} So, the normal equation is
A^TA\begin{pmatrix} a \\ b\end{pmatrix} = A^T\mathbf{b}
\begin{pmatrix}
21 & 5 \\ 5 & 4 \end{pmatrix}.\begin{pmatrix} a \\ b\end{pmatrix}=\begin{pmatrix} 20 \\ 12 \end{pmatrix} Then,
(A^TA)^{-1}={1\over 59}\begin{pmatrix}
4 & -5 \\ -5 & 21 \end{pmatrix} and
\begin{pmatrix} a \\ b\end{pmatrix}={1\over 59}\begin{pmatrix}
4 & -5 \\ -5 & 21 \end{pmatrix}\begin{pmatrix} 20 \\ 12 \end{pmatrix}=\begin{pmatrix} 20/59 \\ 152/59 \end{pmatrix} and the line of best fit is (20/59)x + 152/59 = y.

See also

 

<< PreviousWord BrowserNext >>
saturn award for best horror film
maxim litvinov
egilsay
thriller film
wyre, orkney
nominal category
north ronaldsay
american alligator
wgbh
saturn award for best actor (film)
israel prize
papa westray
feels like home
dutch alphabet
saturn award for best actress (film)
wrap
symphony no. 7 (shostakovich)
undine (novella)
london knights
list of ireland related topics
saturn award for best supporting actor (film)
ij (letter)
chatterton's compound
bombur
guacamole
fli and kli
dorkbot
the war room
theodora
1779 in music
university of michigan executive system
descriptive geometry
susu
artbots
uss harder (ss 568)
battle of bywater
home cheesemaking
hamarkameratene
myst: the book of ti'ana
the rule of names
the word of unbinding
gillikin country
lake saint clair (north america)
list of rulers of croatia