Eigendecomposition
Suppose a matrix \( \mA \in \real^{n \times n} \) is a square matrix with \( n \) linearly independent eigenvectors \( \vx_1, \ldots, \vx_n \) and corresponding eigenvalues \( \lambda_1,\ldots,\lambda_n \).
Let's define a new matrix \( \mQ \) as a collection of these eigenvectors, such that each column of \( \mQ \) is an eigenvector of \( \mA \):
$$ \mQ = \begin{bmatrix} \vx_1 ~\ldots~ \vx_n \end{bmatrix} $$
Now,
$$ \mA \mQ = \begin{bmatrix} \mA \vx_1 ~\ldots~ \mA \vx_n \end{bmatrix} $$
Because each \( \vx_i \) is an eigenvector, \( \mA \vx_i = \lambda_i \vx_i \). Substituting,
$$ \mA \mQ = \begin{bmatrix} \lambda_1 \vx_1 ~ \ldots ~ \lambda_n \vx_n \end{bmatrix} $$
This means \( \mA \mQ = \mQ \mLambda \), where we let \( \mLambda \) denote the diagonal matrix of eigenvalues.
$$
\mLambda = \begin{bmatrix}
\lambda_1 & \ldots & 0 \\
\vdots & \ddots & \vdots \\
0 & \ldots & \lambda_n
\end{bmatrix}
$$
Since the matrix \( \mQ \) is invertible (do you know why?), we can multiply both sides of the equation on the right by \( \mQ^{-1} \).
$$ \mA = \mQ \mLambda \mQ^{-1} $$
Here, \( \mathbf{\Lambda} \) is a diagonal matrix whose entries along the main diagonal are the eigenvalues of \( \mA \). The matrix \( \mathbf{Q} \in \real^{n \times n}\) is a square matrix such that the \( i \)-th column of \( \mathbf{Q} \) is the eigenvector of \( \mA \) corresponding to the eigenvalue in the \( i \)-th row of \( \mathbf{\Lambda} \).
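If you want to check this numerically, here is a minimal NumPy sketch (the matrix \( \mA \) below is a small example chosen only for illustration) that computes the eigenvectors and eigenvalues and verifies both \( \mA \mQ = \mQ \mLambda \) and \( \mA = \mQ \mLambda \mQ^{-1} \).

```python
import numpy as np

# Small example matrix with distinct real eigenvalues (chosen for illustration).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, Q = np.linalg.eig(A)   # column Q[:, i] is the eigenvector for eigvals[i]
Lam = np.diag(eigvals)          # the diagonal matrix Lambda

# A Q = Q Lambda, and therefore A = Q Lambda Q^{-1}
print(np.allclose(A @ Q, Q @ Lam))                  # True
print(np.allclose(A, Q @ Lam @ np.linalg.inv(Q)))   # True
```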
Any square matrix with \( n \) linearly independent eigenvectors can be factorized this way; such matrices are called diagonalizable.
This factorization of the matrix into its eigenvalues and eigenvectors is known as the eigendecomposition of the matrix.
For real symmetric matrices, an eigendecomposition is guaranteed to exist.
The Spectral Theorem states that a real symmetric matrix has orthonormal eigenvectors, so \( \mQ \) is an orthogonal matrix and the decomposition becomes \( \mA = \mQ \mLambda \mQ^T \).
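As a quick illustration (a sketch, with a symmetric matrix chosen just for this example), NumPy's np.linalg.eigh routine returns orthonormal eigenvectors for a symmetric input, so \( \mQ^T \mQ \) is the identity and \( \mA = \mQ \mLambda \mQ^T \):

```python
import numpy as np

# Symmetric example matrix (chosen for illustration).
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, Q = np.linalg.eigh(S)   # eigh is specialized for symmetric/Hermitian matrices
Lam = np.diag(eigvals)

print(np.allclose(Q.T @ Q, np.eye(2)))   # True: the eigenvectors are orthonormal
print(np.allclose(S, Q @ Lam @ Q.T))     # True: A = Q Lambda Q^T
```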
When an eigendecomposition exists, it may not be unique.
If an eigenvalue is repeated, then any nonzero vector in the span of the corresponding eigenvectors is also an eigenvector for that eigenvalue.
Thus, an alternate \( \mQ \) can be chosen.
Eigendecomposition is one of the approaches to finding the inverse of a matrix that we alluded to earlier.
If a matrix can be eigendecomposed, then finding its inverse is quite easy. Using the properties of inverses listed before,
$$ \inv{\mA} = \inv{(\mathbf{Q}\mathbf{\Lambda}\inv{\mathbf{Q}})} = \mathbf{Q} \inv{\mathbf{\Lambda}} \inv{\mathbf{Q}} $$
We know from the previous discussion that the inverse of an orthogonal matrix \( \mQ \) is cheap to compute and always exists: it is simply \( \mQ^T \).
Since \( \mathbf{\Lambda} \) is a diagonal matrix, its inverse, \( \inv{\mathbf{\Lambda}} \), can also be easily computed.
It is merely another diagonal matrix whose entries are the reciprocals of the values along the diagonal of the original matrix.
$$
\mLambda^{-1} = \begin{bmatrix}
\frac{1}{\lambda_1} & \ldots & 0 \\
\vdots & \ddots & \vdots \\
0 & \ldots & \frac{1}{\lambda_n}
\end{bmatrix}
$$
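Here is a minimal sketch (again with an arbitrary example matrix) that inverts \( \mA \) through its eigendecomposition and compares the result with a direct call to np.linalg.inv:

```python
import numpy as np

# Example matrix with nonzero eigenvalues (chosen for illustration).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, Q = np.linalg.eig(A)
Lam_inv = np.diag(1.0 / eigvals)   # reciprocals along the diagonal

# A^{-1} = Q Lambda^{-1} Q^{-1}
A_inv = Q @ Lam_inv @ np.linalg.inv(Q)
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```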
Now here's the interesting part. If any of the eigenvalues is zero, then the diagonal matrix \( \mLambda \) is not invertible (its determinant is zero!), and as a consequence, \( \mA \) is not invertible either.
In fact, a square matrix is invertible if and only if none of its eigenvalues is zero.
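For instance, the sketch below builds a rank-deficient matrix (chosen only for illustration) and shows that its spectrum contains a zero eigenvalue, which is exactly the case where the inversion recipe above breaks down:

```python
import numpy as np

# Rank-deficient example: the second row is a multiple of the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

eigvals, Q = np.linalg.eig(A)
print(eigvals)            # one eigenvalue is (numerically) zero
print(np.linalg.det(A))   # the determinant is zero as well, up to round-off
```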
Eigendecomposition appears repeatedly in machine learning literature, sometimes as the key step of the learning algorithm itself.
Check out the next interactive demo to understand eigendecomposition visually.
The column on the right shows the space transformed by \( \mA \).
The bottom row shows the same transformation, arrived at by sequentially multiplying with the components of the eigendecomposition.
It is presented in the following order: \( \vx \rightarrow \mQ^{-1}\vx \rightarrow \mLambda\mQ^{-1}\vx \rightarrow \mQ \mLambda \mQ^{-1} \vx \).
Note how the first transformation \( \mQ^{-1}\vx \) mostly rotates the space into an orientation that is amenable to the appropriate stretching or shrinking by \( \mLambda \) in the second step. In the final step, the transformation by \( \mQ \), being the inverse of the first step's matrix, reverses that initial rotation.
Try to modify the matrix such that one of the elements of the diagonal matrix \( \mLambda \) becomes almost zero. Notice how the space gets squashed along that direction. Also, observe what happens when you end up with a zero eigenvalue in the \( \mLambda \) matrix.