Now that we understand the geometry of matrix multiplication, it is time to understand its counterpart: the matrix inverse.
To understand the matrix inverse, we recommend familiarity with matrix multiplication, systems of linear equations, and Gaussian elimination.
Revisit those topics first to get acquainted with the corresponding concepts.
The inverse of a matrix \( \mA \) is a matrix \( \inv{\mA} \), such that
$$ \mA \inv{\mA} = \inv{\mA}\mA = \mI $$
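As a quick numerical sanity check of this definition, here is a minimal NumPy sketch; the matrix below is an arbitrary invertible example, not one from the text:

```python
import numpy as np

# An arbitrary small invertible matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)   # compute the inverse numerically

# Both products recover the 2x2 identity (up to floating-point error).
print(np.allclose(A @ A_inv, np.eye(2)))   # True
print(np.allclose(A_inv @ A, np.eye(2)))   # True
```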
We studied earlier that systems of linear equations can be represented in a concise form as
$$ \mA \vx = \vy $$
where \( \mA \) is the matrix of coefficients, \( \vx \) is the vector of variables (that need to be solved for), and \( \vy \) is the vector of outputs.
As defined above, \( \inv{\mA} \mA = \mI \). Therefore
$$ \begin{aligned} \mA \vx &= \vy \\\\ \implies \inv{\mA} \mA \vx &= \inv{\mA} \vy \\\\ \implies \vx &= \inv{\mA} \vy \end{aligned} $$
Thus, \( \vx = \inv{\mA}\vy \) is the solution to our system of equations in the variables \( \vx \). In other words, multiplication by the matrix inverse reverts the effect of the original matrix multiplication and recovers the original input vector, the desired solution.
Moreover, given the requirements for matrix multiplication, the inverse \( \inv{\mA} \) is also a square matrix of the same size as the original matrix \( \mA \).
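Here is a small NumPy sketch of this idea on a made-up \( 2 \times 2 \) system; the coefficients and outputs are arbitrary choices for illustration:

```python
import numpy as np

# Hypothetical system: 2x + y = 5 and x + 3y = 10.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
y = np.array([5.0, 10.0])

x_via_inverse = np.linalg.inv(A) @ y    # x = A^{-1} y
x_via_solver = np.linalg.solve(A, y)    # solves A x = y directly

print(x_via_inverse)                              # [1. 3.]
print(np.allclose(x_via_inverse, x_via_solver))   # True
```

In practice, numerical libraries favor a direct solver such as np.linalg.solve over explicitly forming the inverse, since it is cheaper and numerically more stable, but the two give the same answer here.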
Do all matrices have an inverse? Let's find out.
Now, we know from our earlier discussion on systems of linear equations that \( \mA \vx = \vy \) has a unique solution for every \( \vy \) if and only if \( \mA \) is square and full rank.
So, for \( \mA \) to be invertible or for the inverse to exist, it needs to be square and full-rank.
No inverse, no solution.
No solution, no inverse.
Easy as that.
A matrix inverse is only defined for a square full-rank matrix.
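As a side note on what this means in practice, a numerical library simply refuses to invert a rank-deficient matrix. The sketch below uses NumPy; the matrix is an arbitrary rank-1 example:

```python
import numpy as np

# A rank-deficient (singular) matrix: the second row is twice the first.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.matrix_rank(S))    # 1, i.e. not full rank

try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as err:
    print("No inverse:", err)      # NumPy reports a singular matrix
```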
We saw earlier how to solve a system of equations in \( n \) unknowns by Gaussian elimination. Well, finding the inverse of an \( n \times n \) matrix requires solving for \( n^2 \) unknowns. So, how do we do it?
Well, here's the key piece of information we know: \( \mA \inv{\mA} = \mI \). This is a system of \( n^2 \) equations in \( n^2 \) unknowns. In fact, it splits into \( n \) independent systems, one per column: each column of \( \inv{\mA} \) is obtained by solving \( \mA \vx = \vy \) with \( \vy \) set to the corresponding column of \( \mI \). So, as long as \( \mA \) is full rank, we can solve it, and plain old Gaussian elimination does the job. Same old row multiplication, subtraction, and variable elimination.
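Concretely, the standard recipe is Gauss-Jordan elimination on the augmented matrix formed by placing \( \mI \) next to \( \mA \): once row operations reduce the \( \mA \) block to \( \mI \), the other block holds \( \inv{\mA} \). Below is a minimal, unoptimized sketch of this idea in NumPy; gauss_jordan_inverse is a hypothetical helper written for illustration, not a library function:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert a square full-rank matrix by Gauss-Jordan elimination."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])      # augmented matrix [A | I]

    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("Matrix is singular; no inverse exists.")
        aug[[col, pivot]] = aug[[pivot, col]]

        aug[col] /= aug[col, col]        # scale the pivot row so the pivot is 1
        for row in range(n):             # eliminate this column from every other row
            if row != col:
                aug[row] -= aug[row, col] * aug[col]

    return aug[:, n:]                    # the right block is now the inverse

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(np.allclose(gauss_jordan_inverse(A), np.linalg.inv(A)))  # True
```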
Let's build some intuition from the geometric perspective with this demo. The right-most column of charts shows the space transformations \( \vx, \mA\vx, \inv{\mA}\mA \vx \), in that order. The bottom-most row of charts shows the space transformations \( \vx, \inv{\mA} \vx, \mA \inv{\mA}\vx \). A few observations to note:
Notice how the inverse of the matrix rotates the space in the direction opposite to that of the original matrix.
Also notice that when the matrix stretches the space along a direction, its inverse shrinks it along the same direction, and when the matrix shrinks the space, its inverse stretches it.
Thus, the net effect of \( \inv{\mA} \mA \vx \) is the recovery of the original space \( \vx \), as evidenced in the last panel of the bottom row of charts.
Cool, isn't it?
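If you wish to verify this numerically rather than visually, here is a minimal NumPy sketch; the rotation angle, scaling factors, and test vector are arbitrary choices for illustration:

```python
import numpy as np

theta = np.pi / 6                        # rotate by 30 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag([3.0, 0.5])                  # stretch x by 3, shrink y by 0.5

A = R @ S                                # scale, then rotate
A_inv = np.linalg.inv(A)

x = np.array([1.0, 2.0])                 # an arbitrary input vector
print(np.allclose(A_inv @ (A @ x), x))   # True: the inverse undoes A

# The inverse factors into the opposite operations: un-scale by 1/3 and 1/0.5,
# then rotate by -theta.
R_back = np.array([[np.cos(-theta), -np.sin(-theta)],
                   [np.sin(-theta),  np.cos(-theta)]])
print(np.allclose(A_inv, np.diag([1 / 3.0, 2.0]) @ R_back))  # True
```

The last check confirms the geometric picture: the inverse of a scale-and-rotate transformation is the corresponding un-scaling combined with the opposite rotation.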
The inverse of an orthogonal matrix is particularly interesting.
Let \( \mQ \) be an orthogonal matrix.
Remember that the rows and columns of an orthogonal matrix are orthonormal vectors: the dot product of any row with itself is 1, and the dot product between any two distinct rows is 0.
We know that the following must hold for \( \mQ \) and its inverse: \( \mQ \inv{\mQ} = \mI \).
In \( \mQ \inv{\mQ} \), we take the dot product of the rows of \( \mQ \) with the columns of \( \inv{\mQ} \). For the diagonal entries of \( \mI \) to be 1, the \( i \)-th row of \( \mQ \) must be the same as the \( i \)-th column of \( \inv{\mQ} \).
Moreover, for the off-diagonal entries to be 0, the \( i \)-th row of \( \mQ \) must be orthogonal to every other column \( j \ne i \) of \( \inv{\mQ} \).
Both conditions are satisfied if \( \inv{\mQ} = \mQ^T \), because the rows of \( \mQ \) are orthonormal.
Thus, the transpose of an orthogonal matrix is its inverse. And because a matrix always has a transpose, an orthogonal matrix is always invertible and never singular.
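As a concrete check, 2D rotation matrices are orthogonal, so their transpose should act as their inverse. Here is a quick NumPy sketch; the angle is an arbitrary choice:

```python
import numpy as np

theta = np.pi / 4                         # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation matrices are orthogonal

print(np.allclose(Q.T @ Q, np.eye(2)))            # True: Q^T Q = I
print(np.allclose(Q @ Q.T, np.eye(2)))            # True: Q Q^T = I
print(np.allclose(np.linalg.inv(Q), Q.T))         # True: the inverse is the transpose
```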
This is a key result. It makes it desirable to factorize or decompose a matrix such that some components are orthogonal matrices. We will study two such decompositions: Eigendecomposition and Singular value decomposition.
The Moore-Penrose pseudoinverse, denoted as \( \mA^+ \), is a generalization of the inverse for rectangular matrices.
We do not deal with it often in the machine learning literature, so it will not be covered here.
You may choose to explore other advanced topics in linear algebra. Already feeling like an expert? Move on to other advanced topics in mathematics or machine learning.