# Singular value decomposition

##### Linear Algebra

Eigendecomposition is only defined for square matrices. For rectangular matrices, we turn to singular value decomposition (SVD).

In this article, we will try to provide a comprehensive overview of singular value decomposition and its relationship to eigendecomposition.

## Prerequisites

To understand singular value decomposition, we recommend familiarity with eigendecomposition and orthogonal matrices.

Get acquainted with those concepts first before proceeding.

## Singular value decomposition

Eigendecomposition is only defined for square matrices. For rectangular matrices, we turn to singular value decomposition.

Every real matrix $\mA \in \real^{m \times n}$ can be factorized as follows

$$\mA = \mU \mD \mV^T$$

This factorization is known as the singular value decomposition (SVD).

Here, the columns of $\mU$ are known as the left-singular vectors of matrix $\mA$. $\mU \in \real^{m \times m}$ is an orthogonal matrix.

The columns of $\mV$ are known as the right-singular vectors of the matrix $\mA$. $\mV \in \real^{n \times n}$ is an orthogonal matrix.

And $\mD \in \real^{m \times n}$ is a diagonal matrix containing singular values of the matrix $\mA$.

Note that $\mU$ and $\mV$ are square matrices. The diagonal matrix $\mD$, however, is not square unless $\mA$ is a square matrix.
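As a concrete sketch, NumPy's `numpy.linalg.svd` computes this factorization. Note that it returns $\mV^T$ directly (not $\mV$), and the singular values as a vector rather than the full $m \times n$ matrix $\mD$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # a rectangular 4x3 matrix

# full_matrices=True gives square U (4x4) and Vt (3x3)
U, s, Vt = np.linalg.svd(A, full_matrices=True)

print(U.shape, s.shape, Vt.shape)  # (4, 4) (3,) (3, 3)

# Rebuild the rectangular diagonal matrix D and verify A = U D V^T
D = np.zeros_like(A)
np.fill_diagonal(D, s)
print(np.allclose(A, U @ D @ Vt))  # True
```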

### The relationship between SVD and Eigendecomposition

In SVD, the roles played by $\mU, \mD, \mV^T$ are similar to those of $\mQ, \mLambda, \mQ^{-1}$ in eigendecomposition. But that similarity ends there.

Here is an important point that is often misremembered: the SVD of a square matrix is not necessarily the same as its eigendecomposition.

In fact, the SVD and eigendecomposition of a square matrix coincide if and only if it is symmetric and positive semidefinite (more on definiteness later). In that case,

$$\mA = \mU \mD \mV^T = \mQ \mLambda \mQ^{-1} \implies \mU = \mV = \mQ \text{ and } \mD = \mLambda$$
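A quick numerical sanity check of this coincidence, sketched with NumPy. The matrix `A = B @ B.T` is symmetric positive semidefinite by construction; since eigenvectors are only determined up to sign and ordering, we compare the eigenvalues and singular values after sorting rather than the factor matrices themselves:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T  # symmetric positive semidefinite by construction

# Eigendecomposition of the symmetric matrix (eigvalsh sorts ascending)
eigvals = np.linalg.eigvalsh(A)

# Singular values (returned in descending order)
svals = np.linalg.svd(A, compute_uv=False)

# For a symmetric PSD matrix, the singular values equal the eigenvalues
print(np.allclose(np.sort(eigvals)[::-1], svals))  # True
```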

In general though, the SVD and Eigendecomposition of a square matrix are different. The most important differences are listed below

• Singular values are always non-negative, but eigenvalues can be negative.
• The matrices $\mU$ and $\mV$ in an SVD are always orthogonal, but the matrix $\mQ$ in an eigendecomposition may not be. A symmetric matrix guarantees orthonormal eigenvectors; other square matrices do not.
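The first difference is easy to see numerically. For a symmetric matrix with a negative eigenvalue, the singular values turn out to be the absolute values of the eigenvalues; a small sketch:

```python
import numpy as np

A = np.diag([2.0, -3.0])  # symmetric, with one negative eigenvalue

eigvals = np.linalg.eigvalsh(A)             # ascending order
svals = np.linalg.svd(A, compute_uv=False)  # descending order

print(eigvals)  # [-3.  2.]
print(svals)    # [3. 2.]
```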

For rectangular matrices, some interesting relationships hold.

We know that for any rectangular matrix $\mA$, the matrix $\mA^T \mA$ is square and symmetric. So its eigendecomposition exists; moreover, it has real eigenvalues and orthonormal eigenvectors.

\begin{align} & \mA^T \mA = \mQ \mLambda \mQ^T \\ & \implies \left(\mU \mD \mV^T \right)^T \left(\mU \mD \mV^T\right) = \mQ \mLambda \mQ^T \\ & \implies \mV \mD^T \mU^T \mU \mD \mV^T = \mQ \mLambda \mQ^T \\ & \implies \mV \mD^T \mD \mV^T = \mQ \mLambda \mQ^T \\ \end{align}

Here, we have used the fact that $\mU^T \mU = I$ since $\mU$ is an orthogonal matrix.

Thus, the columns of $\mV$ are actually the eigenvectors of $\mA^T \mA$.

Moreover, the singular values along the diagonal of $\mD$ are the square roots of the eigenvalues in $\mLambda$ of $\mA^T \mA$.

A similar analysis leads to the result that the columns of $\mU$ are the eigenvectors of $\mA \mA^T$.
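These relationships can be checked numerically. In the sketch below, we compare the eigenvalues of $\mA^T \mA$ to the squared singular values, and verify that each column of $\mV$ satisfies the eigenvector equation:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# Eigenvalues of A^T A equal the squared singular values of A
eigvals = np.linalg.eigvalsh(A.T @ A)       # ascending order
print(np.allclose(np.sort(s**2), eigvals))  # True

# Each column of V is an eigenvector of A^T A with eigenvalue sigma^2
for i in range(3):
    print(np.allclose((A.T @ A) @ V[:, i], (s[i]**2) * V[:, i]))  # True
```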

### Understanding $\mU, \mD, \text{ and } \mV$ in SVD

Now that we know that SVD is different from eigendecomposition, it is time to understand the individual components of the SVD.

We saw in an earlier interactive demo that orthogonal matrices rotate and reflect, but never stretch. So that's the role of $\mU$ and $\mV$, both orthogonal matrices.

But, $\mU \in \real^{m \times m}$ and $\mV \in \real^{n \times n}$. So they perform the rotation in different spaces.

Since $\mU$ and $\mV$ are orthogonal matrices and only perform rotation or reflection, any stretching or shrinking has to come from the diagonal matrix $\mD$.

Hence, the diagonal elements of $\mD$, the singular values, are non-negative. Any dimensions with zero singular values are squashed entirely. Dimensions with larger singular values are stretched and thus more dominant; conversely, those with smaller singular values are shrunk. Therein lies the importance of SVD.

By focusing on the directions with the largest singular values, one can ensure that the data, any resulting models, and analyses capture the dominant patterns in the data. This is achieved by sorting the singular values in decreasing order of magnitude and truncating the diagonal matrix to the dominant singular values, with corresponding adjustments to $\mU$ and $\mV$: the columns that correspond to the discarded singular values are removed.

So, if we focus on the top $r$ singular values, we can construct an approximate or compressed version $\mA_r$ of the original matrix $\mA$ as follows:

$$\mA_r = \mU_r \mD_r \mV_r^T$$

This is a great way of compressing a dataset while still retaining the dominant patterns within.
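As a sketch, a rank-$r$ truncation in NumPy keeps the first $r$ singular values (they are returned in decreasing order) along with the corresponding columns of $\mU$ and rows of $\mV^T$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

r = 2  # keep the top-2 singular values
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

print(A_r.shape)                   # (6, 4), same as A
print(np.linalg.matrix_rank(A_r))  # 2
```

The truncation $\mA_r$ is in fact the best rank-$r$ approximation of $\mA$ in the Frobenius norm (the Eckart–Young theorem), which is what makes this form of compression principled rather than ad hoc.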

In fact, the number of non-zero or positive singular values of a matrix is equal to its rank.
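This connection between rank and non-zero singular values is easy to verify. In the sketch below, a matrix built as a sum of two rank-1 outer products has exactly two singular values above floating-point noise:

```python
import numpy as np

rng = np.random.default_rng(4)
# Build a 5x4 matrix of rank 2 as a sum of two outer products
A = (np.outer(rng.standard_normal(5), rng.standard_normal(4))
     + np.outer(rng.standard_normal(5), rng.standard_normal(4)))

s = np.linalg.svd(A, compute_uv=False)
num_nonzero = np.sum(s > 1e-10)  # count singular values above tolerance

print(num_nonzero)               # 2
print(np.linalg.matrix_rank(A))  # 2
```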

Machine learning is all about working with the generalizable and dominant patterns in data. Some details might be lost. In fact, in some cases, it is desirable to ignore irrelevant details to avoid the phenomenon of overfitting. And this is where SVD helps.

As a consequence, the SVD appears in numerous algorithms in machine learning. In the upcoming learning modules, we will highlight the importance of SVD for processing and analyzing datasets and models.

## Where to next?

In this article, we defined the singular value decomposition and its components. To really build intuition about what these factors mean, we need to understand the effect of multiplying by each type of matrix. We present this in matrix as a transformer.

You may also choose to explore other advanced topics in linear algebra.

Already feeling like an expert in linear algebra? Move on to other advanced topics in mathematics or machine learning.