Orthogonalization of vectors

Linear Algebra

Introduction

In this article, we will study the Gram-Schmidt approach to the orthogonalization of vectors. Along the way, we will build a geometric intuition for the underlying operations with an interactive demonstration.

Prerequisites

To understand orthogonalization of vectors, we recommend familiarity with the concepts in

Follow the above links to first get acquainted with the corresponding concepts.

Orthogonality

We saw in the earlier demo that the dot product of two vectors is zero if the angle between them is 90 degrees, since \( \cos 90^\circ = 0 \). Such non-zero vectors with a zero dot product are said to be mutually orthogonal.

Also, if two orthogonal vectors are also unit vectors, then they are said to be orthonormal.

If a set of vectors forms a basis for a space and the vectors are also mutually orthonormal, then they are known as an orthonormal basis of that space. Note that the unit vectors along the X-axis and the Y-axis form an orthonormal basis of the Cartesian plane, and any point in that plane can be defined in terms of its X and Y coordinates.
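As a quick numerical illustration, here is a minimal NumPy sketch of these ideas; the specific vectors are arbitrary examples of our own choosing.

```python
import numpy as np

# Two perpendicular vectors in the Cartesian plane.
a = np.array([3.0, 0.0])
b = np.array([0.0, 4.0])

# Their dot product is zero, so they are mutually orthogonal.
print(np.dot(a, b))          # 0.0

# Scaling each vector to unit length makes the pair orthonormal.
a_hat = a / np.linalg.norm(a)
b_hat = b / np.linalg.norm(b)
print(np.dot(a_hat, a_hat))  # 1.0 (unit length)
print(np.dot(a_hat, b_hat))  # 0.0 (still orthogonal)
```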

An understanding of orthogonal or orthonormal vectors is important for ML applications.

Orthogonalization

Data or models seldom lead to orthogonal vectors on their own. But many applications, such as machine learning, benefit from representing a dataset along an orthonormal basis. For such applications of linear algebra, it is common practice to discover an orthonormal basis for any given set of vectors.

Orthogonalization is the process of arriving at a set of mutually orthogonal vectors from any given set of vectors. The most popular algorithm for this is known as the Gram-Schmidt approach.

Gram-Schmidt Algorithm

Here's how the Gram-Schmidt approach for orthogonalization transforms a set of vectors \( \{\va_1, \ldots, \va_n \} \) into a set of mutually orthogonal vectors \( \{\vx_1, \ldots, \vx_n \} \); a short code sketch of the procedure follows these steps.

  1. Let \( \vx_1 = \va_1 \).
  2. Compute the projection of the next vector \( \va_2 \) onto \( \vx_1 \). Intuitively, this is the portion of \( \va_2 \) that lies in the direction of \( \vx_1 \). $$ \text{proj}_{\vx_1}(\va_2) = \frac{\va_2 \cdot \vx_1}{\vx_1 \cdot \vx_1} \vx_1. $$
  3. Next, figure out the portion of \( \va_2 \) that is not explained by the projection \( \text{proj}_{\vx_1}(\va_2) \). This remainder is obtained by subtraction. $$ \vx_2 = \va_2 - \text{proj}_{\vx_1}(\va_2) $$
  4. Continue the process for each subsequent vector \( \va_i \), this time removing the components along all previously computed orthogonal vectors. $$ \vx_i = \va_i - \sum_{j = 1}^{i-1} \text{proj}_{\vx_j}(\va_i) $$
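To make these steps concrete, here is a minimal NumPy sketch of the procedure described above. The function name gram_schmidt and the example vectors are our own choices for illustration, and the input vectors are assumed to be linearly independent.

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into mutually orthogonal ones."""
    orthogonal = []
    for a in vectors:
        x = a.astype(float)
        # Subtract the projection of a onto every previously computed vector.
        for q in orthogonal:
            x = x - (np.dot(a, q) / np.dot(q, q)) * q
        orthogonal.append(x)
    return orthogonal

# Example: two non-orthogonal vectors in the plane.
a1 = np.array([2.0, 0.0])
a2 = np.array([1.0, 1.0])
x1, x2 = gram_schmidt([a1, a2])
print(x1, x2)          # [2. 0.] [0. 1.]
print(np.dot(x1, x2))  # 0.0 -> the outputs are mutually orthogonal
```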

Orthogonalization: demo

Interact with the orthogonalization demo next to get a good intuitive feel for the process. Check out what happens when you change either of the input vectors, and note how the projection and the resulting orthogonal component change.

In this demo, \( \text{proj}_{\va}(\vb) \) is the projection of \( \vb \) onto \( \va \). It is calculated as

$$ \text{proj}_{\va}(\vb) = \frac{ \va \cdot \vb}{ \va \cdot \va} \va $$

Also, \( \text{orth}_{\va}(\vb) \) is the component of \( \vb \) that is orthogonal to \( \va \). It is calculated as

$$ \text{orth}_{\va}(\vb) = \vb - \text{proj}_{\va}(\vb) $$

Drag the circle to change the vector

Thus, intuitively, \( \vb \) can be split into two components: one along the same direction as \( \va \), and the remainder orthogonal to that direction. This important property is used in several linear algebra techniques.
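As a quick numerical check of this decomposition, here is a small NumPy sketch; the vectors are arbitrary examples.

```python
import numpy as np

a = np.array([2.0, 1.0])
b = np.array([1.0, 3.0])

# Component of b along a, and the component of b orthogonal to a.
proj = (np.dot(a, b) / np.dot(a, a)) * a
orth = b - proj

print(proj + orth)      # [1. 3.] -- the two components add back to b
print(np.dot(orth, a))  # 0.0     -- the remainder is orthogonal to a
```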

Where to next?

Now that you are an expert in vectors, it is time to build expertise in other topics in linear algebra.
