## Invoking the Bayes rule

The conditional probability \( P(C_m|\vx) \) of each class given the instance \( \vx \) is computed using Bayes' rule.

\begin{equation}
P(C_m | \vx) = \frac{P(\vx | C_m) P(C_m)}{P(\vx)}
\label{eq:class-conditional-prob}
\end{equation}

In this equation, \( P(C_m) \) is the class-marginal probability, also known as the class prior.
It is estimated as the fraction of training instances that belong to the class \( C_m \).

$$ P(C_m) = \frac{\text{Number of training instances belonging to } C_m}{\text{Total number of training instances}} $$
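
The counting estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `estimate_class_priors` and the toy labels are assumptions made for the example.

```python
from collections import Counter


def estimate_class_priors(labels):
    """Return {class: fraction of training instances carrying that class}."""
    counts = Counter(labels)   # number of instances per class
    total = len(labels)        # total number of training instances
    return {c: n / total for c, n in counts.items()}


# Hypothetical training labels, used only for illustration.
priors = estimate_class_priors(["spam", "ham", "ham", "spam", "ham"])
print(priors)
```

With three of the five toy labels equal to `"ham"`, the estimated prior for that class is 3/5, and the priors over all classes sum to one.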

In Equation \eqref{eq:class-conditional-prob}, the term \( P(\vx) \) is the marginal probability of the instance \( \vx \).
Since the instance must belong to one of the \( M \) classes, this marginal is the sum, over all classes, of the probability of the class times the probability of the instance being generated from that class.

$$ P(\vx) = \sum_{m \in \set{1,\ldots,M}} P(C_m)P(\vx|C_m) $$
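
To make the arithmetic concrete, the sketch below plugs hypothetical priors and class-conditional likelihoods into Bayes' rule, using the sum above as the denominator. The function name `posterior`, the class names, and all numeric values are assumptions chosen for illustration; in practice the likelihoods \( P(\vx|C_m) \) would come from a fitted model.

```python
def posterior(priors, likelihoods):
    """Compute P(C_m | x) for every class from P(C_m) and P(x | C_m)."""
    # Numerator of Bayes' rule for each class: P(x | C_m) * P(C_m).
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    # Denominator: P(x), the sum of the numerators over all classes.
    evidence = sum(joint.values())
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical values for a two-class problem.
priors = {"C1": 0.6, "C2": 0.4}
likelihoods = {"C1": 0.2, "C2": 0.5}

post = posterior(priors, likelihoods)
print(post)
```

Here the numerators are 0.12 and 0.20, so \( P(\vx) = 0.32 \) and the posteriors are 0.375 and 0.625. Although `C1` has the larger prior, the larger likelihood under `C2` makes it the more probable class for this instance.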

Now, the key quantity remaining is \( P(\vx|C_m) \). Note that \( \vx \) is \( n \)-dimensional, where \( n \) can be quite large.
Depending on the value of \( n \), it may be infeasible to directly model a multivariate distribution, for example a multivariate Gaussian, to estimate this probability.
Also, the probability under such a joint multivariate model could be minuscule, and the resulting calculation could suffer from underflow errors.
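
The underflow concern is easy to reproduce. The sketch below is only an illustration: as a stand-in for a very small joint probability, it multiplies 100 hypothetical factors of \( 10^{-5} \) each, a value chosen arbitrarily for the demonstration. It also shows that the sum of logarithms of the same factors remains representable; working with logarithms is a common workaround, mentioned here as an assumption rather than as the method this text goes on to use.

```python
import math

# Hypothetical factors: 100 values of 1e-5, whose true product is 1e-500.
factors = [1e-5] * 100

# Direct product: 1e-500 is far below the smallest positive float64,
# so the result underflows to exactly zero.
direct = math.prod(factors)

# Sum of logarithms of the same factors: a finite negative number.
log_sum = sum(math.log(p) for p in factors)

print(direct)   # 0.0
print(log_sum)
```

Once the product has collapsed to zero, both the numerator and the denominator of Bayes' rule vanish and the posterior is undefined, whereas the log-domain value still carries the information needed to compare classes.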