Naive Bayes classification

Naive Bayes classifiers are used, for example, to mark email as spam or not, to classify news articles, to detect whether a piece of text expresses positive or negative emotions, and in face recognition applications.

Naive Bayes classifiers are based on Bayes' theorem:


\begin{equation}
\label{eq:bayes1}
\nonumber
P(\textbf{H}|\textbf{D}) = P(\textbf{H}) \frac{P(\textbf{D}|\textbf{H})}{P(\textbf{D})}
\end{equation}

with

$P(\textbf{H}|\textbf{D})$ Posterior: the probability of our hypothesis being true given the collected data.
$P(\textbf{H})$ Prior: the probability of our hypothesis being true before collecting data.
$P(\textbf{D}|\textbf{H})$ Likelihood: the probability of collecting this data when our hypothesis is true.
$P(\textbf{D})$ Marginal: the probability of collecting this data under all possible hypotheses.

This may seem complex, but it isn't.
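By way of illustration, Bayes' rule can be written as a one-line Python function; this is only a sketch, and the function name and the numbers in the example call are purely illustrative.

def posterior(prior, likelihood, marginal):
    # Bayes' rule: P(H|D) = P(H) * P(D|H) / P(D)
    return prior * likelihood / marginal

# Illustrative numbers only: prior 0.5, likelihood 0.8, marginal 0.6.
print(posterior(0.5, 0.8, 0.6))  # -> 0.666...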

The infamous dot example

There are 15 orange dots and 30 green dots in this set of dots.

There are twice as many green dots as orange dots. The priors for Green and Orange are therefore:


\begin{equation}
\label{eq:bayes2}
\nonumber
P(\textbf{Green}) =  \frac{\textbf{Number of green dots}}{\textbf{Total number of dots}} = \frac{\textbf{30}}{\textbf{45}} = 2/3
\end{equation}


\begin{equation}
\label{eq:bayes3}
\nonumber
P(\textbf{Orange}) =  \frac{\textbf{Number of orange dots}}{\textbf{Total number of dots}} = \frac{\textbf{15}}{\textbf{45}} = 1/3
\end{equation}


When a new dot appears, we'd expect it to be twice as likely to be green as orange.
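In code, these priors are nothing more than relative frequencies; a minimal sketch using the counts above:

# Priors are just relative frequencies of the class counts.
n_green, n_orange = 30, 15
total = n_green + n_orange     # 45
p_green = n_green / total      # 30/45 = 2/3
p_orange = n_orange / total    # 15/45 = 1/3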

The infamous dot example continues ...

Suppose we wish to classify a new object, a white dot X. The dots appear neatly clustered, so it is reasonable to assume that the more green (or orange) dots there are in the vicinity of X, the more likely it is that X belongs to that colour. To measure this likelihood, draw a circle around X that encompasses a fixed number of dots (chosen a priori), irrespective of their class labels, and count how many dots inside the circle belong to each class. The likelihoods are then:


\begin{equation}
\label{eq:bayes4}
\nonumber
\textbf{Likelihood of X given Green} = \frac{\textbf{Number of Green dots in the vicinity of X}}{\textbf{Total number of Green dots}} = 1/30
\end{equation}


\begin{equation}
\label{eq:bayes5}
\nonumber
\textbf{Likelihood of X given Orange} = \frac{\textbf{Number of Orange dots in the vicinity of X}}{\textbf{Total number of Orange dots}} = 4/15
\end{equation}
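In code, the likelihoods are again relative frequencies, now computed over the dots inside the circle; a minimal sketch, assuming the circle contains 1 green and 4 orange dots (as in the fractions above):

# Counts per class, and counts inside the circle around X (assumed: 1 green, 4 orange).
n_green, n_orange = 30, 15
green_near_x, orange_near_x = 1, 4
lik_green = green_near_x / n_green       # 1/30
lik_orange = orange_near_x / n_orange    # 4/15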


The prior probabilities suggest that X belongs to Green (there are twice as many Green dots as Orange dots), but the likelihoods indicate otherwise: X is more likely to be Orange, because there are more Orange dots than Green dots in its vicinity.

In Bayesian analysis, the final classification is produced by combining both sources of information, the prior and the likelihood, into a posterior probability using Bayes' rule (the equation at the start). Since the marginal $P(\textbf{D})$ is the same for every class, it can be dropped when comparing classes, so the (unnormalised) posteriors are:


\begin{equation}
\label{eq:bayes6}
\nonumber
\textbf{Posterior probability of X being Green} =
\textbf{Prior probability of Green}  \times  \textbf{Likelihood of X given Green} = \frac{2}{3} \times \frac{1}{30} = \frac{1}{45}
\end{equation}

\begin{equation}
\label{eq:bayes7}
\nonumber
\textbf{Posterior probability of X being Orange} =
\textbf{Prior probability of Orange}  \times  \textbf{Likelihood of X given Orange} = \frac{1}{3} \times \frac{4}{15} = \frac{4}{45}
\end{equation}
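Putting the pieces together, classifying X amounts to comparing the two products; a self-contained sketch using the numbers from this example:

# Priors and likelihoods taken from the example above.
p_green, p_orange = 2 / 3, 1 / 3
lik_green, lik_orange = 1 / 30, 4 / 15
post_green = p_green * lik_green         # 1/45
post_orange = p_orange * lik_orange      # 4/45
# Classify X as the class with the largest (unnormalised) posterior.
prediction = "Green" if post_green > post_orange else "Orange"
print(prediction)                        # -> Orange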

In Bayesian analysis the white dot X is classified as Orange, since that class achieves the largest posterior probability. Note that the dot itself is still white, though: the classifier only assigns the most probable label. And think of Facebook.


 
 