What is Classification?
In classification, the first step is to understand the problem and identify potential features and labels. Features are the characteristics or attributes that affect the value of the label. Classification has two phases: a learning phase and an evaluation phase. In the learning phase, the classifier trains its model on a given dataset; in the evaluation phase, the classifier’s performance is tested. Performance is evaluated using metrics such as accuracy, error, precision, and recall.
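As a rough illustration of the evaluation metrics named above, here is a minimal sketch for the binary case. The function name `binary_metrics` and the label lists are hypothetical and invented purely for this example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, error rate, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical true and predicted labels, made up for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts all correct predictions, while precision and recall look only at the positive class: precision asks how many predicted positives were right, and recall asks how many actual positives were found.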
Introduction to Naive Bayes Classification
Naive Bayes is an extremely simple probabilistic classification algorithm based on Bayes’ Theorem. It makes classifications using the Maximum A Posteriori decision rule in a Bayesian setting, and it can also be represented as a very simple Bayesian network.
In simple terms, a naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.
If you are wondering why the word ‘Naive’ is attached to this algorithm, this article from Intel explains it well.
For example, a ball may be considered a soccer ball if it is hard, round, and about seven inches in diameter. Even if these features depend on each other or upon the existence of the other features, naive Bayes believes that all of these properties independently contribute to the probability that this ball is a soccer ball. This is why it is known as naive.
In spite of their apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many real-world situations, famously document classification and spam filtering. They require a small amount of training data to estimate the necessary parameters.
We will be using the scikit-learn library. Scikit-learn’s official documentation defines naive Bayes methods as:
Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.
The different naive Bayes classifiers differ mainly by the assumptions they make regarding the distribution of P(x_i | y).
We will use Gaussian Naive Bayes in this tutorial.
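In the Gaussian variant, the likelihood of each feature given the class is assumed to follow a normal distribution whose mean and variance are estimated from the training samples of that class. The sketch below shows the idea for a single feature and a single class. The helper names and the sample values are hypothetical, and this is not how scikit-learn implements it internally (for instance, it omits any variance smoothing).

```python
import math


def gaussian_likelihood(x, mean, var):
    """Density of a normal distribution with the given mean and variance at x."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * var)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * var))


def fit_feature(values):
    """Estimate the mean and population variance of one feature within one class."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


# Hypothetical measurements of one feature for the samples of a single class.
class_values = [4.9, 5.1, 5.0, 5.2, 4.8]
mean, var = fit_feature(class_values)

# A value near the class mean gets a higher density than one far from it.
near = gaussian_likelihood(5.0, mean, var)
far = gaussian_likelihood(6.5, mean, var)
print(mean, var, near, far)
```

A full classifier would repeat this estimate for every feature and every class, then combine the per-feature likelihoods with the class prior.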
Naive Bayes classifiers can perform well even with high-dimensional data points and/or a large number of data points. Note that there is very little explicit training in Naive Bayes compared to other common classification methods.
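What is the Naive Bayes Algorithm?
To make classifications, we need to use X to predict Y. In other words, given a data point X=(x1,x2,…,xn), what is the probability of Y being y? Bayes’ theorem rewrites this as P(Y=y | X) = P(X | Y=y) P(Y=y) / P(X), and the naive independence assumption lets P(X | Y=y) be computed as the product of the individual P(xi | Y=y).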
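To make the question above concrete, here is a minimal sketch that scores each class by its prior multiplied by the per-feature likelihoods, normalises the scores, and picks the class with the highest posterior. The function name `posterior`, the class labels, and all probabilities are hypothetical numbers invented for illustration, and the sketch multiplies only over the features that are observed as present.

```python
def posterior(priors, likelihoods, observed):
    """Apply Bayes' theorem with the naive independence assumption.

    priors:      {class: P(y)}
    likelihoods: {class: {feature: P(feature present | y)}}
    observed:    features present in the data point
    """
    # Unnormalised score: P(y) times the product of P(x_i | y) over observed features.
    scores = {}
    for label, prior in priors.items():
        score = prior
        for feature in observed:
            score *= likelihoods[label][feature]
        scores[label] = score

    # Divide by the evidence P(X) so that the posteriors sum to 1.
    evidence = sum(scores.values())
    return {label: score / evidence for label, score in scores.items()}


# Hypothetical numbers, invented for illustration only.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}

post = posterior(priors, likelihoods, observed=["free"])
prediction = max(post, key=post.get)  # Maximum A Posteriori decision
print(post, prediction)
```

Because the evidence P(X) is the same for every class, the final division does not change which class wins; it only turns the scores into probabilities.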
Using the Naive Bayes Classifier with the Scikit-learn Library
To use a Naive Bayes classifier, we will use the scikit-learn Python library, which has excellent documentation to get started with. If you search the scikit-learn documentation for Naive Bayes classification, it gives the following example for analyzing the Iris dataset (already included with scikit-learn’s datasets).
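The snippet below is a sketch along the lines of the Gaussian Naive Bayes example in the scikit-learn documentation. The 50/50 train/test split and `random_state=0` are assumptions chosen here so that the test set contains 75 points; treat it as an illustration rather than a verbatim copy of the documentation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the Iris dataset bundled with scikit-learn (150 samples, 4 features).
X, y = load_iris(return_X_y=True)

# Hold out half of the data for evaluation; the split and seed are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Learning phase: fit the model. Evaluation phase: predict on unseen data.
gnb = GaussianNB()
y_pred = gnb.fit(X_train, y_train).predict(X_test)

# Count the test points whose predicted class differs from the true class.
mislabeled = int((y_test != y_pred).sum())
print(
    "Number of mislabeled points out of a total %d points : %d"
    % (X_test.shape[0], mislabeled)
)
```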
Number of mislabeled points out of a total 75 points : 4