# Lesson 2: Naive Bayes

Lesson 2 of the Udacity Course UD120 – Intro to Machine Learning deals with Naive Bayes classification.

## Mini project

For the mini project, fork https://github.com/udacity/ud120-projects and clone your fork. A 64-bit build of Python 2.7 is recommended, because machine learning involves heavy data processing and can easily need more than 2 GB of memory, which a 32-bit interpreter may not be able to allocate.
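If you are not sure whether your interpreter is a 64-bit build, the size of a pointer gives it away. The following is a minimal sketch using only the standard library; the helper name `interpreter_bits` is made up for this example.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


print("Python {} is a {}-bit build".format(sys.version.split()[0], interpreter_bits()))
```

Run it with the same interpreter you plan to use for the project, since several Python installations can live side by side on one machine.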

### Dependencies

After cloning the repo, I would recommend setting up a venv and installing the requirements:

• scikit-learn (imported as sklearn)
• numpy
• scipy
• matplotlib
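To confirm that the venv really contains everything, you can try importing each package. This is only a sketch: the names `REQUIRED` and `missing_packages` are invented for this example, and the check only tells you whether a package can be imported, not whether its version is suitable.

```python
import importlib

# Import names of the packages listed above.
REQUIRED = ["sklearn", "numpy", "scipy", "matplotlib"]


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All dependencies are importable.")
```

Run it inside the activated venv, otherwise it checks the system-wide installation instead.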

### The Code

The code itself is pretty straightforward:

• Instantiate the classifier
• Train (fit) the classifier
• Predict
• Calculate accuracy
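
```python
# make print() behave the same on Python 2.7 and Python 3
from __future__ import print_function

from time import time

from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB

# features_train, labels_train, features_test and labels_test are assumed
# to have been loaded already by the project's starter code.

# training: instantiate the classifier and fit it on the training data
print("Start training")
t0 = time()
clf = GaussianNB()
clf.fit(features_train, labels_train)
print("training time:", round(time() - t0, 3), "s")

# prediction: predict the labels of the test data
print("start predicting")
t0 = time()
prediction = clf.predict(features_test)
print("predict time:", round(time() - t0, 3), "s")

# accuracy: compare the predicted labels with the true labels
print("Calculating accuracy")
accuracy = accuracy_score(labels_test, prediction)
print("Accuracy calculated, and the accuracy is", accuracy)
```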

The output on my machine:

```
training time: 1.762 s
start predicting
predict time: 0.286 s
Calculating accuracy
Accuracy calculated, and the accuracy is 0.9732650739476678
```

The simple Gaussian Naive Bayes classifier is pretty accurate: it labels about 97.3% of the test samples correctly.
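To get a feeling for what the four steps (instantiate, fit, predict, calculate accuracy) do, here is a toy Gaussian Naive Bayes in plain Python. It is a teaching sketch under my own assumptions, not scikit-learn's implementation: the class `TinyGaussianNB`, the function `accuracy`, the fixed `1e-9` variance offset and the tiny data set are all made up for this example.

```python
import math
from collections import defaultdict


class TinyGaussianNB:
    """Toy Gaussian Naive Bayes (hypothetical, for illustration only)."""

    def fit(self, features, labels):
        # Group the training rows by class label.
        rows_by_label = defaultdict(list)
        for row, label in zip(features, labels):
            rows_by_label[label].append(row)

        self.priors = {}  # label -> P(label)
        self.stats = {}   # label -> [(mean, variance), ...], one pair per feature
        for label, rows in rows_by_label.items():
            self.priors[label] = len(rows) / float(len(labels))
            self.stats[label] = []
            for column in zip(*rows):
                mean = sum(column) / float(len(column))
                variance = sum((x - mean) ** 2 for x in column) / float(len(column))
                # Small offset keeps the variance away from zero (a simplification).
                self.stats[label].append((mean, variance + 1e-9))
        return self

    def _log_posterior(self, row, label):
        # log P(label) plus the log of one Gaussian density per feature.
        total = math.log(self.priors[label])
        for x, (mean, variance) in zip(row, self.stats[label]):
            total -= 0.5 * math.log(2 * math.pi * variance)
            total -= (x - mean) ** 2 / (2 * variance)
        return total

    def predict(self, features):
        # Pick the label with the highest log posterior for every row.
        return [max(self.priors, key=lambda label: self._log_posterior(row, label))
                for row in features]


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for truth, guess in zip(labels, predictions) if truth == guess)
    return matches / float(len(labels))


# Made-up data: two well separated clusters with two features each.
features_train = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0],
                  [4.0, 5.1], [4.2, 4.9], [3.8, 5.0]]
labels_train = [0, 0, 0, 1, 1, 1]
features_test = [[1.1, 2.0], [4.1, 5.0]]
labels_test = [0, 1]

clf = TinyGaussianNB()
clf.fit(features_train, labels_train)
prediction = clf.predict(features_test)
score = accuracy(labels_test, prediction)
print("accuracy: {}".format(score))
```

The "naive" part is the inner loop in `_log_posterior`: every feature contributes its own independent Gaussian term, and the terms are simply added up in log space. On this toy data both test points land in the right cluster.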