Feature Scaling

What is Feature Scaling? Feature scaling is an important pre-processing step for some machine learning algorithms. Imagine you have three friends whose height and weight you know. You would like to deduce Christian’s t-shirt size from David’s and Julia’s by looking at their height and weight. Name Height in m Weight in…
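As a rough illustration of why scaling matters here, the sketch below applies min-max scaling to height and weight so that both features end up in the range 0 to 1. The teaser is cut off before the table, so the numbers are hypothetical placeholders, not the post's data. Min-max scaling is only one common choice, and the helper name `min_max_scale` is made up for this example.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest becomes 0 and the largest 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical measurements for three people (NOT taken from the original post).
heights_m = [1.79, 1.69, 1.75]
weights_kg = [59.0, 70.0, 77.0]

scaled_heights = min_max_scale(heights_m)
scaled_weights = min_max_scale(weights_kg)

# After scaling, both features span the same range, so neither one
# dominates a distance-based comparison just because of its unit.
print(scaled_heights)
print(scaled_weights)
```

Without scaling, a difference of a few kilograms would outweigh any realistic difference in metres when the two features are simply added together.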

Receiver Operating Characteristic

ROC Curve: Like Precision and Recall, which we already introduced, the ROC curve is another way of looking at the quality of classification algorithms. ROC stands for Receiver Operating Characteristic. The ROC curve is created by plotting the true positive rate (TPR) on the y-axis against the false positive rate (FPR) on the x-axis at various…
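To make the TPR-versus-FPR idea concrete, here is a minimal sketch that computes one (FPR, TPR) point per decision threshold. The labels, scores, and thresholds are invented for illustration, and the helper `roc_points` is hypothetical rather than something from the post.

```python
def roc_points(labels, scores, thresholds):
    """Return one (FPR, TPR) pair per threshold.

    labels: 1 for positive, 0 for negative.
    scores: classifier scores; a sample is predicted positive if score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in thresholds:
        true_pos = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        false_pos = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Hypothetical data: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores, thresholds=[1.0, 0.75, 0.5, 0.0])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves the point from (0, 0) towards (1, 1). Plotting these points with FPR on the x-axis and TPR on the y-axis gives the curve.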

Intro to OpenCV with Python

Installation: To work with OpenCV from Python, you need to install it first:

    pip install opencv-python

Reading images from file: After importing cv2, we can work with images directly:

    import cv2
    img = cv2.imread("doc_brown.png")

To show the image, matplotlib is recommended:

    import matplotlib.pyplot as plt
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)…

Confusion Matrix

Too confused by the confusion matrix? Let me bring some clarity to this topic! Let’s take the example from Precision and Recall:

    y_true = ["dog", "dog", "non-dog", "non-dog", "dog", "dog"]
    y_pred = ["dog", "non-dog", "dog", "non-dog", "dog", "non-dog"]

When we look at the predictions, we can count the correct and incorrect classifications: dog correctly classified…
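The counting step can be sketched in a few lines of plain Python. The two label lists are the ones quoted above. Treating "dog" as the positive class and laying the matrix out with actual classes as rows and predicted classes as columns are assumptions made for this example.

```python
from collections import Counter

y_true = ["dog", "dog", "non-dog", "non-dog", "dog", "dog"]
y_pred = ["dog", "non-dog", "dog", "non-dog", "dog", "non-dog"]

# Count how often each (actual, predicted) combination occurs.
pair_counts = Counter(zip(y_true, y_pred))

# "dog" is treated as the positive class (an assumption for this sketch).
true_positives = pair_counts[("dog", "dog")]
false_negatives = pair_counts[("dog", "non-dog")]
false_positives = pair_counts[("non-dog", "dog")]
true_negatives = pair_counts[("non-dog", "non-dog")]

# Rows: actual class (dog, non-dog). Columns: predicted class (dog, non-dog).
confusion_matrix = [
    [true_positives, false_negatives],
    [false_positives, true_negatives],
]
print(confusion_matrix)  # [[2, 2], [1, 1]]
```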

numpy random choice

With numpy you can easily create test data with random_integers and randint:

    numpy.random.randint(low, high=None, size=None, dtype='l')
    numpy.random.random_integers(low, high=None, size=None)

random_integers includes the high boundary, while randint does not.

    >>> import numpy as np
    >>> np.random.random_integers(5)
    4
    >>> np.random.random_integers(5, size=(5))
    array([5, 3, 4, 1, 4])
    >>> np.random.random_integers(5, size=(5, 4))
    array([[2, 3, 3, 5], [1, 3, 1, 3],…
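The boundary difference is easy to check with randint alone, which is useful because random_integers has been deprecated since NumPy 1.11. The sketch below assumes NumPy is installed; the seed and sample size are arbitrary choices for this example.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make repeated runs comparable

# randint draws from the half-open interval [low, high): 5 is never returned.
half_open = np.random.randint(1, 5, size=1000)

# Passing high + 1 mimics the inclusive range 1..5 that random_integers(5) draws from.
inclusive = np.random.randint(1, 5 + 1, size=1000)

print(half_open.min(), half_open.max())
print(inclusive.min(), inclusive.max())
```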

Classification: Precision and Recall

In the realm of data science you’ll sooner or later encounter the terms “Precision” and “Recall”. But what do they mean? Clarification: Living together with little kids, you very often run into classification issues: My daughter really likes dogs, so seeing a dog is something positive. When she sees a normal dog, e.g. a…
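For readers who prefer numbers to definitions, here is a small sketch that computes both metrics with "dog" as the positive class. The label lists are the ones the Confusion Matrix post attributes to this example. The helper `precision_recall` is hypothetical and uses the standard definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN).

```python
def precision_recall(y_true, y_pred, positive):
    """Compute precision and recall for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for actual, predicted in pairs if actual == positive and predicted == positive)
    false_pos = sum(1 for actual, predicted in pairs if actual != positive and predicted == positive)
    false_neg = sum(1 for actual, predicted in pairs if actual == positive and predicted != positive)
    # Guard against division by zero when nothing was predicted (or is) positive.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


y_true = ["dog", "dog", "non-dog", "non-dog", "dog", "dog"]
y_pred = ["dog", "non-dog", "dog", "non-dog", "dog", "non-dog"]

precision, recall = precision_recall(y_true, y_pred, positive="dog")
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.67, recall=0.50
```

Here precision answers "of everything called a dog, how much really was one?", while recall answers "of all real dogs, how many were spotted?".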

UD120 – Intro to Machine Learning

One part of my bucket list for 2018 is finishing the Udacity course UD120: Intro to Machine Learning. The hosts of this course are Sebastian Thrun, formerly of Google X and founder of Udacity, and Katie Malone, creator of the Linear Digressions podcast. The course consists of 17 lessons. Every lesson has a couple of hours of video…