Receiver Operating Characteristic

ROC Curve: Having already introduced Precision and Recall, we now turn to the ROC curve, another way of looking at the quality of classification algorithms. ROC stands for Receiver Operating Characteristic. The ROC curve is created by plotting the true positive rate (TPR) on the y-axis against the false positive rate (FPR) on the x-axis at various…
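
To make the TPR-versus-FPR idea concrete, here is a minimal sketch in plain Python. The labels, scores and thresholds are made up for illustration, and `roc_points` is a hypothetical helper, not something taken from the post itself.

```python
def roc_points(y_true, scores, thresholds):
    """Return one (FPR, TPR) pair per threshold.

    y_true: 1 for the positive class, 0 for the negative class.
    scores: classifier scores; a score >= threshold counts as "positive".
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for t in thresholds:
        predicted = [s >= t for s in scores]
        tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical toy data: three positives, three negatives.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = roc_points(y_true, scores, thresholds=[1.0, 0.75, 0.5, 0.25, 0.0])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these pairs with FPR on the x-axis and TPR on the y-axis gives the curve. The strictest threshold lands at (0, 0) and the most lenient one at (1, 1).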

Introduction to Pandas

Pandas is a data analysis tool. Together with numpy and matplotlib, it is part of the data science stack. You can install it via pip install pandas. Working with real data: The data set we are using is the astronauts data set from Kaggle (download: NASA Astronauts from Kaggle). During this introduction we…

Python Pipfile and pipenv

If you have already read Python pip and virtualenv, you are familiar with the way Python handles requirements. But lo and behold, there is a new kid in town, or actually two new kids on the block: Pipfile and Pipenv, both with a capital “P”. If you are tired of creating and maintaining…

Intro to OpenCV with Python

Installation: To work with OpenCV from Python, you need to install it first. We also install numpy and matplotlib with pip install opencv-python numpy matplotlib. Reading images from file: After importing cv2, we can work with images directly like so: import cv2; img = cv2.imread("doc_brown.png"). For showing the image, it is recommended to…

Python data classes

A cool new feature made its way into Python 3.7: data classes. If you’ve already read my article about Lombok, the concept isn’t so new at all. With the new decorator @dataclass you can save a huge amount of time, because the methods __init__(), __repr__() and __eq__() are created for you! from dataclasses import dataclass @dataclass…
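
As a minimal sketch of what the decorator generates, consider the following class. `Astronaut` and its fields are hypothetical names chosen for illustration and are not the example from the article.

```python
from dataclasses import dataclass


@dataclass
class Astronaut:  # hypothetical example class
    name: str
    missions: int = 0  # fields with defaults must come after fields without


# __init__ is generated from the annotated fields.
a = Astronaut("Ada", 2)
b = Astronaut("Ada", 2)

# __repr__ is generated as well.
print(a)  # Astronaut(name='Ada', missions=2)

# __eq__ compares the instances field by field.
print(a == b)  # True
```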

New Blog Post

Confusion Matrix

Too confused by the confusion matrix? Let me bring some clarity to this topic! Let’s take the example from Precision and Recall: y_true = ["dog", "dog", "non-dog", "non-dog", "dog", "dog"] and y_pred = ["dog", "non-dog", "dog", "non-dog", "dog", "non-dog"]. When we look at the prediction, we can count the correct and incorrect classifications: dog correctly classified…
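
The counting step can be sketched with the standard library alone. This is one possible way to tally the four cells, treating “dog” as the positive class; the variable names tp, fn, fp and tn are my own and not necessarily those used in the post.

```python
from collections import Counter

y_true = ["dog", "dog", "non-dog", "non-dog", "dog", "dog"]
y_pred = ["dog", "non-dog", "dog", "non-dog", "dog", "non-dog"]

# Count each (actual, predicted) combination.
counts = Counter(zip(y_true, y_pred))

tp = counts[("dog", "dog")]          # dogs classified as dogs
fn = counts[("dog", "non-dog")]      # dogs classified as non-dogs
fp = counts[("non-dog", "dog")]      # non-dogs classified as dogs
tn = counts[("non-dog", "non-dog")]  # non-dogs classified as non-dogs

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=2 FN=2 FP=1 TN=1
```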

numpy random choice

With numpy you can easily create test data with random_integers and randint: numpy.random.randint(low, high=None, size=None, dtype='l') and numpy.random.random_integers(low, high=None, size=None). random_integers includes the high boundary, while randint does not. >>> import numpy as np >>> np.random.random_integers(5) 4 >>> np.random.random_integers(5, size=(5)) array([5, 3, 4, 1, 4]) >>> np.random.random_integers(5, size=(5, 4)) array([[2, 3, 3, 5], [1, 3, 1,…

Classification: Precision and Recall

In the realm of data science you’ll sooner or later encounter the terms “Precision” and “Recall”. But what do they mean? Clarification: Living with little kids, you very often run into classification issues. My daughter really likes dogs, so seeing a dog is something positive. When she sees a normal dog e.g. a…