Lesson 3: Support Vector Machines

The term Support Vector Machine (SVM) is a bit misleading. It is just the name of a very clever algorithm invented by two Russians in the 1960s. SVMs are used for classification and regression. For classification, an SVM finds the hyperplane that best separates two classes of data, i.e. the one with the largest margin to the nearest points of each class.
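To make "separates both classes best" a bit more concrete, here is a minimal sketch that compares a few candidate hyperplanes by their margin. It does not train an SVM; it only evaluates the criterion an SVM optimizes. The function name `geometric_margin`, the toy points and the candidate hyperplanes are assumptions made up for illustration.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance of any point to the hyperplane w.x + b = 0.

    The result is positive only if every point lies on the side
    indicated by its label (+1 or -1).
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Toy data (made up): two points per class.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
labels = [-1, -1, 1, 1]

# Three hypothetical hyperplanes (w, b); all of them separate the classes.
candidates = {
    "x = 2.5": ((1.0, 0.0), -2.5),
    "x = 3": ((1.0, 0.0), -3.0),
    "x + y = 5": ((1.0, 1.0), -5.0),
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}
best = max(margins, key=margins.get)
print(margins)
print("largest margin:", best)
```

All three lines classify the toy data correctly, but they differ in how much room they leave to the nearest point. An SVM searches for the hyperplane that maximizes exactly this quantity.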

Linear Algebra with numpy

Numpy is a package for scientific computing in Python. It is blazing fast due to its implementation in C. It is often used together with pandas, matplotlib and Jupyter notebooks. These packages are often referred to as the data science stack. Installation: You can install numpy via pip: pip install numpy. Basic Usage: In the datascience…
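Since the excerpt above is cut off, here is a small sketch of what basic linear algebra with numpy can look like. The arrays and variable names are made up for illustration and are not taken from the original post.

```python
import numpy as np

# Two vectors and their dot product.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
dot = a @ b  # same as np.dot(a, b)

# A matrix applied to a vector.
M = np.array([[2.0, 0.0],
              [0.0, 3.0]])
v = np.array([1.0, 1.0])
Mv = M @ v

# Solving the linear system M x = Mv recovers v.
x = np.linalg.solve(M, Mv)

print(dot, Mv, x)
```

The `@` operator performs matrix multiplication, while `*` on numpy arrays multiplies element by element.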

The Normal Distribution

Diving deeper into data science, I started to brush up my knowledge of math, especially statistics. The Mother of all Distributions: The normal distribution was formulated by Carl Friedrich Gauß in 1809 and can be implemented in Python as follows:

    def normal_distribution_pdf(x, mu=0, sigma=1):
        sqrt_two_pi = math.sqrt(2*math.pi)
        return math.exp(-(x-mu)**2 / 2 / sigma**2)…
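For reference, the density of the normal distribution is f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2)). The following is a self-contained sketch of that formula, not a completion of the truncated snippet above; the name `normal_pdf_sketch` is hypothetical.

```python
import math


def normal_pdf_sketch(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and std. dev. sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coefficient * math.exp(exponent)


# The density is highest at the mean and symmetric around it.
peak = normal_pdf_sketch(0.0)
left = normal_pdf_sketch(-1.0)
right = normal_pdf_sketch(1.0)
print(peak, left, right)
```

A quick sanity check for any density function is that it integrates to 1, which can be approximated by summing `normal_pdf_sketch(x) * step` over a fine grid.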

What is Cross-Validation in Data Science?

Motivation: Cross-validation is a technique to validate the quality of your machine learning model. To validate your model, you split your data into a training data set and a test data set.

-----------------------------------------------
|                         |                   |
|      training data      |     test data     |
|                         |                   |
-----------------------------------------------

More training data means a better model, more test…
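One common form of cross-validation is k-fold: the data is cut into k parts, and each part serves as the test set once while the rest is used for training. Below is a minimal sketch of the index bookkeeping in plain Python. The function name `k_fold_splits` is an assumption for illustration, and a real project would more likely use a library routine.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_splits(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample ends up in the test set exactly once, so the model is always evaluated on data it was not trained on. In practice the indices are usually shuffled first, which this sketch leaves out.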

Introduction to Jupyter Notebook

JuPyteR: Do you know the feeling of arriving late to a party when you encounter something new? But when you actually start telling others about it, you realize that it is not common knowledge at all. Jupyter Notebooks are one example. What is a Jupyter notebook? In my own words: a browser-based document-oriented command line style…

Data Science Overview

Questions: Data Science tries to answer one of the following questions:

Classification -> “Is it A or B?”
Clustering -> “Are there groups which belong together?”
Regression -> “How will it develop in the future?”
Association -> “What is happening very often together?”

There are two ways to tackle these problem domains with machine learning:…

SQL – the dark side

Sometimes your RDBMS does not allow you to make certain changes, such as updating a table without a WHERE clause that uses a key column. When you are really sure what you want to do, MySQL lets you switch this safety check off: SET SQL_SAFE_UPDATES = 0; Now the dirty brown magic can begin!