The Normal Distribution

Diving deeper into data science, I started to brush up on my math, especially statistics.
The Mother of all Distributions
The normal distribution was formulated by Carl Friedrich Gauß in 1809 and can be implemented in Python as follows:
    def normal_distribution_pdf(x, mu=0, sigma=1):
        sqrt_two_pi = math.sqrt(2*math.pi)
        return math.exp(-(x-mu)**2 / 2 / sigma**2)…
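For reference, here is a minimal, complete sketch of the density function. It follows the standard formula for the normal density, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), and is not necessarily identical to the code in the full post, which is cut off in the excerpt. The `import math` line and the final division are assumptions filled in from that formula.

```python
import math


def normal_distribution_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of the normal distribution at x.

    Sketch based on the textbook formula; sigma must be positive.
    """
    sqrt_two_pi = math.sqrt(2 * math.pi)
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sqrt_two_pi * sigma)


# Density of the standard normal at its mean (the peak of the bell curve).
peak = normal_distribution_pdf(0)
```

The density is symmetric around `mu`, so `normal_distribution_pdf(-1)` and `normal_distribution_pdf(1)` give the same value for the default parameters.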

What is Cross-Validation in Data Science?

Motivation
Cross-validation is a technique for validating the quality of your machine learning model. To validate your model, you split your training data into a training and a test data set.
    -----------------------------------------------
    |                             |               |
    |        training data        |   test data   |
    |                             |               |
    -----------------------------------------------
More training data means a better model, more test…
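The split described above can be repeated so that every sample lands in the test set exactly once. Below is a minimal pure-Python sketch of such a k-fold split over sample indices. The function name `k_fold_indices` and its parameters are hypothetical, chosen for illustration; the full post may use a different approach or a library helper.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering range(n_samples).

    Illustrative sketch: folds are contiguous and unshuffled, and the first
    n_samples % k folds receive one extra sample.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Ten samples split into three folds of sizes 4, 3, and 3.
folds = k_fold_indices(10, 3)
```

In practice you would usually shuffle the indices first, train on each `train` part, and average the scores measured on the matching `test` parts.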

Introduction to Jupyter Notebook

JuPyteR
Do you know the feeling of arriving late to a party when you encounter something new? But when you actually start telling others about it, you realize that it is not common knowledge at all. Jupyter notebooks are one example.
What is a Jupyter notebook? In my own words: a browser-based document-oriented command line style…

pip optional dependencies

Sometimes you want to make your Python package usable in different situations, e.g. with Flask, Bottle, or Django. If you want to minimize dependencies, you can declare an optional dependency in setup.py:
    extras_require={ 'flask': ['Flask>=0.8', 'blinker>=1.1'] }
Now you can install the library with:
    pip install raven[flask]
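Inside the package, code that relies on an optional dependency typically has to check whether that dependency is actually installed. The following is a minimal sketch of such a check using the standard library. The helper name `is_available` is hypothetical and not part of raven or setuptools; the `EXTRAS_REQUIRE` dictionary only mirrors the setup.py entry above.

```python
import importlib.util

# Mirrors the extras_require entry from setup.py shown above.
EXTRAS_REQUIRE = {"flask": ["Flask>=0.8", "blinker>=1.1"]}


def is_available(module_name):
    """Return True if a top-level module can be imported (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# Only enable the Flask integration when the extra was actually installed.
flask_enabled = is_available("flask") and is_available("blinker")
```

A package could use a flag like `flask_enabled` to decide whether to register its Flask-specific integration or skip it.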

NumPy linspace function

To create, for example, x-axis indices, you can use the linspace function from NumPy. You give it a range (e.g. 0 to 23) and the number of values to generate, and it will distribute the values evenly across that range. The stop value is included in the resulting array by default. Example:
    import numpy as np
    np.linspace(0,…
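To make the behavior concrete, here is a minimal pure-Python sketch of the same idea. It is an illustration of how evenly spaced values can be computed, not NumPy's actual implementation; the `endpoint` parameter is modeled on the parameter of the same name in `np.linspace`.

```python
def linspace(start, stop, num, endpoint=True):
    """Return num evenly spaced floats from start to stop (illustrative sketch)."""
    if num < 1:
        return []
    if num == 1:
        return [float(start)]
    # With the endpoint included there are num - 1 gaps between num values.
    divisions = num - 1 if endpoint else num
    step = (stop - start) / divisions
    return [start + i * step for i in range(num)]


hours = linspace(0, 23, 24)                        # 0.0, 1.0, ..., 23.0
quarters = linspace(0, 1, 4, endpoint=False)       # stop value excluded
```

With NumPy installed, `np.linspace(0, 23, 24)` should produce the same 24 values as a float array.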

Removing pyc files on server

Sometimes Python gives you a hard time when you deploy code to a server after changing the directory structure or simply moving files. With the following command you can remove the .pyc files in the working directory and its subdirectories:
    find . -name \*.pyc -delete
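If `find` is not available, or the cleanup should run as part of a Python deployment script, the same effect can be sketched with the standard library. The function name `remove_pyc_files` is hypothetical; the demonstration runs in a temporary directory so it does not touch real files.

```python
import tempfile
from pathlib import Path


def remove_pyc_files(root="."):
    """Delete all *.pyc files below root and return the removed paths."""
    removed = []
    # Materialize the matches first so we do not delete while still walking.
    for path in list(Path(root).rglob("*.pyc")):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "pkg").mkdir()
    (base / "pkg" / "module.py").write_text("x = 1\n")
    (base / "pkg" / "module.pyc").write_bytes(b"")
    (base / "stale.pyc").write_bytes(b"")
    removed_names = sorted(p.name for p in remove_pyc_files(base))
    remaining_names = sorted(p.name for p in base.rglob("*") if p.is_file())
```

Because `rglob` descends into every subdirectory, this also catches compiled files that Python 3 stores in `__pycache__` folders.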

Python 3 – there shall be just int

While trying to contribute to the Flask plugin flask-login, I added these lines:
    if isinstance(duration, (int, long)):
        duration = timedelta(seconds=duration)
Looks quite plausible, doesn’t it? But lo and behold: it doesn’t work under Python 3.x. Dang! The reason: Python 2 has two integer types, int and long. In Python 3 there is only int, which…
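One common way to make such a check work on both Python versions is to build the tuple of integer types once and fall back when `long` does not exist. This is a sketch of that general pattern, not necessarily the fix that ended up in flask-login; the names `integer_types` and `as_timedelta` are chosen here for illustration.

```python
from datetime import timedelta

try:
    integer_types = (int, long)  # noqa: F821  (Python 2: two integer types)
except NameError:
    integer_types = (int,)       # Python 3: int is the only integer type


def as_timedelta(duration):
    """Convert a number of seconds to a timedelta; pass timedeltas through."""
    if isinstance(duration, integer_types):
        return timedelta(seconds=duration)
    return duration


ninety_seconds = as_timedelta(90)
```

On Python 3 the `NameError` branch is taken, so `integer_types` ends up as `(int,)` and the `isinstance` check behaves as intended.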

Division in Python 2 vs 3

One major change in Python 3 is the behavior of the division operator /. In Python 2, dividing two integers yielded an integer rounded down (floor division), while using a float as dividend or divisor yielded a float. Because Python is dynamically typed, this behavior could lead to subtle issues. So PEP 238 changed…
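The difference between the two operators in Python 3 can be seen in a short example. The variable names are only for illustration.

```python
# Python 3: "/" is always true division, "//" is floor division.
true_quotient = 7 / 2       # float result, even for two integers
floor_quotient = 7 // 2     # integer result, rounded down
negative_floor = -7 // 2    # rounds toward negative infinity, not toward zero
float_floor = 7.0 // 2      # floor division of a float stays a float
```

Here `true_quotient` is `3.5`, `floor_quotient` is `3`, `negative_floor` is `-4`, and `float_floor` is `3.0`. In Python 2, `from __future__ import division` enables the same behavior for `/`.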

Bringing AJAX to Flask

Flask is a micro web framework that is really fun to use. With the following snippet you have a complete web app working within seconds.
    from flask import Flask  # 1
    app = Flask(__name__)  # 2
    @app.route('/')  # 3
    def hello_world():
        return 'Hello World!'
    if __name__ == '__main__':
        app.run()  # 4
All this snippet does is…