Python pip and virtualenv

After working with Python and external dependencies for a couple of years, I’ve run into the same kinds of problems again and again.

Bad habits

Say you have a global Python installation under e.g. C:\Python36 on Windows. When you start working on your first Python project, you want to use external packages, and you encounter pip as the dependency management tool. (pip has been part of the Python installation since 2.7.9 / 3.4.) So far so good.

But you keep installing all the packages into your global python installation.

This works as long as you are working on just one project, or if your strategy is either to always update all your projects to the latest versions of the packages in use or to never update at all.

As soon as you have two different projects, where project A needs package X in version 1.2.3 and project B needs the same package in version 2.3.4, you run into trouble.
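To make the conflict concrete, here is a minimal Python sketch. The package names, the versions, and the helper names `parse_pins` and `find_conflicts` are made up for illustration, and the parsing only understands simple `name==version` lines.

```python
def parse_pins(requirement_lines):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in requirement_lines:
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in spec:
            name, version = spec.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_conflicts(project_a, project_b):
    """Return {package: (version_a, version_b)} where the two projects disagree."""
    pins_a, pins_b = parse_pins(project_a), parse_pins(project_b)
    return {
        name: (pins_a[name], pins_b[name])
        for name in pins_a.keys() & pins_b.keys()
        if pins_a[name] != pins_b[name]
    }


# Hypothetical requirements of two projects sharing one global installation.
project_a = ["X==1.2.3", "Y==0.9"]
project_b = ["X==2.3.4", "Y==0.9"]

print(find_conflicts(project_a, project_b))
```

A single installation can hold only one version of X at a time, so whichever project installs last wins and the other one may break.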

And along comes – virtualenv

virtualenv is short for virtual environment and creates isolated local copies of your Python installation, each with its own set of installed packages. But first you have to install it globally once 🙂

$ pip install virtualenv

After that you can create a new environment via

$ virtualenv <Project_A>/venv

And activate it via (shown for Linux/macOS; on Windows run Scripts\activate instead)

$ cd <Project_A>/venv
$ source bin/activate

Notes: You can create the virtual environments anywhere you like, but I consider it good practice to place them under your project root directory.

When you are working with multiple venvs, e.g. for migrating from Python 2 to Python 3, you can also name them accordingly: venv27 and venv36 or the like.

The name in brackets in front of the prompt shows that you are in an active venv. With the command

$ deactivate

you can leave the environment.
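If you are ever unsure whether a script really runs inside an active venv, you can also ask the interpreter itself. This is only a small sketch: the function name `in_virtualenv` is my own, and the check relies on the `sys.real_prefix` / `sys.base_prefix` attributes that virtualenv and the standard library's venv module set.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the stdlib venv module and
    # newer virtualenv releases keep the global location in sys.base_prefix.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix
    )
    return base_prefix != sys.prefix


print("active venv:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Inside an activated environment the prefix points into your venv directory; after `deactivate` it points back to the global installation.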

IDE Support

When using the PyCharm IDE, the new project wizard can help you set up the venv.

A good requirement handling workflow

Sooner or later you discover that you can list all your project dependencies in a text file like requirements.txt and install them from there with:

$ pip install -r requirements.txt

But you have to decide if you want to add versions for the referenced packages or not.

Variant A (with exact versions)

# requirements.txt

Variant B (without versions)

# requirements.txt

Variant A has the advantage that you have a reproducible environment; Variant B lets you easily update to the latest versions of your packages (think: getting security patches). Now you have to choose 🙂
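As a rough illustration of the difference between the two variants, here is a small Python sketch. The file contents are hypothetical examples (the packages and versions are just placeholders I picked), and `unpinned` is a made-up helper that only looks for `==`.

```python
# Hypothetical file contents, for illustration only.
VARIANT_A = """\
# requirements.txt
requests==2.18.4
flask==0.12.2
"""

VARIANT_B = """\
# requirements.txt
requests
flask
"""


def unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version."""
    result = []
    for line in requirements_text.splitlines():
        spec = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if spec and "==" not in spec:
            result.append(spec)
    return result


print(unpinned(VARIANT_A))  # every line is pinned
print(unpinned(VARIANT_B))  # nothing is pinned
```

With the first file everyone installs exactly the same versions; with the second, pip picks whatever is the latest release on the day you install.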

But when there is an option A and an option B, there is always an option C as well.

A good approach comes from Kenneth Reitz:

It’s very simple: instead of having one requirements file, you have two:

  • requirements_to_freeze.txt

  • requirements.txt

In requirements_to_freeze.txt you put just the top-level dependencies; requirements.txt will contain all dependencies, including indirect ones, with exact versions.
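To see how the two files relate, the following sketch lists the packages that appear only in the frozen file, i.e. the indirect dependencies pulled in by your top-level ones. The example contents are illustrative, and `package_name` and `indirect_dependencies` are hypothetical helpers with deliberately simple parsing.

```python
import re


def package_name(requirement_line):
    """Extract a normalized package name from one requirement line."""
    spec = requirement_line.split("#", 1)[0].strip()
    # Cut at the first version operator, extras bracket, marker, or space.
    name = re.split(r"[=<>!~\[; ]", spec, maxsplit=1)[0]
    return name.lower().replace("_", "-")


def names(lines):
    """Collect the non-empty package names from a list of requirement lines."""
    return {name for name in map(package_name, lines) if name}


def indirect_dependencies(top_level_lines, frozen_lines):
    """Return packages listed in the frozen file but not in the top-level file."""
    return sorted(names(frozen_lines) - names(top_level_lines))


# Illustrative contents of the two files.
requirements_to_freeze = ["requests"]
requirements = [
    "certifi==2018.1.18",
    "chardet==3.0.4",
    "idna==2.6",
    "requests==2.18.4",
    "urllib3==1.22",
]

print(indirect_dependencies(requirements_to_freeze, requirements))
```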

Workflow for new projects

  • Create and activate a virtual environment
  • Add dependencies to requirements_to_freeze.txt
  • pip install -r requirements_to_freeze.txt
  • pip freeze > requirements.txt
  • Check both files into version control
  • Your colleague just runs
    • pip install -r requirements.txt

The big advantage of this approach: you have a kind of lockfile with the exact versions of your packages pinned down, but you can still easily update by installing from requirements_to_freeze.txt with pip's --upgrade option and then freezing again.
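If you prefer to script the update step, it could look roughly like the sketch below. The function names are my own invention; the only real commands involved are `pip install -r ... --upgrade` and `pip freeze`, run through the interpreter of the active venv.

```python
import subprocess
import sys


def pip_install_cmd(requirements_file, upgrade=False):
    """Build the 'pip install -r <file>' command for the current interpreter."""
    cmd = [sys.executable, "-m", "pip", "install", "-r", requirements_file]
    if upgrade:
        cmd.append("--upgrade")
    return cmd


def freeze_to(path):
    """Write the output of 'pip freeze' to the given lockfile path."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    with open(path, "w") as handle:
        handle.write(result.stdout)


def refresh_lockfile(top_level="requirements_to_freeze.txt", lockfile="requirements.txt"):
    """Upgrade the top-level dependencies, then pin everything again."""
    subprocess.run(pip_install_cmd(top_level, upgrade=True), check=True)
    freeze_to(lockfile)


# Show the command that refresh_lockfile() would run first.
print(" ".join(pip_install_cmd("requirements_to_freeze.txt", upgrade=True)))
```

Call `refresh_lockfile()` from within the activated venv, then review the diff of requirements.txt before checking it in.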

Automatic Dependency Tracking in PyCharm

Again, PyCharm is helpful when you have already set up a project and want to keep track of your dependencies.

You can point PyCharm to your requirements file, and it will tell you when you haven’t installed a listed requirement.
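Outside of an IDE you can do a similar check yourself. The sketch below is not how PyCharm implements it; it is just my own approximation using `importlib.metadata` (Python 3.8 or newer), with a hypothetical helper `missing_requirements` that ignores version constraints and only asks whether a package is installed at all.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def missing_requirements(requirement_lines):
    """Return the names of listed packages that are not installed."""
    missing = []
    for line in requirement_lines:
        spec = line.split("#", 1)[0].strip()
        if not spec or spec.startswith("-"):
            continue  # skip blanks, comments, and options such as -r or -e
        name = re.split(r"[=<>!~\[; ]", spec, maxsplit=1)[0]
        try:
            version(name)  # raises if no installed distribution has this name
        except PackageNotFoundError:
            missing.append(name)
    return missing


# The package name below is a placeholder that is presumably not installed.
print(missing_requirements(["surely-not-installed-pkg-0xyz==1.0"]))
```

To check a real project, pass it the lines of your requirements.txt, e.g. `missing_requirements(open("requirements.txt").read().splitlines())`.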