Manage Dependencies for a Python Service

09/04/2020 posted in  Languages

Why Dependency Management

When I was a Python beginner, I installed packages with pip install into my system's Python environment and used that global environment to develop all my projects.

But how can I publish a project?
How can I deploy a Python service to a cluster of hosts?
It's not feasible to log in to each host to download or update dependencies by hand.

That's why we need a way to manage dependencies:

  • When you deploy your service's source code to a new host, it can automatically download dependencies.
  • The versions of the dependencies should be consistent with your development environment.
  • You can develop several projects on your local dev desktop, and each project can have its own set of dependency versions.

Comparing the Java Way and the Python Way of Dependency Management

Based on my personal experience, there are two ways of managing dependencies: Classpath and Virtual Environment.
In the Java world, the JDK doesn't manage dependencies for you. Your service uses Maven/Gradle/Ant to download jars in the build phase of deployment, and these jars are added to the classpath when the JVM starts the service from the command line.
In the Python world, pip installs dependencies into the interpreter's global library path by default (e.g. /usr/local/lib/python3.7/site-packages), so we need a virtual environment to provide a project-local library path (e.g. ./venv/lib/python3.7/site-packages).

Node.js also takes the Virtual Environment way: dependencies are declared in a package.json file and installed into a local node_modules folder instead of globally.

virtualenv

Create venv folder with virtualenv

This assumes python3.7 is already installed on the dev or prod host.

# Install virtualenv to global environment
$ pip3 install virtualenv

# create a folder named venv as a local python environment
$ virtualenv venv --python=python3.7

# activate the local environment
$ source ./venv/bin/activate

Now ./venv/bin/ is at the front of PATH, so the python and pip commands resolve to the ones under this path. When you use pip to install dependencies, they go to ./venv/lib/python3.7/site-packages.
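One way to confirm that the activation took effect is to ask the interpreter itself. The following is a minimal sketch, assuming Python 3.3 or later (where sys.base_prefix exists); the function names are made up for illustration and are not part of virtualenv.

```python
import sys
import sysconfig


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def site_packages_dir() -> str:
    """Return the directory this interpreter uses for pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter:", sys.executable)
    print("site-packages:", site_packages_dir())
```

Run with the venv activated, the interpreter and site-packages paths should point inside ./venv; run without it, they should point at the global installation.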

Deploy dependencies

Include a text file of dependencies with versions in your project.
Conventionally it is named requirements.txt.

You can create and edit the file manually, or run pip freeze > requirements.txt to generate it from a snapshot of the packages currently installed in your venv.

An example of requirements.txt:

aiohttp==3.0.5
requests==2.18.4
beautifulsoup4==4.6.0
urllib3==1.22
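To check that a host really matches the pinned versions, the file can be compared against what is installed. This is only a sketch under a few assumptions: it handles exact name==version pins only (no ranges, extras, or URLs), it requires Python 3.8+ for importlib.metadata, and the helper names are hypothetical.

```python
from importlib import metadata  # available from Python 3.8


def parse_pins(text: str) -> dict:
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"not an exact pin: {line!r}")
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins: dict, lookup=installed_version) -> dict:
    """Map each package whose installed version differs from its pin to (pinned, installed)."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


EXAMPLE = """\
aiohttp==3.0.5
requests==2.18.4
beautifulsoup4==4.6.0
urllib3==1.22
"""

if __name__ == "__main__":
    for name, (pinned, found) in find_mismatches(parse_pins(EXAMPLE)).items():
        print(f"{name}: pinned {pinned}, installed {found}")
```

The lookup parameter exists only so the comparison can be exercised without touching the real environment.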

In the startup script of your service, run pip install -r requirements.txt to download all dependency libraries before running the service code. (Assuming you are deploying your service to a cluster of hosts, you want to include all the steps needed to run the service in one startup script.)
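The startup sequence can be sketched in Python as well. This is an illustration rather than a recommended deployment tool: the function names are hypothetical, and it assumes the script is launched with the venv's own interpreter and that the service entry point is a single file.

```python
import subprocess
import sys


def startup_commands(entrypoint: str, requirements: str = "requirements.txt",
                     python: str = sys.executable) -> list:
    """Build the two commands of the startup sequence, in order."""
    return [
        # "python -m pip" installs into the same interpreter (and therefore
        # the same venv) that will later run the service.
        [python, "-m", "pip", "install", "-r", requirements],
        [python, entrypoint],
    ]


def run_startup(entrypoint: str) -> None:
    """Install dependencies, then start the service; stop at the first failure."""
    for command in startup_commands(entrypoint):
        subprocess.run(command, check=True)
```

Calling run_startup("service.py") would install the pinned dependencies and then start the service; check=True makes a failed install abort the sequence instead of starting the service with missing libraries.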

  • Question: can we pin only the major version and let pip find the most recent minor version?

pipenv

pipenv is a newer tool that combines package installation and virtual environment management in a single command, which makes it easier to use.

Create virtual environment with pipenv

# install `pipenv` to global environment
$ pip3 install pipenv

# It can install dependencies in `requirements.txt` or `Pipfile` automatically if either one exists.
$ pipenv install

Note that there is no venv folder in your project path. Instead, pipenv creates a folder named after your project plus a short hash suffix under ~/.local/share/virtualenvs/ (pipenv --venv prints its path). In practice you never need to touch it or look at it.

Deploy dependencies

The actual dependency definition files are Pipfile and Pipfile.lock, both generated automatically by pipenv install. We don't need requirements.txt anymore. Instead, we include Pipfile and Pipfile.lock in our project source code.

Pipfile.lock pins the exact version of every dependency and records sha256 hashes of the package files. A Pipfile generated from a pinned requirements.txt also specifies full version numbers. We want to edit the Pipfile to use version ranges instead, so that our local dev environment picks up newer releases within each range and locks them into Pipfile.lock.
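Pipfile.lock is a JSON document, so it is easy to inspect which versions are currently locked. The sketch below assumes the usual layout with "default" and "develop" sections whose entries carry a "version" such as "==1.1.2"; the sample content, including the hash string, is a made-up placeholder rather than a real lock file.

```python
import json

# Placeholder lock content for illustration; a real file is written by pipenv.
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3.8"}},
  "default": {
    "flask": {
      "hashes": ["sha256:placeholder-not-a-real-hash"],
      "version": "==1.1.2"
    }
  },
  "develop": {}
}
"""


def locked_versions(lock_text: str, section: str = "default") -> dict:
    """Return {package: exact version} for one section of a Pipfile.lock."""
    lock = json.loads(lock_text)
    versions = {}
    for name, entry in lock.get(section, {}).items():
        # Entries without a "version" key are skipped in this sketch.
        if "version" in entry:
            versions[name] = entry["version"].lstrip("=")
    return versions


if __name__ == "__main__":
    for name, version in locked_versions(SAMPLE_LOCK).items():
        print(f"{name} is locked to {version}")
```

For a real project, the same function could be fed the text of the Pipfile.lock in the project root.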

An example Pipfile:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
flask = ">=1.1.0, <1.2.0"
flask-sqlalchemy = ">=2.4.0, <2.5.0"
flask-login = ">=0.5.0, <0.6.0"

[requires]
python_version = "3.8"
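To make the meaning of a range such as ">=1.1.0, <1.2.0" concrete, here is a simplified check written from scratch. It is not how pip or pipenv resolve versions (they follow the full PEP 440 rules, which also cover pre-releases, wildcards and other operators); this sketch only understands plain dotted integers and the five comparison operators listed in the code, and the function names are hypothetical.

```python
import operator

# Two-character operators come first so ">=" is never misread as ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text: str) -> tuple:
    """Turn '1.1.2' into (1, 1, 2); only plain dotted integers are supported."""
    return tuple(int(part) for part in text.strip().split("."))


def pad(left: tuple, right: tuple) -> tuple:
    """Pad the shorter version with zeros so that 1.1 compares equal to 1.1.0."""
    width = max(len(left), len(right))
    return (left + (0,) * (width - len(left)),
            right + (0,) * (width - len(right)))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated list of range clauses."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(*pad(candidate, bound)):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    for candidate in ("1.1.0", "1.1.2", "1.2.0"):
        print(candidate, satisfies(candidate, ">=1.1.0, <1.2.0"))
```

With the flask range from the Pipfile above, 1.1.2 falls inside the range while 1.2.0 does not, so a later lock would move within the 1.1 series only.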

On a prod host, your startup script will look like this:

$ pipenv install
$ pipenv run python service.py