- Why Dependency Management
- Compare Java Way and Python Way of Dependency Management
Why Dependency Management
When I was a beginner in Python, I would install packages with pip install into my system's Python environment and use this global environment to develop my projects.
But how can I publish a project?
How can I deploy a python service to a cluster of hosts?
It's not feasible to log in to each host to download or update dependencies.
That's why we need a way to manage dependencies:
- When you deploy your service's source code to a new host, it can automatically download dependencies.
- The versions of the dependencies should be consistent with your development environment.
- You can develop several projects in your local dev desktop and they can have different version sets.
Compare Java Way and Python Way of Dependency Management
Based on my personal experience, there are two ways of dependency management - Classpath and Virtual Environment.
In the Java world, the JDK doesn't manage dependencies for you. Your service uses Maven/Gradle/Ant to download jars in the build phase of deployment, and these jars are added to the classpath when the JDK starts the service from the command line.
In the Python world, pip downloads dependencies to Python's global library path by default (e.g. /usr/local/lib/python3.7/site-packages), so we need a virtual environment to act as a local library path.
NodeJS also follows the Virtual Environment way: it manages dependencies with a package.json file and installs them to a local node_modules folder instead of globally.
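To see which library paths an interpreter is actually using, a small check like the one below can help. It is only a sketch: the helper names in_virtual_environment and library_paths are my own, and it relies on the standard sys attributes that venv and recent virtualenv releases set (older virtualenv releases set sys.real_prefix instead).

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases point sys.prefix at the environment
    # while sys.base_prefix keeps pointing at the original installation.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def library_paths() -> list:
    """Return the entries of sys.path that hold third-party packages."""
    return [p for p in sys.path if p.endswith(("site-packages", "dist-packages"))]


if __name__ == "__main__":
    print("interpreter prefix:", sys.prefix)
    print("in a virtual environment:", in_virtual_environment())
    for path in library_paths():
        print("library path:", path)
```

Running it once with the system interpreter and once inside an activated environment should show the library path switching from the global location to the local one.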
venv folder with
This assumes python3.7 is already installed on the dev or prod host.
# Install virtualenv to global environment
$ pip3 install virtualenv
# create a folder named venv as a local python environment
$ virtualenv venv --python=python3.7
# activate the local environment
$ source ./venv/bin/activate
./venv/bin/ is added to the front of PATH, so the pip command now resolves to the one under this path. When you use pip to install dependencies, they go into this local environment instead of the global one.
Include a text file of dependencies with versions in your project.
Conventionally it is named
You can manually create and edit the file, or use pip freeze > requirements.txt to generate it from a snapshot of the currently installed packages.
An example of
aiohttp==3.0.5
requests==2.18.4
beautifulsoup4==4.6.0
urllib3==1.22
In the startup script of your service, you use pip install -r requirements.txt to download all dependency libraries before running the service code. (Assuming you are deploying your service to a cluster of hosts, you want to include all steps to run the service in a startup script.)
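Before a deployment relies on such a file, it can be worth checking that every line really is an exact pin. The parser below is a minimal sketch of that idea: parse_pinned_requirements is a hypothetical helper, it only understands lines of the form name==version, and it is not a replacement for pip's own requirements parsing.

```python
def parse_pinned_requirements(text: str) -> dict:
    """Map package names to exact versions for lines of the form name==version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"not an exact pin: {line!r}")
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# The same four dependencies as in the example file above.
EXAMPLE = """\
aiohttp==3.0.5
requests==2.18.4
beautifulsoup4==4.6.0
urllib3==1.22
"""

pins = parse_pinned_requirements(EXAMPLE)
```

A line such as requests>=2.18 raises ValueError here, which is the point: an unpinned line means hosts deployed at different times may end up with different versions.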
- Question: can we define only major versions and let pip find the most recent minor version?
A newer, better and easier virtual environment management tool is
Create virtual environment with
# install `pipenv` to global environment
$ pip3 install pipenv
# It can install dependencies in `requirements.txt` or `Pipfile` automatically if either one exists.
$ pipenv install
Note that there is no venv folder in your project path. Instead there will be a folder with a random name under ~/.local/share/virtualenvs/. In fact you never need to touch it or look at it.
The actual dependency definition file is Pipfile.lock, automatically generated by pipenv install. We don't need requirements.txt anymore. Instead, we include Pipfile.lock in our project source code.
Pipfile.lock acts as a lock on exact versions, with sha256 hashes to verify them. When it is first generated, Pipfile also specifies full version numbers. We want to edit Pipfile to use version ranges so that our local dev environment can pick up recent minor version updates of dependencies and then lock them.
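The sha256 part can be illustrated with the standard library. This is a sketch of the idea only, not pipenv's implementation: sha256_entry and matches_lock are hypothetical helpers, and the artifact bytes are made up, whereas a real lock file stores hashes of the package archives published on the index.

```python
import hashlib


def sha256_entry(data: bytes) -> str:
    """Return the digest of `data` in the 'sha256:<hex>' form used by lock files."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_lock(data: bytes, allowed_hashes: list) -> bool:
    """Accept an artifact only if its digest is one of the locked hashes."""
    return sha256_entry(data) in allowed_hashes


# Made-up artifact contents standing in for a downloaded package archive.
artifact = b"example package archive"
locked = [sha256_entry(artifact)]
```

If even one byte of the downloaded archive differs from what was locked, the digest changes and the check fails, so a locked install either reproduces the same files or stops.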
An example of
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
flask = ">=1.1.0, <1.2.0"
flask-sqlalchemy = ">=2.4.0, <2.5.0"
flask-login = ">=0.5.0, <0.6.0"

[requires]
python_version = "3.8"
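To make the meaning of a range like ">=1.1.0, <1.2.0" concrete, here is a simplified check. It is an assumption-laden sketch: parse_version and satisfies are hypothetical helpers, only plain dotted numeric versions and the five operators listed are handled, and the real tools follow the full PEP 440 rules rather than this shortcut.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.1.4' into (1, 1, 4); only plain dotted numbers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def pad(a: tuple, b: tuple) -> tuple:
    """Pad the shorter version with zeros so that 1.1 compares equal to 1.1.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version: str, spec: str) -> bool:
    """Check `version` against comma-separated clauses such as '>=1.1.0, <1.2.0'."""
    # Two-character operators come first so '>=' is not mistaken for '>'.
    operators = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in operators.items():
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(*pad(candidate, bound)):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True
```

With the flask entry above, satisfies("1.1.4", ">=1.1.0, <1.2.0") is true while satisfies("1.2.0", ">=1.1.0, <1.2.0") is false, so newer releases inside the range are picked up and anything at or beyond the upper bound is left out.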
On prod host, your startup script will be like this:
$ pipenv install
$ pipenv run python service.py