In my Python projects, I’ve been using the simple dependency management workflow that Kenneth Reitz called A Better Pip Workflow™. Basically, instead of one requirements.txt file, as is the basic convention, I maintain two files in my projects’ root directory:
- requirements-spec.txt: a list of my project’s direct dependencies (a.k.a. top-level or first-order dependencies), which are the packages that my code directly interacts with. The dependency listing must have a pinned version, and must include any extras that the project needs. For example, in a Django app, I could have a line for psycopg[c]==3.1.9.
- requirements-lock.txt: the list created by pip freeze > requirements-lock.txt after pip install -r requirements-spec.txt has been executed in a development environment. This contains the complete set of version-pinned packages my project uses, including indirect dependencies. For example, it could include pylint-plugin-utils==0.8.2, which is required by a direct dependency, pylint==2.17.4.
Note that Reitz used requirements-to-freeze.txt and requirements.txt as the filenames for these, respectively.
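To make the two-file split concrete, here is a minimal sketch of what each file might contain for a small Django project. The package names echo the examples above; the exact versions and the set of indirect dependencies shown are illustrative only:

```
# requirements-spec.txt -- direct dependencies only, each pinned, extras included
Django==4.2.2
psycopg[c]==3.1.9
pylint==2.17.4

# requirements-lock.txt -- written by `pip freeze`, never edited by hand
asgiref==3.7.2
astroid==2.15.5
Django==4.2.2
psycopg==3.1.9
psycopg-c==3.1.9
pylint==2.17.4
pylint-plugin-utils==0.8.2
sqlparse==0.4.4
```

Note that pip freeze does not reproduce the [c] extra syntax; the package the extra pulls in (psycopg-c) simply appears as an ordinary pinned entry in the lock file.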
Details
The motivation for this workflow is explained well in A Better Pip Workflow™, but there are some practical aspects not covered by the blog post. The following clarifications based on my experience should be helpful:
- It doesn’t really need mentioning, but to be completely explicit: you must use this workflow in conjunction with a Python virtual environment. That’s the other basic practice that comes with dependency pinning. For this I prefer the standard library’s venv tool over the older, third-party virtualenv.
- In Reitz’s original description, explicit versions are optional in requirements-spec.txt. For me, they are required. One of the main benefits of the two-file system really is to segregate the dependencies directly relevant to my project from those that are not. This way, I can focus my dependency management and integration testing on the direct dependencies, for which the versions should be controlled carefully. That makes pinning more valuable for that set, not less, so explicitly specifying versions should not be optional.
- requirements-lock.txt should not be edited directly; it should always be generated with pip freeze only. Here is the basic procedure for any package environment updates:
  - Edit requirements-spec.txt, adding or removing packages or updating the versions of existing items.
  - Run pip install -r requirements-spec.txt.
  - Run pip freeze > requirements-lock.txt to save the updated snapshot of the environment.
- You can have a separate requirements-spec-dev.txt for dependencies relevant only to dev environments, e.g. coverage and pylint, but this implies an additional requirements-lock-dev.txt too, and similar duplication of install & freeze commands. Personally, I maintain just one pair of requirements files and just let the dev dependencies get installed even in production deployments, though I do separate the listing of these dev packages in the spec file with a line break and comment (a sample layout follows this list). (I am not aware of any risks to the production deployment, and the added disk space usage is acceptable.)
- To update all the indirect dependencies at once, you can just delete the current virtual environment, create a fresh one, run pip install -r requirements-spec.txt, and then get a new snapshot with pip freeze > requirements-lock.txt (a command sketch follows this list). This should fetch the latest versions of indirect dependencies that still satisfy your direct dependencies’ requirements specifications. The procedure might seem a bit crude, but it’s effective and can be quite fast. Of course this also presumes that testing is done after updating to ensure nothing is broken. In theory, the risk of breakage caused by updates to indirect dependencies is lower than that due to direct dependencies, and the updates to direct dependencies are the ones that should be watched closely and done carefully.
- When removing direct dependencies, there might be a need to clean up orphaned indirect dependencies. (This can also happen with updates, when the new package versions drop obsolete dependencies.) For this case, the same process of recreating the virtual environment should work. This should get easier if or when pip gets a function like Debian’s apt autoremove.
- Some tools and environments, such as the Cloudflare Pages build system, expect exactly the requirements.txt filename, and hence won’t pick up the list in requirements-lock.txt. For these cases, you can simply create a requirements.txt that only contains “-r requirements-lock.txt” (shown after this list).
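As a sample layout for the single spec file with a separated dev section, here is a minimal sketch; the package names come from the examples above, and the versions are illustrative only:

```
# requirements-spec.txt
Django==4.2.2
psycopg[c]==3.1.9

# --- dev-only tools (still installed everywhere, as noted above) ---
coverage==7.2.7
pylint==2.17.4
```

pip treats lines starting with # as comments, so the separator exists purely for human readers.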
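And here is a command sketch of the “recreate the environment” refresh, assuming a POSIX shell and a virtual environment kept in a .venv directory at the project root (the directory name is my own convention, not something the workflow requires):

```
# Throw away the old environment and start from scratch
rm -rf .venv
python -m venv .venv
source .venv/bin/activate

# Reinstall from the spec; pip resolves indirect dependencies afresh,
# picking the newest versions the pinned direct dependencies allow
pip install -r requirements-spec.txt

# Record the new snapshot (and run the test suite before committing it)
pip freeze > requirements-lock.txt
```

Because nothing from the old environment survives the rm -rf, this also takes care of the orphaned-package cleanup mentioned above.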
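Finally, the requirements.txt shim for tools that hard-code that filename is literally a one-line file; the -r include directive is a standard feature of pip requirements files, so a plain pip install -r requirements.txt ends up installing the contents of the lock file:

```
# requirements.txt -- shim for tools that expect this exact filename
-r requirements-lock.txt
```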
Regarding my preference for the “spec” and “lock” filenames: I use these instead of Reitz’s requirements-to-freeze.txt + requirements.txt because they are more meaningful. I am usually only concerned with direct dependencies, so the spec file appropriately contains all the packages I normally pay attention to. As for requirements-lock.txt, the name is more instructive than a plain requirements.txt sitting next to the spec file, signalling to devs browsing the project files that I’m employing a different pip workflow.
Why not use Pipenv?
In the original “Better Pip Workflow” blog post, Reitz says “I don’t want another tool in my toolchain; this should be possible with the tools available,” and yet, possibly due to the practical inconveniences I detailed above, he went on to make Pipenv, a now-popular and PyPA-recommended tool for solving the same problems this simple workflow is meant to address—and then some.
That is well and good, and some of those added features should prove valuable for certain projects, perhaps those with large numbers of dependencies. For me, the most compelling selling point is the claimed security benefits of Pipenv’s use of hash-checking. But if one’s threat model does not require such a measure, then the original lightweight workflow might be sufficient for dependency pinning, without the added complexities of introducing yet another tool into the environment.
For many projects, that could truly be the better pip workflow.