Deterministic Dependency Management (CN)#

Overview#

“Deterministic dependency” is a principle in software development for production projects. This means that if the code is not changed, building the project at any time will result in exactly the same bytes-by-bytes. For a production-ready project, this is a must-have feature to ensure software integrity and consistent delivery.

However, the popular pip install command does not provide deterministic dependencies and has several drawbacks:

  • If specific versions of dependencies are not defined, pip install may install the latest version, which is constantly changing over time.

  • Even if specific versions of dependencies are defined, it is not practical to specify the version for all of the dependencies that they rely on.

  • If there is a conflict between dependencies at either the root or child dependency level, pip cannot resolve it.

  • Even when using the pip freeze command to export specific versions of all dependencies, it does not provide integrity checks. Hackers may compromise the public PyPI repository and replace a version with malicious code, which has already happened many times in the industry.

Poetry is one of the tools that can ensure deterministic dependencies. Other alternatives include PDM, which is more powerful and adopts the latest Python standards, but is a younger project. And pipenv, an older tool with worse performance that may not always work correctly and is not recommended. For a comprehensive comparison of these tools, please see this blog post. In this project, I choose poetry to manage the Python dependencies.

From June 2020, Python community start using pyproject.toml file for project dependencies definition. In this project, I define four groups of dependencies:

  • main: core dependencies to run the application

  • dev: optional dependencies to speed up the development

  • test: dependencies for testing

  • automation: dependencies for DevOps automation scripts

Everytime you changed the pyproject.toml, you should run poetry lock to resolve the dependencies tree, and store the Deterministic results in the poetry.lock file and committed it to Git as a record. This allows us to rollback to a historical version of the dependencies if needed.