Third-party dependencies
How to use third-party Python libraries in your project.
Pants handles dependencies with more precision than traditional Python workflows. Traditionally, you have a single heavyweight virtual environment that includes a large set of dependencies, whether or not you actually need them for your current task.
Instead, Pants understands exactly which dependencies every file in your project needs, and efficiently uses just that subset of dependencies needed for the task.
$ ./pants dependencies src/py/app.py
//:flask
$ ./pants dependencies src/py/util.py
//:flask
//:requests
Among other benefits, this precise and automatic understanding of your dependencies gives you fine-grained caching. This means, for example, that if none of the dependencies for a particular test file have changed, the cached result can be safely used.
Teaching Pants your "universe" of dependencies
For Pants to know which dependencies each file uses, it must first know which specific dependencies are in your "universe", i.e. all the third-party dependencies your project uses.
Each third-party dependency is modeled by a python_requirement
target:
python_requirement(
name="Django",
requirements=["Django==3.2.1"],
)
To minimize boilerplate, Pants has macros to generate python_requirement
targets for you:
python_requirements
forrequirements.txt
.poetry_requirements
for Poetry projects.
requirements.txt
The python_requirements()
macro parses a requirements.txt
-style file to produce a python_requirement
target for each entry.
For example:
- requirements.txt
- BUILD
flask>=1.1.2,<1.3
requests[security]==2.23.0
dataclasses ; python_version<'3.7'
python_requirements()
# The above macro is equivalent to explicitly setting this:
python_requirement(
name="flask",
requirements=["flask>=1.1.2,<1.3"],
)
python_requirement(
name="requests",
requirements=["requests[security]==2.23.0"],
)
python_requirement(
name="dataclasses",
requirements=["dataclasses ; python_version<'3.7'"],
)
If the file uses a different name than requirements.txt
, set source
like this:
python_requirements(source="reqs.txt")
requirements.txt
?You can put the file wherever makes the most sense for your project.
In smaller repositories that only use Python, it's often convenient to put the file at the "build root" (top-level), as used on this page.
For larger repositories or multilingual repositories, it's often useful to have a 3rdparty
or 3rdparty/python
directory. Rather than the target's address being //:my_requirement
, its address would be 3rdparty/python:my_requirement
, for example.
BUT: if you place your requirements.txt
in a non-standard location (or give it another name via the python_requirements(source=..)
argument), you will need to configure pantsd
to restart for edits to the non-standard filename: see #9946.
Poetry
The poetry_requirements()
macro parses the Poetry section in pyproject.toml
to produce a python_requirement
target for each entry.
- pyproject.toml
- BUILD
[tool.poetry.dependencies]
python = "^3.8"
requests = {extras = ["security"], version = "~1"}
flask = "~1.12"
[tool.poetry.dev-dependencies]
isort = "~5.5"
poetry_requirements()
# The above macro is equivalent to explicitly setting this:
python_requirement(
name="requests",
requirements=["requests[security]>=1,<2.0"],
)
python_requirement(
name="flask",
requirements=["flask>=1.12,<1.13"],
)
python_requirement(
name="isort",
requirements=["isort>=5.5,<5.6"],
)
See the section "Lockfiles" below for how you can also hook up poetry.lock
to Pants.
How dependencies are chosen
Once Pants knows about your "universe" of dependencies, it determines which subset should be used through dependency inference. Pants will read your import statements, like import django
, and map it back to the relevant python_requirement
target. Run ./pants dependencies path/to/file.py
or ./pants dependencies path/to:target
to confirm this works.
If dependency inference does not work—such as because it's a runtime dependency you do not import—you can explicitly add the python_requirement
target to the dependencies
field, like this:
python_sources(
name="lib",
dependencies=[
# We don't have an import statement for this dep, so inference
# won't add it automatically. We add it explicitly instead.
"//:psyscopg2-binary",
],
)
Use modules
and module_mapping
when the module name is not standard
Some dependencies expose a module different than their project name, such as beautifulsoup4
exposing bs4
. Pants assumes that a dependency's module is its normalized name—i.e. My-distribution
exposes the module my_distribution
. If that default does not apply to a dependency, it will not be inferred.
Pants already defines a default module mapping for some common Python requirements, but you may need to augment this by teaching Pants additional mappings:
# `modules` and `module_mapping` is only needed for requirements where the defaults do not work.
python_requirement(
name="my_distribution",
requirements=["my_distribution==4.1"],
modules=["custom_module"],
)
python_requirements(
module_mapping={"my_distribution": ["custom_module"]},
)
poetry_requirements(
module_mapping={"my_distribution": ["custom_module"]},
)
If the dependency is a type stub, and the default does not work, set type_stub_modules
on the python_requirement
target, and type_stubs_module_mapping
on the python_requirements
and poetry_requirements
macros. (The default for type stubs is to strip off types-
, -types
, -stubs
, and stubs-
. So, types-requests
gives type stubs for the module requests
.)
Warning: multiple versions of the same dependency
When you have multiple targets for the same dependency—such as a target with Django==2
and another with Django==3
—dependency inference will not work due to ambiguity. Instead, you'll have to manually choose which target to use and add to the dependencies
field. Pants will warn when dependency inference is ambiguous.
This ambiguity is often a problem when you have 2+ requirements.txt
or pyproject.toml
files in your project, such as project1/requirements.txt
and project2/requirements.txt
both specifying Django
. If the version of the overlapping constraints are the same, you may wish to move the common requirements into a common requirements file.
It can, however, be helpful to have multiple targets for the same dependency. For example, this allows part of your repository to upgrade to Django 3 even if the rest of your code still uses Django 2:
python_requirement(
name="Django2",
requirements=["Django>=2.2.24,<3"]
)
python_requirement(
name="Django3",
requirements=["Django>=3.2.7,<4"]
)
python_source(
name="new-app",
source="new_app.py",
dependencies=[":Django3"],
)
python_source(
name="legacy-app",
source="legacy_app.py",
dependencies=[":Django2"],
)
Lockfiles
We strongly recommend using lockfiles because they make your builds more stable so that new releases will not break your project.
User lockfile
This lockfile is used for your own code, such as when packaging into a PEX binary or resolving dependencies for your tests.
We've been redesigning user lockfile support to be more ergonomic and flexible. In the meantime, there are limitations:
- Only one lockfile may be used for the whole repository.
- If you must use conflicting versions for the same dependency, such as
Django==2
andDjango==3
, then you'll need to leave the dependency out of the lockfile.
- If you must use conflicting versions for the same dependency, such as
- You can leave off dependencies, which means the lockfile might not be comprehensive.
- The lockfile must be manually generated.
- The lockfile cannot include
--hash
to ensure that the downloaded files are what is expected.
As explained above, Pants only uses the subset of the "universe" of your dependencies that is actually needed for a build, such as running tests and packaging a wheel file. This gives fine-grained caching and has other benefits like built packages (e.g. PEX binaries) only including their true dependencies. However, this also means that you may need to resolve dependencies multiple times, which can be slow.
If you use a lockfile, Pants will optimize to only resolve your requirements one time for your project. Then, for each build, Pants will extract from that resolve the exact subset needed. This greatly speeds up the performance of goals like test
, run
, package
, and repl
.
If the build's dependencies are not all included in the lockfile, this performance optimization will not be used.
To use a lockfile, first generate a constraints.txt
file, such as by using one of these techniques:
Technique | Command | Limitations |
---|---|---|
venv + pip freeze | Create a script like the one in "Set up a virtual environment" below. | The lockfile may not work on platforms and Python versions other than what was used to create the virtual env. |
pip-compile | pip-compile --allow-unsafe --strip-extras -o constraints.txt requirements.txt | The lockfile may not work on platforms and Python versions other than what was used to run pip-compile .Will not capture any python_requirement targets declared explicitly in BUILD files or in pyproject.toml . |
Poetry | poetry export --dev --without-hashes -o constraints.txt | Requires that you are using Poetry for dependency management. Will not capture any python_requirement targets declared explicitly in BUILD files or in requirements.txt . |
Then, set the option [python].requirements_constraints
like this:
- pants.toml
- constraints.txt
[python]
requirement_constraints = "constraints.txt"
certifi==2019.6.16
chardet==3.0.2
idna==2.7
requests==2.23.0
urllib3==1.25.6
Pants will then ensure that pip always uses the versions pinned in your lockfile when installing requirements.
Tool lockfiles
Each Python tool Pants runs for you—like Black, Pytest, Flake8, and MyPy—has a dedicated lockfile.
Pants distributes a lockfile with each tool by default. However, if you change the tool's version
and extra_requirements
—or you change its interpreter constraints to not be compatible with our default lockfile—you will need to use a custom lockfile. Set the lockfile
option in pants.toml
for that tool, and then run ./pants generate-lockfiles
.
[flake8]
version = "flake8==3.8.0"
lockfile = "3rdparty/flake8_lockfile.txt" # This can be any path you'd like.
[pytest]
extra_requirements.add = ["pytest-icdiff"]
lockfile = "3rdparty/pytest_lockfile.txt"
$ ./pants generate-lockfiles
19:00:39.26 [INFO] Completed: Generate lockfile for flake8
19:00:39.27 [INFO] Completed: Generate lockfile for pytest
19:00:39.29 [INFO] Wrote lockfile for the resolve `flake8` to 3rdparty/flake8_lockfile.txt
19:00:39.30 [INFO] Wrote lockfile for the resolve `pytest` to 3rdparty/pytest_lockfile.txt
You can also run ./pants generate-lockfiles --resolve=tool
, e.g. --resolve=flake8
, to only generate that tool's lockfile rather then generating all lockfiles.
Anytime your custom lockfile becomes stale, such as because you changed the version
again, Pants will error with instructions to regenerate the lockfile.
To disable lockfiles, set lockfile = "<none>"
for that tool, although we do not recommend this!
Advanced usage
Requirements with undeclared dependencies
Sometimes a requirement does not properly declare in its packaging metadata the other dependencies it depends on, so those will not be installed. It's especially common to leave off dependencies on setuptools
, which results in import errors like this:
import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
To work around this, you can use the dependencies
field of python_requirement
, so that anytime you depend on your requirement, you also bring in the undeclared dependency.
# First, make sure you have a `python_requirement` target for
# the undeclared dependency.
python_requirement(
name="setuptools",
requirements=["setuptools"],
)
python_requirement(
name="mongomock",
requirements=["mongomock"],
dependencies=[":setuptools"],
)
If you are using the python_requirements
and poetry_requirements
target generators, you can use the overrides
field to do the same thing:
- requirements.txt
- BUILD
setuptools
mongomock
python_requirements(
overrides={
"mongomock": {"dependencies": [":setuptools"]},
},
)
Version control and local requirements
You might be used to using pip's proprietary VCS-style requirements for this, like git+https://github.com/django/django.git#egg=django
. However, this proprietary format does not work with Pants.
Instead of pip VCS-style requirements:
git+https://github.com/django/django.git#egg=Django
git+https://github.com/django/django.git@stable/2.1.x#egg=Django
git+https://github.com/django/django.git@fd209f62f1d83233cc634443cfac5ee4328d98b8#egg=Django
Use direct references from PEP 440:
Django@ git+https://github.com/django/django.git
Django@ git+https://github.com/django/django.git@stable/2.1.x
Django@ git+https://github.com/django/django.git@fd209f62f1d83233cc634443cfac5ee4328d98b8
You can also install from local files using PEP 440 direct references. You must use an absolute path to the file, and you should ensure that the file exists on your machine.
Django @ file:///Users/pantsbuild/prebuilt_wheels/django-3.1.1-py3-none-any.whl
Pip still works with these PEP 440-compliant formats, so you won't be losing any functionality by switching to using them.
When using version controlled direct references hosted on private repositories with SSH access:
target@ git+ssh://[email protected]:/myorg/myrepo.git@myhash
...you may see errors like:
Complete output (5 lines):
[email protected]: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
----------------------------------------
To fix this, Pants needs to be configured to pass relevant SSH specific environment variables to processes by adding the following to pants.toml
:
[subprocess-environment]
env_vars.add = [
"SSH_AUTH_SOCK",
]
Custom repositories
If you host your own wheels at a custom index (aka "cheese shop"), you can instruct Pants to use it with the option indexes
in the [python-repos]
scope.
[python-repos]
indexes.add = ["https://custom-cheeseshop.net/simple"]
To exclusively use your custom index—i.e. to not use PyPI—use indexes = [..]
instead of indexes.add = [..]
.
You can also add Python repositories with the option repos
in the [python-repos]
scope.
[python-repos]
repos = ["https://your/repo/here"]
Indexes are assumed to have a nested structure (like http://pypi.org/simple), whereas repos are flat lists of packages.
Tip: Set up a virtual environment (optional)
While Pants itself doesn't need a virtualenv, it may be useful to create one for working with your tooling outside Pants, such as your IDE.
You can create a virtualenv using standard Python tooling—such as Python's builtin venv
module—along with running Pants to query for all of your Python requirements.
For example, this script will first create a venv, and then generate a constraints.txt
file to use as a lockfile.
#!/usr/bin/env bash
set -euo pipefail
# You can change these constants.
PYTHON_BIN=python3
VIRTUALENV=build-support/.venv
PIP="${VIRTUALENV}/bin/pip"
REQUIREMENTS_FILE=requirements.txt
CONSTRAINTS_FILE=constraints.txt
"${PYTHON_BIN}" -m venv "${VIRTUALENV}"
"${PIP}" install pip --upgrade
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement_library target.
"${PIP}" install \
-r "${REQUIREMENTS_FILE}" \
-r <(./pants dependencies :: |
xargs ./pants filter --target-type=python_requirement |
xargs ./pants peek |
jq -r '.[]["requirements"][]')
echo "# Generated by build-support/generate_constraints.sh on $(date)" > "${CONSTRAINTS_FILE}"
"${PIP}" freeze --all >> "${CONSTRAINTS_FILE}"
(Note that this requires installing the tool jq.)
Unless the dependency was specified in requirements.txt
, Pants will only capture it if it is actually used by your code.