Third-party dependencies
How to use third-party Python libraries in your project.
Pants handles dependencies with more precision than traditional Python workflows. Traditionally, you have a single heavyweight virtual environment that includes a large set of dependencies, whether or not you actually need them for your current task.
Instead, Pants understands exactly which dependencies every file in your project needs, and efficiently uses just that subset of dependencies needed for the task.
❯ ./pants dependencies src/py/util.py
3rdparty/py#requests
❯ ./pants dependencies --transitive src/py/app.py
3rdparty/py#flask
3rdparty/py#requests
Among other benefits, this precise and automatic understanding of your dependencies gives you fine-grained caching. This means, for example, that if none of the dependencies for a particular test file have changed, the cached result can be safely used.
First, turn off old-style macros
If you're starting a new project, set the following in pants.toml. This will become the default in Pants 2.11.
[GLOBAL]
use_deprecated_python_macros = false
If you are already using Pants, follow the instructions that Pants prints when you upgrade to Pants 2.10, and migrate to the new mechanism when ready.
Teaching Pants your "universe"(s) of dependencies
For Pants to know which dependencies each file uses, it must first know which specific dependencies are in your "universe", i.e. all the third-party dependencies your project directly uses.
By default, Pants uses a single universe for your whole project, but it's possible to set up multiple. See the header "Multiple lockfiles" in the "Lockfiles" section.
Each third-party dependency you directly use is modeled by a python_requirement target:
python_requirement(
name="django",
requirements=["Django==3.2.1"],
)
You do not need a python_requirement target for transitive dependencies, i.e. requirements that you do not directly import.
To minimize boilerplate, Pants has target generators to generate python_requirement targets for you:
- python_requirements for requirements.txt.
- poetry_requirements for Poetry projects.
requirements.txt
The python_requirements() target generator parses a requirements.txt-style file to produce a python_requirement target for each entry.
For example:
- requirements.txt
flask>=1.1.2,<1.3
requests[security]==2.23.0
dataclasses ; python_version<'3.7'
- BUILD
# This will generate three targets:
#
# - //:reqs#flask
# - //:reqs#requests
# - //:reqs#dataclasses
python_requirements(name="reqs")
# The above target generator is spiritually equivalent to this:
python_requirement(
name="flask",
requirements=["flask>=1.1.2,<1.3"],
)
python_requirement(
name="requests",
requirements=["requests[security]==2.23.0"],
)
python_requirement(
name="dataclasses",
requirements=["dataclasses ; python_version<'3.7'"],
)
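To make the mapping from file entries to generated targets concrete, here is a minimal sketch in plain Python. It is not Pants's actual parser: the function name `generated_addresses` and the regular expression are illustrative assumptions, and the sketch ignores edge cases such as URLs that contain `#`.

```python
import re
from typing import List

# Assumption: a project name is the leading run of letters, digits, '.', '_' and '-'
# that starts and ends with a letter or digit (the PEP 508 name rule).
_NAME = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def generated_addresses(requirements_txt: str, generator: str = "//:reqs") -> List[str]:
    """Return one hypothetical target address per requirement line."""
    addresses = []
    for line in requirements_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and options like `-r other.txt`
            continue
        match = _NAME.match(line)
        if match:
            addresses.append(f"{generator}#{match.group(1)}")
    return addresses


example = """\
flask>=1.1.2,<1.3
requests[security]==2.23.0
dataclasses ; python_version<'3.7'
"""
print(generated_addresses(example))
# ['//:reqs#flask', '//:reqs#requests', '//:reqs#dataclasses']
```

Extras such as `[security]`, version specifiers, and environment markers stay inside the generated target's requirements field; only the project name ends up in the address.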
If the file uses a different name than requirements.txt, set source like this:
python_requirements(source="reqs.txt")
Where should I put requirements.txt?
You can name the file whatever you want, and put it wherever makes the most sense for your project.
In smaller repositories that only use Python, it's often convenient to put the file at the "build root" (top-level), as used on this page.
For larger repositories or multilingual repositories, it's often useful to have a 3rdparty or 3rdparty/python directory. Rather than the target's address being //:reqs#my_requirement, its address would be 3rdparty/python:reqs#my_requirement, for example; or 3rdparty/python#my_requirement if you leave off the name field for python_requirements. See Target Generation.
Poetry
The poetry_requirements() target generator parses the Poetry section in pyproject.toml to produce a python_requirement target for each entry.
- pyproject.toml
[tool.poetry.dependencies]
python = "^3.8"
requests = {extras = ["security"], version = "~1"}
flask = "~1.12"
[tool.poetry.dev-dependencies]
isort = "~5.5"
- BUILD
# This will generate three targets:
#
# - //:poetry#flask
# - //:poetry#requests
# - //:poetry#isort
poetry_requirements(name="poetry")
# The above target generator is spiritually equivalent to this:
python_requirement(
name="requests",
requirements=["requests[security]>=1,<2.0"],
)
python_requirement(
name="flask",
requirements=["flask>=1.12,<1.13"],
)
python_requirement(
name="isort",
requirements=["isort>=5.5,<5.6"],
)
See the section "Lockfiles" below for how you can also hook up poetry.lock to Pants.
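The "spiritually equivalent" targets above turn Poetry's tilde constraints into PEP 440 ranges. The following is a minimal sketch of that conversion, assuming purely numeric versions; the helper name `tilde_to_pep440` is hypothetical and this is not the code Pants itself uses.

```python
def tilde_to_pep440(constraint: str) -> str:
    """Convert a Poetry tilde constraint such as "~1.12" into a PEP 440 range.

    Assumes a purely numeric version with one to three components.
    """
    version = constraint.lstrip("~").strip()
    parts = [int(part) for part in version.split(".")]
    if len(parts) == 1:
        # "~1" allows any 1.x release: bump the major version for the upper bound.
        upper = [parts[0] + 1]
    else:
        # "~1.12" and "~1.12.3" allow patch releases only: bump the minor version.
        upper = [parts[0], parts[1] + 1]
    return f">={version},<{'.'.join(str(part) for part in upper)}"


print(tilde_to_pep440("~1"))     # >=1,<2
print(tilde_to_pep440("~1.12"))  # >=1.12,<1.13
print(tilde_to_pep440("~5.5"))   # >=5.5,<5.6
```

The upper bound `<2` is equivalent to the `<2.0` shown in the example above.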
How dependencies are chosen
Once Pants knows about your "universe"(s) of dependencies, it determines which subset should be used through dependency inference. Pants will read your import statements, like import django, and map them back to the relevant python_requirement target. Run ./pants dependencies path/to/file.py or ./pants dependencies path/to:target to confirm this works.
If dependency inference does not work, for example because it's a runtime dependency you do not import, you can explicitly add the python_requirement target to the dependencies field, like this:
python_sources(
name="lib",
dependencies=[
# We don't have an import statement for this dep, so inference
# won't add it automatically. We add it explicitly instead.
"3rdparty/python#psycopg2-binary",
],
)
Use modules and module_mapping when the module name is not standard
Some dependencies expose a module different than their project name, such as beautifulsoup4 exposing bs4. Pants assumes that a dependency's module is its normalized name, i.e. My-distribution exposes the module my_distribution. If that default does not apply to a dependency, it will not be inferred.
Pants already defines a default module mapping for some common Python requirements, but you may need to augment this by teaching Pants additional mappings:
# `modules` and `module_mapping` are only needed for requirements where
# the defaults do not work.
python_requirement(
name="my_distribution",
requirements=["my_distribution==4.1"],
modules=["custom_module"],
)
python_requirements(
name="reqs",
module_mapping={"my_distribution": ["custom_module"]},
)
poetry_requirements(
name="poetry",
module_mapping={"my_distribution": ["custom_module"]},
)
If the dependency is a type stub, and the default does not work, set type_stub_modules on the python_requirement target, and type_stubs_module_mapping on the python_requirements and poetry_requirements target generators. (The default for type stubs is to strip off types-, -types, -stubs, and stubs-. So, types-requests gives type stubs for the module requests.)
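The default stripping rule for type stubs can be sketched as follows. This is an illustration of the rule as described above, not Pants's implementation; `stubbed_module` is a hypothetical name, and the final hyphen-to-underscore step is an assumption.

```python
PREFIXES = ("types-", "stubs-")
SUFFIXES = ("-types", "-stubs")


def stubbed_module(stub_project: str) -> str:
    """Return the module that a type-stub distribution is assumed to describe."""
    name = stub_project.lower()
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):].replace("-", "_")
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)].replace("-", "_")
    # Nothing to strip: fall back to the plain name.
    return name.replace("-", "_")


print(stubbed_module("types-requests"))  # requests
print(stubbed_module("django-stubs"))    # django
```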
Warning: multiple versions of the same dependency
It's invalid in Python to have conflicting versions of the same requirement, e.g. Django==2 and Django==3. Instead, Pants supports "multiple resolves" (i.e. multiple lockfiles), as explained in the below section on lockfiles.
When you have multiple targets for the same dependency and they belong to the same resolve ("lockfile"), dependency inference will not work due to ambiguity. If you're using lockfiles, which we strongly recommend, the solution is to set the resolve field for problematic python_requirement targets so that each resolve has only one requirement and there is no ambiguity.
This ambiguity is often a problem when you have 2+ requirements.txt or pyproject.toml files in your project, such as project1/requirements.txt and project2/requirements.txt both specifying django. You may want to set up each poetry_requirements/python_requirements target generator to use a distinct resolve so that there is no overlap. Alternatively, if the versions are the same, you may want to consolidate the requirements into a common file.
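One way to picture the ambiguity: inference needs each (resolve, module) pair to have exactly one owning target. The sketch below groups hypothetical targets by that pair and reports the collisions. The tuple layout, addresses, and function name are illustrative assumptions, not a Pants API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def ambiguous_modules(
    targets: List[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Given (address, resolve, module) triples, return the (resolve, module)
    pairs that are provided by more than one target."""
    owners: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for address, resolve, module in targets:
        owners[(resolve, module)].append(address)
    return {key: addrs for key, addrs in owners.items() if len(addrs) > 1}


same_resolve = [
    ("project1:reqs#django", "python-default", "django"),
    ("project2:reqs#django", "python-default", "django"),
]
print(ambiguous_modules(same_resolve))
# {('python-default', 'django'): ['project1:reqs#django', 'project2:reqs#django']}

distinct_resolves = [
    ("project1:reqs#django", "project1", "django"),
    ("project2:reqs#django", "project2", "django"),
]
print(ambiguous_modules(distinct_resolves))  # {}
```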
Lockfiles
We strongly recommend using lockfiles because they make your builds more stable so that new releases of dependencies will not break your project. They also reduce the risk of supply chain attacks.
Pants has two types of lockfiles:
- User lockfiles, for your own code such as packaging binaries and running tests.
- Tool lockfiles, to install tools that Pants runs like Pytest and Flake8.
With both types of lockfiles, Pants can generate the lockfile for you with the generate-lockfiles goal. However, there are several situations where this does not work properly, and you may need to generate the lockfile manually. This will be improved in future Pants versions. See the below section for more information.
User lockfiles
First, set [python].enable_resolves in pants.toml:
[python]
enable_resolves = true
By default, Pants will write the lockfile to 3rdparty/python/default.lock. If you want a different location, change [python].resolves like this:
[python]
enable_resolves = true
resolves = { python-default = "lockfile_path.txt" }
Then, use ./pants generate-lockfiles to generate the lockfile.
❯ ./pants generate-lockfiles
19:00:39.26 [INFO] Completed: Generate lockfile for python-default
19:00:39.29 [INFO] Wrote lockfile for the resolve `python-default` to 3rdparty/python/default.lock
Alternatively, if you are manually generating the lockfile, set [python].resolves_generate_lockfiles to false, and point [python].resolves to the path of your lockfile. Pants will still consume it as usual, but it will not manage it for you, for example by checking when it needs to be regenerated.
[python]
enable_resolves = true
resolves_generate_lockfiles = false
resolves = { python-default = "lockfile_path.txt" }
As explained at the top of these docs, Pants only uses the subset of the "universe" of your dependencies that is actually needed for a build, such as running tests and packaging a wheel file. This gives fine-grained caching and has other benefits like built packages (e.g. PEX binaries) only including their true dependencies. However, this also means that you may need to resolve dependencies multiple times, which can be slow.
If you use lockfiles, Pants will optimize to only resolve your requirements one time for your project. Then, for each build, Pants will extract from that resolve the exact subset needed.
This greatly speeds up performance and improves caching for goals like test, run, package, and repl.
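Conceptually, extracting a subset from a single resolve is a transitive-closure walk over the locked dependency graph. The sketch below illustrates the idea with a hand-written graph; the data structure and the package relationships in it are assumptions made for illustration, not the format of a real Pants lockfile.

```python
from typing import Dict, Iterable, List, Set


def subset_for(roots: Iterable[str], locked: Dict[str, List[str]]) -> Set[str]:
    """Return the roots plus everything they transitively depend on.

    `locked` maps each locked package to the packages it directly requires.
    """
    needed: Set[str] = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        stack.extend(locked[name])
    return needed


# A hypothetical whole-project resolve, computed once.
locked = {
    "flask": ["werkzeug", "jinja2"],
    "werkzeug": [],
    "jinja2": ["markupsafe"],
    "markupsafe": [],
    "requests": ["urllib3"],
    "urllib3": [],
}

# A test that only imports requests needs just two of the six packages.
print(sorted(subset_for(["requests"], locked)))  # ['requests', 'urllib3']
```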
Multiple lockfiles
While it's often desirable to have a single lockfile for the whole project for simplicity and consistency, sometimes you may need multiple. This is necessary, for example, when you have conflicting versions of requirements, such as part of your code using Django 2 and other parts using Django 3.
Start by defining multiple "resolves", which are logical names for lockfile paths. For example:
[python]
enable_resolves = true
default_resolve = "web-app"
[python.resolves]
data-science = "3rdparty/python/data_science_lock.txt"
web-app = "3rdparty/python/web_app_lock.txt"
Then, teach Pants which resolve every python_requirement target belongs to through the resolve field. It will default to [python].default_resolve.
python_requirement(
name="ansicolors",
requirements=["ansicolors==1.18"],
resolve="web-app",
)
# Often, you will want to set `resolve` on the
# `poetry_requirements` and `python_requirements`
# target generators.
poetry_requirements(
name="poetry",
resolve="data-science",
# You can use `overrides` if you only want to change
# some targets.
overrides={"requests": {"resolve": "web-app"}},
)
If you want the same requirement to show up in multiple resolves, you currently need to create a distinct target per resolve. This will be improved in Pants 2.11 through a new parametrize() mechanism.
# The same requirement in multiple resolves:
python_requirement(
name="ansicolors_web-app",
requirements=["ansicolors==1.18"],
resolve="web-app",
)
python_requirement(
name="ansicolors_data-science",
requirements=["ansicolors==1.18"],
resolve="data-science",
)
# Note that because BUILD files are Python, you could de-duplicate
# this by defining variables. You can also add a
# macro: https://www.pantsbuild.org/v2.10/docs/macros
Then, run ./pants generate-lockfiles to generate the lockfiles. If the results aren't what you'd expect, adjust the prior step.
Finally, update your first-party targets like python_source / python_sources, python_test / python_tests, and pex_binary to set their resolve field. As before, the resolve field defaults to [python].default_resolve.
python_sources(
resolve="web-app",
)
python_tests(
name="tests",
resolve="web-app",
# You can use `overrides` to change certain generated targets
overrides={"test_utils.py": {"resolve": "data-science"}},
)
pex_binary(
name="main",
entry_point="main.py",
resolve="web-app",
)
If a first-party target is compatible with multiple resolves, such as some utility code, you must for now create one target per resolve. This will be improved with Pants 2.11's parametrize feature.
All transitive dependencies of a target must use the same resolve. Pants's dependency inference already handles this for you by only inferring dependencies on targets that share the same resolve. If you incorrectly add a target from a different resolve to the dependencies field, Pants will error with a helpful message when building your code with goals like test, package, and run.
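The same-resolve rule can be pictured as a walk over a target's transitive dependencies that flags anything assigned to a different resolve. This is only a sketch of the rule; the addresses, dictionaries, and the function `mismatched_resolves` are hypothetical and unrelated to how Pants implements the check.

```python
from typing import Dict, List


def mismatched_resolves(
    root: str, resolves: Dict[str, str], dependencies: Dict[str, List[str]]
) -> List[str]:
    """Return the transitive dependencies of `root` whose resolve differs from root's."""
    expected = resolves[root]
    seen, stack, mismatched = set(), [root], []
    while stack:
        target = stack.pop()
        if target in seen:
            continue
        seen.add(target)
        if resolves[target] != expected:
            mismatched.append(target)
        stack.extend(dependencies.get(target, []))
    return sorted(mismatched)


resolves = {
    "src:app": "web-app",
    "src:util": "web-app",
    "//:reqs#numpy": "data-science",
}
dependencies = {"src:app": ["src:util"], "src:util": ["//:reqs#numpy"]}

print(mismatched_resolves("src:app", resolves, dependencies))  # ['//:reqs#numpy']
```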
Tool lockfiles
Pants distributes a lockfile with each tool by default. However, if you change the tool's version or extra_requirements, or you change its interpreter constraints so that they are no longer compatible with our default lockfile, you will need to use a custom lockfile. Set the lockfile option in pants.toml for that tool, and then run ./pants generate-lockfiles.
[flake8]
version = "flake8==3.8.0"
lockfile = "3rdparty/flake8_lockfile.txt" # This can be any path you'd like.
[pytest]
extra_requirements.add = ["pytest-icdiff"]
lockfile = "3rdparty/pytest_lockfile.txt"
❯ ./pants generate-lockfiles
19:00:39.26 [INFO] Completed: Generate lockfile for flake8
19:00:39.27 [INFO] Completed: Generate lockfile for pytest
19:00:39.29 [INFO] Wrote lockfile for the resolve `flake8` to 3rdparty/flake8_lockfile.txt
19:00:39.30 [INFO] Wrote lockfile for the resolve `pytest` to 3rdparty/pytest_lockfile.txt
You can also run ./pants generate-lockfiles --resolve=tool, e.g. --resolve=flake8, to only generate that tool's lockfile rather than generating all lockfiles.
To manually manage lockfiles, set the option [tool].lockfile to your lockfile path like normal. Do not run the generate-lockfiles goal, and also set [python].invalid_lockfile_behavior so that Pants does not look for its metadata header:
[python]
invalid_lockfile_behavior = "ignore"
To disable lockfiles entirely for a tool, set [tool].lockfile = "<none>" for that tool, although we do not recommend this.
generate-lockfiles goal vs. manual lockfile generation
generate-lockfiles limitations
Categorically, the generate-lockfiles goal cannot yet handle three use cases:
- Does not support [python-repos] if you have a custom index or repository other than PyPI.
- Does not support [GLOBAL].ca_certs_path.
- Does not support VCS (Git) requirements and local file requirements.
If you use any of these three features for a certain lockfile, unfortunately, you must manually generate that lockfile. Support for these use cases is coming in future Pants releases by teaching Pex to generate lockfiles via pip.
Several users have also reported that generate-lockfiles produces a lockfile successfully, but Pants then errors due to missing transitive dependencies when it tries to install from that lockfile. This is especially common with user lockfiles. For example:
Failed to resolve requirements from PEX environment @ /home/pantsbuild/.cache/pants/named_caches/pex_root/unzipped_pexes/42735ba5593c0be585614e50072f765c6a45be15.
Needed manylinux_2_28_x86_64-cp-37-cp37m compatible dependencies for:
1: colorama<0.5.0,>=0.4.0
Required by:
FingerprintedDistribution(distribution=rich 11.0.0 (/home/pantsbuild/.cache/pants/named_caches/pex_root/installed_wheels/4ce6259e437af26bac891ed2867340d4163662b9/rich-11.0.0-py3-none-any.whl), fingerprint='ff22612617b194af3cd95380174413855aad7240')
But this pex had no 'colorama' distributions.
Usually, the transitive dependency is in the lockfile, but it doesn't get installed because it has nonsensical environment markers, like this:
colorama==0.4.4; sys_platform == "win32" and python_version >= "3.6" and python_full_version >= "3.6.2" and python_full_version < "4.0.0" and (python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6") and (python_version >= "3.6" and python_full_version < "3.0.0" and sys_platform == "win32" or sys_platform == "win32" and python_version >= "3.6" and python_full_version >= "3.5.0") and (python_version >= "3.6" and python_full_version < "3.0.0" and platform_system == "Windows" or python_full_version >= "3.5.0" and python_version >= "3.6" and platform_system == "Windows")
For user lockfiles, the workaround is to treat the problematic transitive dependencies as direct inputs to the resolve by creating a python_requirement target, which usually causes the lockfile generator to handle things correctly. For example:
python_requirement(
name="bad_transitive_dependencies_workaround",
requirements=[
"colorama",
"zipp",
],
# This turns off dependency inference for these
# requirements, which you may want to do as they
# are transitive dependencies that should not be directly imported.
modules=[],
# If you are using multiple resolves, you may need to set the
# `resolve` field.
)
For tool lockfiles, add the problematic transitive dependencies to [tool].extra_requirements. For example:
[pylint]
version = "pylint>=2.11.0,<2.12"
extra_requirements.add = ["colorama"]
Then, regenerate the lock with generate-lockfiles.
You can also try manually removing the problematic environment markers, although you will need to remember to do this again whenever re-running generate-lockfiles.
Manual lockfile generation techniques
Pants is agnostic to how your lockfile is generated, as long as it's a valid requirements.txt-style file.
Users have had success with these three techniques to generate their user lockfiles:
Technique | Command | Limitations |
---|---|---|
venv + pip freeze | Create a script like the one below. If you have multiple resolves, run this script once per resolve. | The lockfile will not have --hash, which is less secure against supply chain attacks. This does allow you to use VCS (Git) requirements, however. The lockfile may not work on platforms and Python versions other than what was used to create the virtual env. |
pip-compile | pip-compile --generate-hashes --allow-unsafe -o lock.txt requirements.txt | The lockfile may not work on platforms and Python versions other than what was used to run pip-compile. Will not capture any python_requirement targets declared explicitly in BUILD files or in pyproject.toml. Does not account for multiple resolves. |
Poetry | poetry export --dev -o lock.txt | Requires that you are using Poetry for dependency management. Will not capture any python_requirement targets declared explicitly in BUILD files or in requirements.txt. Does not account for multiple resolves. |
Script to manually generate a user lockfile via pip freeze:
#!/usr/bin/env bash
set -euo pipefail
# You can change these constants.
PYTHON_BIN=python3
VIRTUALENV=build-support/.venv
PIP="${VIRTUALENV}/bin/pip"
LOCKFILE=lockfile.txt
"${PYTHON_BIN}" -m venv "${VIRTUALENV}"
"${PIP}" install pip --upgrade
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement target.
"${PIP}" install \
-r <(./pants dependencies :: |
xargs ./pants filter --target-type=python_requirement |
xargs ./pants peek |
jq -r '.[]["requirements"][]')
echo "# Generated by build-support/generate_constraints.sh on $(date)" > "${LOCKFILE}"
"${PIP}" freeze --all >> "${LOCKFILE}"
# If you are using multiple resolves, you will need to use JQ to filter to all
# requirements from a single resolve. For most resolves, use this JQ snippet:
#
# '.[] | select(.resolve == "my-resolve") | .["requirements"][]'
#
# If the resolve is the default, you must also add `or .resolve == null`, like this:
#
# '.[] | select(.resolve == "python-default" or .resolve == null) | .["requirements"][]'
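If you would rather not write jq filters, the same per-resolve selection can be sketched in Python. This assumes the JSON from ./pants peek is a list of objects with `requirements` and `resolve` keys, as the jq snippets in the script imply; the function name and the sample data are hypothetical.

```python
import json
from typing import List


def requirements_for_resolve(
    peek_json: str, resolve: str, default_resolve: str = "python-default"
) -> List[str]:
    """Collect requirement strings for one resolve from `./pants peek` JSON output.

    Targets whose `resolve` is missing or null are treated as using the default resolve.
    """
    wanted: List[str] = []
    for target in json.loads(peek_json):
        target_resolve = target.get("resolve")
        if target_resolve is None:
            target_resolve = default_resolve
        if target_resolve == resolve:
            wanted.extend(target.get("requirements", []))
    return wanted


# Hypothetical peek output for three requirement targets.
sample = json.dumps(
    [
        {"address": "//:reqs#flask", "requirements": ["flask>=1.1.2,<1.3"], "resolve": "web-app"},
        {"address": "//:reqs#requests", "requirements": ["requests==2.23.0"], "resolve": None},
        {"address": "//:reqs#numpy", "requirements": ["numpy"], "resolve": "data-science"},
    ]
)
print(requirements_for_resolve(sample, "web-app"))         # ['flask>=1.1.2,<1.3']
print(requirements_for_resolve(sample, "python-default"))  # ['requests==2.23.0']
```

You could write the returned strings to a temporary requirements file and pass it to pip install -r in place of the process substitution used in the script.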
Users have usually had more success using the generate-lockfiles goal to generate tool lockfiles, so no one has yet written a script to manually generate tool lockfiles. You can grab the requirements used by Pants, though, by inspecting ./pants help-advanced $tool. Or you can use ./pants help-all to get JSON that you can query with JQ, e.g. ./pants help-all | jq -r '.scope_to_help_info.isort.advanced'.
Advanced usage
Requirements with undeclared dependencies
Sometimes a requirement does not properly declare in its packaging metadata the other dependencies it depends on, so those will not be installed. It's especially common to leave off dependencies on setuptools, which results in import errors like this:
import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
To work around this, you can use the dependencies field of python_requirement, so that anytime you depend on your requirement, you also bring in the undeclared dependency.
# First, make sure you have a `python_requirement` target for
# the undeclared dependency.
python_requirement(
name="setuptools",
requirements=["setuptools"],
)
python_requirement(
name="mongomock",
requirements=["mongomock"],
dependencies=[":setuptools"],
)
If you are using the python_requirements and poetry_requirements target generators, you can use the overrides field to do the same thing:
- BUILD
python_requirements(
name="reqs",
overrides={
"mongomock": {"dependencies": [":reqs#setuptools"]},
},
)
- requirements.txt
setuptools
mongomock
Version control and local requirements
You might be used to using pip's proprietary VCS-style requirements for this, like git+https://github.com/django/django.git#egg=django. However, this proprietary format does not work with Pants.
Instead of pip VCS-style requirements:
git+https://github.com/django/django.git#egg=Django
git+https://github.com/django/django.git@stable/2.1.x#egg=Django
git+https://github.com/django/django.git@fd209f62f1d83233cc634443cfac5ee4328d98b8#egg=Django
Use direct references from PEP 440:
Django@ git+https://github.com/django/django.git
Django@ git+https://github.com/django/django.git@stable/2.1.x
Django@ git+https://github.com/django/django.git@fd209f62f1d83233cc634443cfac5ee4328d98b8
You can also install from local files using PEP 440 direct references. You must use an absolute path to the file, and you should ensure that the file exists on your machine.
Django @ file:///Users/pantsbuild/prebuilt_wheels/django-3.1.1-py3-none-any.whl
Pip still works with these PEP 440-compliant formats, so you won't be losing any functionality by switching to using them.
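If you have many pip VCS-style entries to migrate, the rewrite is mechanical: move the name from the `#egg=` fragment to the front. The sketch below shows the idea; `to_direct_reference` is a hypothetical helper, and it only handles URLs that carry an `#egg=` fragment.

```python
def to_direct_reference(pip_vcs_requirement: str) -> str:
    """Rewrite `<url>#egg=<name>` as the PEP 440 direct reference `<name>@ <url>`."""
    url, separator, fragment = pip_vcs_requirement.partition("#egg=")
    # Ignore any extra fragment parameters such as `&subdirectory=...`.
    name = fragment.split("&", 1)[0]
    if not separator or not name:
        raise ValueError("expected a '#egg=<name>' fragment")
    return f"{name}@ {url}"


print(to_direct_reference("git+https://github.com/django/django.git@stable/2.1.x#egg=Django"))
# Django@ git+https://github.com/django/django.git@stable/2.1.x
```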
When using version controlled direct references hosted on private repositories with SSH access:
target@ git+ssh://git@github.com:/myorg/myrepo.git@myhash
...you may see errors like:
Complete output (5 lines):
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
----------------------------------------
To fix this, configure Pants to pass the relevant SSH-specific environment variables to processes by adding the following to pants.toml:
[subprocess-environment]
env_vars.add = [
"SSH_AUTH_SOCK",
]
Custom repositories
If you host your own wheels at a custom index (aka "cheese shop"), you can instruct Pants to use it with the option indexes in the [python-repos] scope.
[python-repos]
indexes.add = ["https://custom-cheeseshop.net/simple"]
To exclusively use your custom index, i.e. to not use PyPI, use indexes = [..] instead of indexes.add = [..].
You can also add Python repositories with the option repos in the [python-repos] scope.
[python-repos]
repos = ["https://your/repo/here"]
Indexes are assumed to have a nested structure (like http://pypi.org/simple), whereas repos are flat lists of packages.
Tip: use ./pants export to create a virtual environment for IDEs
See Setting up an IDE for more information on ./pants export. This will create a virtual environment for your user code for compatibility with the rest of the Python ecosystem, e.g. IDEs like PyCharm.