Skip to main content
Version: 2.21 (deprecated)

Installing tools

Various methods for Pants to access the tools your plugin needs.


BinaryPaths: Find already installed binaries

For certain tools that are hard to automatically install—such as Docker or language interpreters—you may want to assume that the user already has the tool installed on their machine.

The simplest approach is to assume that the binary is installed at a fixed absolute path, such as /bin/echo or /usr/bin/perl. In the argv for your Process, use this absolute path as your first element.

If you instead want to allow the binary to be located anywhere on a user's machine, you can use BinaryPaths to search certain directories—such as a user's $PATH—to find the absolute path to the binary.

from pants.core.util_rules.system_binaries import (
BinaryPathRequest,
BinaryPaths,
ProcessResult,
Process,
)

@rule
async def demo(...) -> Foo:
docker_paths = await Get(
BinaryPaths,
BinaryPathRequest(
binary_name="docker",
search_path=["/usr/bin", "/bin"],
)
)
docker_bin = docker_paths.first_path
if docker_bin is None:
raise OSError("Could not find 'docker'.")
result = await Get(ProcessResult, Process(argv=[docker_bin.path, ...], ...))

BinaryPaths has a field called paths: tuple[BinaryPath, ...], which stores all the discovered absolute paths to the specified binary. Each BinaryPath object has the fields path: str, such as /usr/bin/docker, and fingerprint: str, which is used to invalidate the cache if the binary changes. The results will be ordered by the order of search_path, meaning that earlier entries in search_path will show up earlier in the result.

BinaryPaths also has a convenience property called first_path: BinaryPath | None, which will return the first matching path, if any.

In this example, the search_path is hardcoded. Instead, you may want to create a subsystem to allow users to override the search path through a dedicated option. See pex_environment.py for an example that allows the user to use the special string <PATH> to read the user's $PATH environment variable.

Checking for valid binaries (recommended)

When setting up a BinaryPathsRequest, you can optionally pass the argument test: BinaryPathTest. When discovering a binary, Pants will run your test and only use the binary if the return code is 0. Pants will also fingerprint the output and invalidate the cache if the output changes from before, such as because the user upgraded the version of the tool.

Why do this? This is helpful to ensure that all discovered binaries are valid and safe. This is also important for Pants to be able to detect when the user has changed the binary, such as upgrading its version.

BinaryPathTest takes the argument args: Iterable[str], which is the arguments that Pants should run on your binary to ensure that it's a valid program. Usually, you'll set args=["--version"].

from pants.core.util_rules.system_binaries import BinaryPathRequest, BinaryPathTest

BinaryPathRequest(
binary_name="docker",
search_path=["/usr/bin", "/bin"],
test=BinaryPathTest(args=["--version"]),
)

You can optionally set fingerprint_stdout=False to the BinaryPathTest constructor, but usually, you should keep the default of True.

ExternalTool: Install pre-compiled binaries

If your tool has a pre-compiled binary available online, Pants can download and use that binary automatically for users. This is often a better user experience than requiring the users to pre-install the tool. This will also make your build more deterministic because everyone will be using the same binary.

First, manually download the file. Typically, the downloaded file will be an archive like a .zip or .tar.xz file, but it may also be the actual binary. Then, run shasum -a 256 on the downloaded file to get its digest ID, and wc -c to get its number of bytes.

If the downloaded file is an archive, you will also need to find the relative path within the archive to the binary, such as bin/shellcheck. You may need to use a tool like unzip to inspect the archive.

With this information, you can define a new ExternalTool:

from pants.core.util_rules.external_tool import ExternalTool
from pants.engine.platform import Platform

class Shellcheck(ExternalTool):
options_scope = "shellcheck"
help = "A linter for shell scripts."

default_version = "v0.7.1"
default_known_versions = [
"v0.7.1|macos_arm64 |b080c3b659f7286e27004aa33759664d91e15ef2498ac709a452445d47e3ac23|1348272",
"v0.7.1|macos_x86_64|b080c3b659f7286e27004aa33759664d91e15ef2498ac709a452445d47e3ac23|1348272",
"v0.7.1|linux_arm64 |b50cc31509b354ab5bbfc160bc0967567ed98cd9308fd43f38551b36cccc4446|1432492",
"v0.7.1|linux_x86_64|64f17152d96d7ec261ad3086ed42d18232fcb65148b44571b564d688269d36c8|1443836",
]

def generate_url(self, plat: Platform) -> str:
platform_mapping = {
"macos_arm64": "darwin.x86_64",
"macos_x86_64": "darwin.x86_64",
"linux_arm64": "linux.aarch64",
"linux_x86_64": "linux.x86_64",
}
plat_str = platform_mapping[plat.value]
return (
f"https://github.com/koalaman/shellcheck/releases/download/{self.version}/"
f"shellcheck-{self.version}.{plat_str}.tar.xz"
)

def generate_exe(self, _: Platform) -> str:
return f"./shellcheck-{self.version}/shellcheck"

You must define the class properties default_version and default_known_versions. default_known_versions is a list of pipe-separated strings in the form version|platform|sha256|length. Use the values you found earlier by running shasum and wc for sha256 and length, respectively. platform should be one of linux_arm64, linux_x86_64, macos_arm64, and macos_x86_64.

You must also define the methods generate_url, which is the URL to make a GET request to download the file, and generate_exe, which is the relative path to the binary in the downloaded digest. Both methods take plat: Platform as a parameter.

Because an ExternalTool is a subclass of Subsystem, you must also define an options_scope. You may optionally register additional options from pants.option.option_types.

In your rules, include the ExternalTool as a parameter of the rule, then use Get(DownloadedExternalTool, ExternalToolRequest) to download and extract the tool.

from pants.core.util_rules.external_tool import DownloadedExternalTool, ExternalToolRequest
from pants.engine.platform import Platform

@rule
async def demo(shellcheck: Shellcheck, ...) -> Foo:
shellcheck = await Get(
DownloadedExternalTool,
ExternalToolRequest,
shellcheck.get_request(platform)
)
result = await Get(
ProcessResult,
Process(argv=[shellcheck.exe, ...], input_digest=shellcheck.digest, ...)
)

A DownloadedExternalTool object has two fields: digest: Digest and exe: str. Use the .exe field as the first value of a Process's argv, and use the .digest in the Process's input_digest. If you want to use multiple digests for the input, call Get(Digest, MergeDigests) with the DownloadedExternalTool.digest included.

Pex: Install binaries through pip

If a tool can be installed via pip - e.g., Pytest or Black - you can install and run it using Pex.

from pants.backend.python.target_types import ConsoleScript
from pants.backend.python.util_rules.interpreter_constraints import InterpreterConstraints
from pants.backend.python.util_rules.pex import (
Pex,
PexProcess,
PexRequest,
PexRequirements,
)
from pants.engine.process import FallibleProcessResult

@rule
async def demo(...) -> Foo:
pex = await Get(
Pex,
PexRequest(
output_filename="black.pex",
internal_only=True,
requirements=PexRequirements(["black==19.10b0"]),
interpreter_constraints=InterpreterConstraints([">=3.6"]),
main=ConsoleScript("black"),
)
)
result = await Get(
FallibleProcessResult,
PexProcess(pex, argv=["--check", ...], ...),
)

When defining a PexRequest for a tool, you must give arguments for output_filename, internal_only, requirements, main, and usually interpreter_constraints.

Set internal_only if the PEX is only used as an internal tool, rather than distributed to users (e.g. the package goal). This speeds up performance when building the PEX.

The main argument can be one of:

  • ConsoleScript("scriptname"), where scriptname is a console_script that the tool installs
  • EntryPoint.parse("module"), which executes the given module
  • EntryPoint.parse("module:func"), which executes the given nullary function in the given module.

There are several other optional parameters that may be helpful.

The resulting Pex object has a digest: Digest field containing the built .pex file. This digest should be included in the input_digest to the Process you run.

Instead of the normal Get(ProcessResult, Process), you should use Get(ProcessResult, PexProcess), which will set up the environment properly for your Pex to execute. There is a predefined rule to go from PexProcess -> Process, so Get(ProcessResult, PexProcess) will cause the engine to run PexProcess -> Process -> ProcessResult.

PexProcess requires arguments for pex: Pex, argv: Iterable[str], and description: str. It has several optional parameters that mirror the arguments to Process. If you specify input_digest, be careful to first use Get(Digest, MergeDigests) on the pex.digest and any of the other input digests.

Use PythonToolBase when you need a Subsystem

Often, you will want to create a Subsystem for your Python tool to allow users to set options to configure the tool. You can subclass PythonToolBase, which subclasses Subsystem, to do this:

from pants.backend.python.subsystems.python_tool_base import PythonToolBase
from pants.backend.python.target_types import ConsoleScript
from pants.option.option_types import StrOption

class Black(PythonToolBase):
options_scope = "black"
help = "The Black Python code formatter (https://black.readthedocs.io/)."

default_main = ConsoleScript("black")

register_interpreter_constraints = True
default_interpreter_constraints = ["CPython>=3.8,<3.9"]

default_lockfile_resource = ("pants.backend.python.lint.black", "black.lock")

config = StrOption(
default=None,
advanced=True,
help="Path to Black's pyproject.toml config file",
)

You must define the class properties options_scope and default_main, and a default lockfile at the location referenced by default_lockfile_resource.

Then, you can set up your Pex like this:

@rule
async def demo(black: Black, ...) -> Foo:
pex = await Get(Pex, PexRequest, black.to_pex_request())