Debugging and benchmarking
Some techniques to figure out why Pants is behaving the way it is.
Benchmarking with hyperfine
We use hyperfine (https://github.com/sharkdp/hyperfine) to benchmark, especially to compare performance before and after a change.
When benchmarking, you must decide whether you care about cold-cache performance, warm-cache performance, or both. For a cold cache, pass --no-pantsd --no-local-cache to Pants. For a warm cache, use hyperfine's --warmup=1 option.
For example:
❯ hyperfine --warmup=1 --runs=5 'pants list ::'
❯ hyperfine --runs=5 'pants --no-pantsd --no-local-cache lint ::'
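To put a number on a before/after comparison, hyperfine can write its results to a file with --export-json. The following is a minimal sketch, assuming one such export per run; the helper names are hypothetical, and the code relies only on the results[].mean and results[].stddev fields of the exported JSON.

```python
import json
from pathlib import Path


def summarize(export):
    """Return (mean, stddev) in seconds for the first command in a parsed hyperfine export."""
    result = export["results"][0]
    # stddev may be null (None) when there is only a single run, so fall back to 0.0
    return result["mean"], result.get("stddev") or 0.0


def speedup(before_mean, after_mean):
    """Ratio of mean runtimes; a value above 1.0 means the change made Pants faster."""
    return before_mean / after_mean


def compare_files(before_path, after_path):
    """Load two hyperfine JSON exports and describe the difference in one line."""
    before_mean, before_sd = summarize(json.loads(Path(before_path).read_text()))
    after_mean, after_sd = summarize(json.loads(Path(after_path).read_text()))
    return (
        f"before: {before_mean:.3f}s (+/- {before_sd:.3f}s), "
        f"after: {after_mean:.3f}s (+/- {after_sd:.3f}s), "
        f"speedup: {speedup(before_mean, after_mean):.2f}x"
    )
```

For example, run hyperfine --warmup=1 --runs=5 --export-json before.json 'pants list ::' on the base commit, repeat with after.json on your branch, and then call compare_files("before.json", "after.json"). The file names are placeholders.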
Profiling with py-spy
py-spy (https://github.com/benfred/py-spy) is a sampling profiler that can also be used to compare the impact of a change before and after.
To profile with py-spy:
- Activate Pants' development venv
source ~/.cache/pants/pants_dev_deps/<your platform dir>/bin/activate
- Add Pants' code to Python's path
export PYTHONPATH=src/python:$PYTHONPATH
- Run Pants with py-spy (be sure to disable pantsd)
py-spy record --subprocesses -- python -m pants.bin.pants_loader --no-pantsd <pants args>
The default output is a flamegraph. py-spy can also output speedscope (https://github.com/jlfwong/speedscope) JSON with the --format speedscope flag. The resulting file can be uploaded to https://www.speedscope.app/, which provides a per-process, interactive, detailed UI.
Additionally, to profile the Rust code, pass the --native flag to py-spy as well. The resulting output will contain frames from Pants' Rust code.
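The py-spy options above combine freely. The following is a minimal sketch of a helper that assembles the command line so the same invocation can be repeated before and after a change; the function name and the output file name are hypothetical, and it assumes the venv and PYTHONPATH steps have already been done in the calling shell.

```python
import shlex


def py_spy_command(pants_args, output=None, fmt=None, native=False):
    """Build a `py-spy record` argv that profiles Pants with pantsd disabled."""
    cmd = ["py-spy", "record", "--subprocesses"]
    if fmt:
        # e.g. "speedscope"; py-spy writes a flamegraph when no format is given
        cmd += ["--format", fmt]
    if output:
        cmd += ["--output", output]
    if native:
        # include frames from native (Rust) code
        cmd.append("--native")
    cmd += ["--", "python", "-m", "pants.bin.pants_loader", "--no-pantsd", *pants_args]
    return cmd


# Print a copy-pasteable command line for a speedscope profile of `pants list ::`.
print(shlex.join(py_spy_command(["list", "::"], output="profile.json", fmt="speedscope")))
```

The list form can also be passed directly to subprocess.run if you prefer to launch the profile from a script.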
Debugging rule code with a debugger
Running Pants with the PANTS_DEBUG environment variable set will use debugpy (https://github.com/microsoft/debugpy) to start a Debug Adapter server (https://microsoft.github.io/debug-adapter-protocol/), which will wait for a client connection before running Pants.
You can connect any Debug Adapter-compliant editor (such as VSCode) as a client, and then use breakpoints, inspect variables, run code in a REPL, and break on exceptions in your rule code.
NOTE: PANTS_DEBUG doesn't work with the Pants daemon, so --no-pantsd must be specified.
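The "wait for a client before running" behavior follows a common debugpy pattern. The following is an illustrative sketch of that pattern only, not Pants' actual implementation; the function name, host, and port are assumptions (5678 is merely a port often used in debugpy examples).

```python
import os


def maybe_wait_for_debugger(env=os.environ, host="127.0.0.1", port=5678):
    """Start a debug server and block until a client attaches, if PANTS_DEBUG is set.

    Returns True if a debugger was waited for, False if the variable was unset.
    """
    if not env.get("PANTS_DEBUG"):
        return False

    # Imported lazily so the normal (non-debug) path does not need debugpy installed.
    import debugpy

    debugpy.listen((host, port))  # start the Debug Adapter server
    debugpy.wait_for_client()     # block until an editor connects
    return True
```

An editor would then attach to the chosen host and port; any breakpoints set in rule code are hit once execution continues past the wait.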
Identifying the impact of Python's GIL (on macOS)
Obtaining Full Thread Backtraces
Pants runs as a Python program that calls into a native Rust library. In debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks in order to figure out where a deadlock may be occurring.
One-time setup:
- Ensure that gdb is installed.
- Ubuntu:
sudo apt install gdb
- Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
- To make the change permanent, add a file named 99-ptrace.conf to /etc/sysctl.d with the contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
- Ensure that the debug info for your system Python binary is installed.
- Ubuntu:
sudo apt install python3-dbg
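Before attaching, it can help to confirm what the kernel currently allows. The following is a minimal sketch that reads the Yama ptrace_scope setting mentioned above; the helper names are hypothetical, and the descriptions paraphrase the Linux Yama documentation.

```python
from pathlib import Path

# Meaning of each kernel.yama.ptrace_scope value.
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes running under the same uid",
    1: "restricted: only descendants (or explicitly permitted processes) can be attached to",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled",
}


def parse_ptrace_scope(text):
    """Parse the contents of /proc/sys/kernel/yama/ptrace_scope into an int."""
    return int(text.strip())


def unprivileged_attach_allowed(scope):
    """True only when gdb can attach to an unrelated process without extra privileges."""
    return scope == 0


def describe_current_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Read the live setting (Linux only) and return a human-readable description."""
    scope = parse_ptrace_scope(Path(path).read_text())
    return f"{scope}: {PTRACE_SCOPE_MEANINGS.get(scope, 'unknown value')}"
```

Calling describe_current_scope() on a Linux machine with Yama enabled shows whether the echo 0 step is still needed.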
Dumping thread stacks:
- Find the process ID of the running Pants process (the listing may include pantsd if pantsd is enabled).
- Run:
ps -ef | grep pants
- Invoke gdb with the python binary and the process ID:
- Run:
gdb /path/to/python/binary PROCESS_ID
- Enable logging to write the thread dump to gdb.txt:
set logging on
- Dump all thread backtraces:
thread apply all bt
- If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb (this example assumes version 3.8.5):
source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py
- Dump all Python stacks:
thread apply all py-bt
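The interactive steps above can also be run non-interactively. The following is a minimal sketch, assuming gdb is on the PATH; the function names are hypothetical. It uses gdb's -batch and -ex options to run the same commands and exit, capturing the output from stdout instead of relying on set logging on and gdb.txt.

```python
import subprocess


def gdb_backtrace_command(python_binary, pid, gdb_script=None):
    """Build a gdb argv that dumps all thread backtraces for `pid` and then exits."""
    cmd = ["gdb", "-batch"]
    if gdb_script:
        # Load the Python gdb helpers (e.g. the pyenv python3.8-gdb.py script) first.
        cmd += ["-ex", f"source {gdb_script}"]
    cmd += ["-ex", "thread apply all bt"]
    if gdb_script:
        # py-bt is only available once the helper script has been sourced.
        cmd += ["-ex", "thread apply all py-bt"]
    cmd += [python_binary, str(pid)]
    return cmd


def dump_thread_stacks(python_binary, pid, gdb_script=None):
    """Run gdb against a live process and return its output as text."""
    result = subprocess.run(
        gdb_backtrace_command(python_binary, pid, gdb_script),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the captured output is still useful
    )
    return result.stdout
```

Calling dump_thread_stacks("/path/to/python/binary", PROCESS_ID) returns the native backtraces; pass the path to the gdb script as gdb_script to include the Python stacks as well.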