Debugging and benchmarking
Some techniques to figure out why Pants is behaving the way it is.
Benchmarking with hyperfine
We use hyperfine (https://github.com/sharkdp/hyperfine) to benchmark, especially to compare timings before and after a change and see its impact.
When benchmarking, you must decide whether you care about cold-cache performance, warm-cache performance, or both. For cold cache, pass --no-pantsd --no-process-execution-local-cache to Pants. For warm cache, use hyperfine's option --warmup=1.
For example:
❯ hyperfine --warmup=1 --runs=5 './pants list ::'
❯ hyperfine --runs=5 './pants --no-pantsd --no-process-execution-local-cache lint ::'
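The two invocations above differ only in their flags, so they can be wrapped in a small helper. The following is a minimal sketch, not part of Pants or hyperfine: the function name bench_pants and the DRY_RUN variable are hypothetical, and the run count of 5 is simply copied from the examples.

```shell
# Hypothetical helper: bench_pants MODE GOAL...
#   MODE is "cold" or "warm"; GOAL is the Pants goal and targets, e.g. list ::
# Set DRY_RUN=1 (an assumption of this sketch) to print the command instead of running it.
bench_pants() {
  mode="$1"
  shift
  goal="$*"
  case "$mode" in
    cold)
      # Cold cache: disable pantsd and the local process cache on every run.
      hf_flags="--runs=5"
      cmd="./pants --no-pantsd --no-process-execution-local-cache $goal"
      ;;
    warm)
      # Warm cache: let hyperfine do one untimed warmup run first.
      hf_flags="--warmup=1 --runs=5"
      cmd="./pants $goal"
      ;;
    *)
      echo "usage: bench_pants cold|warm GOAL..." >&2
      return 2
      ;;
  esac
  if [ -n "${DRY_RUN:-}" ]; then
    printf "hyperfine %s '%s'\n" "$hf_flags" "$cmd"
  else
    # $hf_flags is intentionally unquoted so it splits into separate flags.
    hyperfine $hf_flags "$cmd"
  fi
}
```

For instance, DRY_RUN=1 bench_pants warm list :: prints the warm-cache command shown earlier, and bench_pants cold lint :: runs the cold-cache benchmark.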
Profiling with PySpy (TODO)
Identifying the impact of Python's GIL (on macOS)
Obtaining Full Thread Backtraces
Pants runs as a Python program that calls into a native Rust library. When debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks to figure out where a deadlock may be occurring.
One-time setup:
- Ensure that gdb is installed.
- Ubuntu:
sudo apt install gdb
- Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
- To make the change permanent, add a file to /etc/sysctl.d named 99-ptrace.conf with contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
- Ensure that the debug info for your system Python binary is installed.
- Ubuntu:
sudo apt install python3-dbg
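Before attaching gdb, it can help to confirm what the ptrace_scope setting currently is. The sketch below is one way to check it; the function name ptrace_scope_status is hypothetical, and the descriptions paraphrase the Yama modes 0 through 3 rather than quoting the kernel documentation.

```shell
# Hypothetical helper: describe the current Yama ptrace_scope setting.
# $1 (optional): path to the setting file; defaults to the real location.
ptrace_scope_status() {
  scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
  if [ ! -r "$scope_file" ]; then
    echo "ptrace_scope not found (Yama may not be enabled on this kernel)"
    return 0
  fi
  case "$(cat "$scope_file")" in
    0) echo "0: gdb can attach to any process owned by the same user" ;;
    1) echo "1: gdb can attach only to its own descendants" ;;
    2) echo "2: only root (CAP_SYS_PTRACE) can attach" ;;
    3) echo "3: attaching is disabled until reboot" ;;
    *) echo "unrecognized value" ;;
  esac
}

ptrace_scope_status
```

A value of 0 is what the setup step above configures; any other value means attaching to an already-running Pants process will likely need sudo or fail.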
Dumping thread stacks:
- Find the process ID of the running Pants process (the output may include pantsd if pantsd is enabled).
- Run:
ps -ef | grep pants
- Invoke gdb with the python binary and the process ID:
- Run:
gdb /path/to/python/binary PROCESS_ID
- Enable logging to write the thread dump to gdb.txt:
set logging on
- Dump all thread backtraces:
thread apply all bt
- If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb:
source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py
(if using version 3.8.5)
- Dump all Python stacks:
thread apply all py-bt
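The interactive steps above can also be collected into a gdb command file and run non-interactively. This is a rough sketch under assumptions: the helper name gdb_dump_commands and the file /tmp/dump.gdb are hypothetical, and output is captured by shell redirection rather than by gdb's own logging. The -batch and -x options are standard gdb options, but behavior may differ across gdb versions.

```shell
# Hypothetical helper: print the gdb commands for a full thread dump.
# $1 (optional): path to a python-gdb.py script to source, which enables py-bt.
gdb_dump_commands() {
  echo "set pagination off"
  echo "thread apply all bt"
  if [ -n "${1:-}" ]; then
    echo "source $1"
    echo "thread apply all py-bt"
  fi
}

# Usage (commented out so the snippet is safe to paste; substitute real values):
# gdb_dump_commands ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py > /tmp/dump.gdb
# gdb -batch -x /tmp/dump.gdb /path/to/python/binary PROCESS_ID > gdb.txt 2>&1
```

Running gdb this way keeps the attach time short, which matters when the target is a pantsd process that other Pants runs are waiting on.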