pikepdf has the complex job of providing bindings from Python to a C++ library,
and the two languages have different ideas about how to manage memory. This page
documents some methods that may help should it be necessary to debug the Python
C++ extension.
Using gdb to debug C++ and Python
Current versions of gdb can debug Python and C++ code simultaneously. See the Python developer’s guide on gdb Support. To use this effectively, create debug builds of both pikepdf and QPDF.
Compiling a debug build of QPDF
To compile a debug build of QPDF from a downloaded source tree:
# in QPDF source tree
cmake -S . -B build -DENABLE_QTC=ON -DCMAKE_BUILD_TYPE=Debug
cmake --build build -j
Enabling QPDF tracing
For builds of QPDF configured with ENABLE_QTC=ON, setting the environment variables
TC_SCOPE=qpdf and TC_FILENAME=your_log_file.txt will cause libqpdf to
log debug messages to the designated file. For example:
env TC_SCOPE=qpdf TC_FILENAME=libqpdf_log.txt python my_pikepdf_script.py
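When the traced script is launched from Python rather than from a shell, the same two variables can be set on a child process. The following is a minimal sketch under that assumption; the helper names `qtc_env` and `run_with_tracing` are hypothetical and not part of pikepdf or QPDF.

```python
import os
import subprocess
import sys


def qtc_env(log_file, scope="qpdf", base_env=None):
    """Return a copy of the environment with the QPDF tracing variables set.

    qtc_env is a hypothetical helper; TC_SCOPE and TC_FILENAME are the
    variables described above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TC_SCOPE"] = scope
    env["TC_FILENAME"] = log_file
    return env


def run_with_tracing(script, log_file="libqpdf_log.txt"):
    """Run a Python script in a child process with tracing variables set."""
    return subprocess.run(
        [sys.executable, script],
        env=qtc_env(log_file),
        check=False,  # report the exit code instead of raising
    )


# Example use (my_pikepdf_script.py is the placeholder name used above):
# result = run_with_tracing("my_pikepdf_script.py")
# print("exit code:", result.returncode)
```

The log file is only written if the libqpdf that pikepdf loads was built with ENABLE_QTC=ON.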
Valgrind may also be helpful; see the Python documentation for information on setting up Python and Valgrind.
The standard Python profiling tools in cProfile work fine for many purposes but cannot explore inside pikepdf’s C++ functions.
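As a rough illustration of the Python-level view that cProfile gives, the sketch below profiles a single call and prints the most expensive entries. The `workload` function is a hypothetical stand-in for code that would call pikepdf, and `profile_call` is an illustrative helper, not a pikepdf API.

```python
import cProfile
import io
import pstats


def workload():
    # Hypothetical stand-in for a function that calls into pikepdf
    return sum(i * i for i in range(100_000))


def profile_call(func, limit=10):
    """Profile one call to func and return a text report of the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the outermost expensive calls come first
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_call(workload)
print(report)
```

In a report like this, a call into a compiled extension would be expected to appear as a single entry with no breakdown of the C++ work beneath it, which is the limitation described above.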
The py-spy program can effectively profile time spent in Python or executing C++ code and demangle many C++ names to the appropriate symbols.
Happily it also does not require recompiling in any special mode, unless one desires more symbol information than libqpdf or the C++ standard library exports.
For best results, use py-spy to generate speedscope files and use the speedscope application to view them. As of this writing, py-spy’s SVG output is illegible because of long C++ template names.
To install and use the profiling software:
# From a virtual environment with pikepdf installed...
pip install py-spy
npm install -g speedscope # may need sudo to install this
# Run profile on a script that executes some pikepdf code we want to profile
py-spy record --native --format speedscope -o profile.speedscope -- python some_script.py
# View results (this will open a browser window)
speedscope profile.speedscope
To profile pikepdf’s test suite, ensure that you run pytest -n0 to disable parallel test execution across multiple CPUs, since py-spy cannot trace inside child processes.
pymemtrace is another helpful tool for diagnosing memory leaks.