Debugging
pikepdf does a complex job in providing bindings from Python to a C++ library,
two languages with different ideas about how to manage memory. This page
documents some methods that may help should it be necessary to debug the Python
C++ extension (pikepdf._qpdf).
Enabling QPDF tracing
Setting the environment variables TC_SCOPE=qpdf and
TC_FILENAME=your_log_file.txt will cause libqpdf to log debug messages to the
designated file. For example:
env TC_SCOPE=qpdf TC_FILENAME=libqpdf_log.txt python my_pikepdf_script.py
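The same variables can be set from Python when the script is launched as a child process. The following is a minimal sketch, not part of pikepdf; the helper names qpdf_trace_env and run_with_qpdf_trace are hypothetical and only the two environment variables come from the text above.

```python
import os
import subprocess
import sys


def qpdf_trace_env(log_file, base_env=None):
    """Return a copy of the environment with the QPDF tracing variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TC_SCOPE"] = "qpdf"
    env["TC_FILENAME"] = str(log_file)
    return env


def run_with_qpdf_trace(script, log_file):
    """Run a Python script in a child process with QPDF tracing enabled."""
    return subprocess.run(
        [sys.executable, script],
        env=qpdf_trace_env(log_file),
        check=False,
    )
```

For example, run_with_qpdf_trace("my_pikepdf_script.py", "libqpdf_log.txt") mirrors the env command shown above.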
Using gdb to debug C++ and Python
Current versions of gdb can debug Python and C++ code simultaneously. See the Python developer’s guide on gdb Support.
Compiling a debug build of QPDF
It may be helpful to create a debug build of QPDF.
Download QPDF and compile a debug build:
# in QPDF source tree
cd $QPDF_SOURCE_TREE
./configure CFLAGS='-g -O0' CPPFLAGS='-g -O0' CXXFLAGS='-g -O0'
make -j
Compile and link against QPDF source tree
Build pikepdf._qpdf against the version of QPDF above, rather than the
system version:
env QPDF_SOURCE_TREE=<location of QPDF> python setup.py build_ext --inplace
In addition to building against the QPDF source, you’ll need to force your operating system to load the locally compiled version of QPDF instead of the installed version:
# Linux
env LD_LIBRARY_PATH=$QPDF_SOURCE_TREE/libqpdf/build/.libs python ...
# macOS - may require disabling System Integrity Protection
env DYLD_LIBRARY_PATH=$QPDF_SOURCE_TREE/libqpdf/build/.libs python ...
On macOS you can make this choice persistent by changing the library path recorded in pikepdf’s binary extension module:
install_name_tool -change /usr/local/lib/libqpdf*.dylib \
$QPDF_SOURCE_TREE/libqpdf/build/.libs/libqpdf*.dylib \
src/pikepdf/_qpdf.cpython*.so
You can also run Python through a debugger (gdb or lldb) in this manner,
and you will have access to the source code for both pikepdf’s C++ and QPDF.
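It can be worth confirming which copy of libqpdf the process actually loaded. The sketch below assumes Linux, where /proc/self/maps lists the files mapped into the current process; the function name loaded_libraries is hypothetical, and on other platforms it simply returns an empty list.

```python
import os


def loaded_libraries(fragment, maps_path="/proc/self/maps"):
    """Return sorted paths of mapped files whose file name contains `fragment`."""
    if not os.path.exists(maps_path):
        # Not Linux, or /proc is unavailable: nothing to report.
        return []
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            # Fields: address, perms, offset, dev, inode, optional pathname.
            fields = line.split(None, 5)
            if len(fields) == 6:
                path = fields[5].strip()
                if fragment in os.path.basename(path):
                    paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    try:
        import pikepdf  # noqa: F401  (importing loads the extension module)
    except ImportError:
        print("pikepdf is not installed in this environment")
    print(loaded_libraries("libqpdf"))
```

If the printed path points into $QPDF_SOURCE_TREE rather than a system directory, the locally compiled library is the one in use.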
Valgrind
Valgrind may also be helpful - see the Python documentation for information on setting up Python and Valgrind.
Profiling pikepdf
The standard Python profiling tools in cProfile work fine for many
purposes but cannot explore inside pikepdf’s C++ functions.
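As a minimal sketch of the cProfile approach, the following profiles a single callable and returns the report as text. The workload function is a placeholder standing in for code that calls pikepdf, and profile_to_text is a hypothetical helper name.

```python
import cProfile
import io
import pstats


def workload():
    # Placeholder: replace with the pikepdf code to be profiled.
    return sum(i * i for i in range(100_000))


def profile_to_text(func, limit=10):
    """Profile one call to `func` and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_to_text(workload))
```

Calls into the C++ extension appear in such a report only as single entries, with no breakdown of the time spent inside them.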
The py-spy program can effectively profile time spent in Python or executing C++ code and demangle many C++ names to the appropriate symbols.
Happily it also does not require recompiling in any special mode, unless one desires more symbol information than libqpdf or the C++ standard library exports.
For best results, use py-spy to generate speedscope files and use the speedscope application to view them. py-spy’s SVG output is illegible due to long C++ template names as of this writing.
To install and use the profiling software:
# From a virtual environment with pikepdf installed...
# Install
pip install py-spy
npm install -g speedscope # may need sudo to install this
# Run profile on a script that executes some pikepdf code we want to profile
py-spy record --native --format speedscope -o profile.speedscope -- python some_script.py
# View results (this will open a browser window)
speedscope profile.speedscope
To profile pikepdf’s test suite, ensure that you run pytest -n0 to disable
parallel test workers, since py-spy does not trace inside child processes by
default.
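One way to put this together is to build the py-spy command line in Python. This is only a sketch: pyspy_pytest_command is a hypothetical helper, the py-spy flags are the ones shown earlier on this page, and running pytest as "python -m pytest" is an assumption about how the test suite is started.

```python
import sys


def pyspy_pytest_command(output="profile.speedscope", pytest_args=()):
    """Build a py-spy command that profiles pytest in a single process."""
    return [
        "py-spy", "record", "--native", "--format", "speedscope",
        "-o", output, "--",
        sys.executable, "-m", "pytest", "-n0", *pytest_args,
    ]


# Usage (requires py-spy on PATH):
#     import subprocess
#     subprocess.run(pyspy_pytest_command(pytest_args=["tests"]), check=True)
```

The resulting profile.speedscope file can then be opened with speedscope as described above.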