(This is a scaled-back version of a proposal I sent to this list a couple of months ago [1])
There are various configuration flags that can be used when building Python.
Currently we have a configuration aimed at the typical use-case: as much optimization as reasonable.
However, upstream Python supports a number of useful debug options which use more RAM and CPU cycles, but make it easier to track down bugs [2]. Typically these are of use to people working on Python C extensions, for example, for tracking down awkward reference-counting mistakes. I've had at least three developers whose opinion I value very highly ask me for these (for example John Palmieri is currently working on the PyGI stack, and is running into difficult reference-counting issues). Indeed, Debian and Ubuntu have had these alternate builds available for a couple of years now [3].
I've looked through Debian's patch [4], and come up with a somewhat modified version that does mostly the same thing, though (I think) somewhat better fitting our build process.
The python.spec now configures, builds, and installs the python sources twice: once with the regular optimized settings, and again with debug settings. (In most cases the files are identical between the two installs; the files that differ get separate paths.)
I've been testing with this on my machine and it works fine; I've also been able to successfully use distutils to build extension modules.
So I've decided to try this in Rawhide for F-14; the latest build is here: http://koji.fedoraproject.org/koji/buildinfo?buildID=174357
The relevant CVS commit is here: http://cvs.fedoraproject.org/viewvc/rpms/python/devel/python.spec?r1=1.184&a... http://cvs.fedoraproject.org/viewvc/rpms/python/devel/python-2.6.5-debug-bui...
and the specfile comment contains more detailed implementation notes.
The builds are set up so that they can share the same .py and .pyc files - they have the same bytecode format.
However, they are incompatible at the machine-code level: the extra debug-checking options change the layout of Python objects in memory, so the configurations have different shared library ABIs. A compiled C extension built for one will not work with the other.
The key to keeping the different module ABIs separate is that a module that is "foo.so" in the standard optimized build is instead "foo_d.so" in the debug build, i.e. it gains a "_d" suffix in the filename, and this is what the debug build's "import" routine will look for. This convention is from the Debian patch, and ultimately comes from the way the Windows build is set up in the upstream build process.
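To make the naming convention concrete, here is a minimal sketch of the mapping from an optimized-build module filename to its debug-build counterpart. The helper name debug_extension_name is hypothetical and is not part of the patch; the real lookup happens inside the interpreter's import machinery.

```python
import os


def debug_extension_name(filename, suffix="_d"):
    """Map an optimized-build module filename to the debug-build name.

    Hypothetical helper for illustration only: it inserts the "_d"
    suffix in front of the file extension, e.g. foo.so -> foo_d.so.
    """
    stem, ext = os.path.splitext(filename)
    return stem + suffix + ext


print(debug_extension_name("foo.so"))
```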
Similarly, the optimized libpython2.6.so.1.0 now has a libpython2.6_d.so.1.0 cousin for the debug build: all of the extension modules are linked against the appropriate libpython, and there's a /usr/include/python2.6-debug directory, parallel with the /usr/include/python2.6 directory. There's a new "sys.pydebug" boolean to distinguish the two configurations, and the distutils module uses this to supply the appropriate header paths and linker flags when building C extension modules.
Finally, the debug build's python binary is /usr/bin/python2.6-debug, hardlinked as /usr/bin/python-debug (as opposed to /usr/bin/python2.6 and /usr/bin/python).
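The way the two configurations map onto different paths can be summarised with a small sketch. This is not the distutils code from the patch: the function build_settings and its dictionary keys are made up for illustration, and "sys.pydebug" only exists in the patched builds described here, so the sketch falls back to False on any other interpreter.

```python
import sys


def build_settings(pydebug, version="2.6"):
    """Return the per-configuration paths described above.

    Hypothetical helper: the names and layout of this dictionary are
    illustrative assumptions, not what the patched distutils uses.
    """
    dir_suffix = "-debug" if pydebug else ""
    lib_suffix = "_d" if pydebug else ""
    return {
        "include_dir": "/usr/include/python%s%s" % (version, dir_suffix),
        "libpython": "libpython%s%s.so.1.0" % (version, lib_suffix),
        "interpreter": "/usr/bin/python%s%s" % (version, dir_suffix),
    }


# sys.pydebug is added by the patch; a stock interpreter does not have it.
is_debug = getattr(sys, "pydebug", False)
print(build_settings(is_debug))
```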
It's easy to spot the debug build: the interactive mode tells you the total reference count of all live Python objects after each command:
[david@surprise devel]$ python-debug
Python 2.6.5 (r265:79063, May 19 2010, 18:20:14)
[GCC 4.4.3 20100422 (Red Hat 4.4.3-18)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print "hello world"
hello world
[28748 refs]
[28748 refs]
[15041 refs]
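The same total is available programmatically as sys.gettotalrefcount(), which only exists in debug builds. As a rough sketch of how that can be used to look for reference leaks (the helper names total_refs and refs_leaked are hypothetical, and the result is noisy, so treat a small non-zero number as a hint rather than proof):

```python
import sys


def total_refs():
    """Return the interpreter-wide reference count, or None if this is
    not a debug build (sys.gettotalrefcount is only defined there)."""
    getter = getattr(sys, "gettotalrefcount", None)
    return getter() if getter is not None else None


def refs_leaked(func, warmup=3):
    """Call func a few times to let caches settle, then report how much
    the total reference count grows across one further call.

    Returns None on a non-debug build.
    """
    for _ in range(warmup):
        func()
    before = total_refs()
    if before is None:
        return None
    func()
    return total_refs() - before


print(refs_leaked(lambda: [].append(1)))
```

Run under the regular optimized interpreter this prints None; run under the debug interpreter it prints the change in the total reference count.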
So the debug build shares _most_ of the files with the regular build (.py/.pyc/.pyo files; directories; support data; documentation); the only differences are the ELF files (binaries/shared libraries), and the infrastructure relating to configuration (Include files, Makefile, python-config => python-debug-config, etc).
I've tested building the "coverage" module against both runtimes, and it works; it installs shared .py/.pyc files and a pair of tracer.so/tracer_d.so files.
I tried a few different ways of packaging the debug configuration. I considered (a) adding it to the python-devel subpackage, (b) adding it to the python-debuginfo subpackage (Debian adds it to their python-dbg packages, which are kind of the equivalent of our -debuginfo rpms), or (c) building out a "debug" subpackage for each of the subpackages within the python specfile, doubling the number of subpackages.
The approach I favor (option (d), I guess) is to have a single "python-debug" subpackage, holding everything to do with the debug configuration: equivalent to all of the subpackages from the regular configuration, and requiring them all (since they leverage the shared .py files, for instance). My reasoning here is that this feature is aimed at advanced Python developers, and if you want some of it you probably want all of it - so just one subpackage for simplicity - but you don't need it for regular builds or debugging, so it seems better to keep it separate from the -devel and -debuginfo subpackages.
This is a scaled-back version of my earlier proposal (in which I proposed entirely parallel stacks, and varying the unicode settings). This is far simpler. In particular, the optimized build should be unaffected: all of the paths and the ELF metadata for the standard build should be unchanged compared to how they were before adding the debug configuration.
I would like to build out some of our compiled extension modules so that we can add -debug subpackages, in an analogous way to the core python package, but I think it should purely be a voluntary thing: I don't want to burden people packaging Python modules with additional work. Having said that, if you do find yourself debugging a nasty reference counting issue inside an extension module, you'll need a debug build of every C extension module that your reproducer script uses, so the more the better. For reference, Ubuntu do this for all of the Python code in a typical GNOME desktop [3]. We should figure out sane RPM conventions for packaging these (sorry: yes, I want to change the python packaging guidelines again, though hopefully less invasively than for the Python 3 change).
I'm tracking all of this work here: https://fedoraproject.org/wiki/DaveMalcolm/DebugPythonStacks
I hope for it to be a Fedora 14 feature. It's debatable whether it should be a feature: this is an area where we're somewhat behind other distributions, so not so good from a marketing perspective - but a good thing to get fixed.
I plan to work next on doing the same for our python3 src.rpm. I need to try to get this upstream in some form as well.
Hope this seems sane - thoughts? (thanks for reading this far; I know this email is too long)
Dave
[1] http://lists.fedoraproject.org/pipermail/python-devel/2010-March/000213.html
[2] http://svn.python.org/projects/python/trunk/Misc/SpecialBuilds.txt
[3] https://wiki.ubuntu.com/PyDbgBuilds
[4] http://patch-tracker.debian.org/patch/series/view/python2.6/2.6.5-2/debug-bu... and http://patch-tracker.debian.org/patch/series/view/python2.6/2.6.5-2/pydebug-...
On Thu, 2010-05-20 at 15:37 -0400, David Malcolm wrote:
[snip]
> I'm tracking all of this work here: https://fedoraproject.org/wiki/DaveMalcolm/DebugPythonStacks
[snip]
> I plan to work next on doing the same for our python3 src.rpm. I need to try to get this upstream in some form as well.
I've done this now for python3; python3-3.1.2-6.fc14 has a python3-debug subpackage. The build is here: http://koji.fedoraproject.org/koji/buildinfo?buildID=174936 and the CVS commit is here: http://cvs.fedoraproject.org/viewvc/rpms/python3/devel/python3.spec?r1=1.21&...
I'm working next on enabling more of the debug flags in the debug builds; you can see status on the under-construction feature page here: https://fedoraproject.org/wiki/DaveMalcolm/DebugPythonStacks
Dave
On Mon, 2010-05-24 at 20:11 -0400, David Malcolm wrote:
> On Thu, 2010-05-20 at 15:37 -0400, David Malcolm wrote:
[snip]
> I'm working next on enabling more of the debug flags in the debug builds; you can see status on the under-construction feature page here: https://fedoraproject.org/wiki/DaveMalcolm/DebugPythonStacks
I've now enabled WITH_TSC, COUNT_ALLOCS, and CALL_PROFILE for both python and python3 (see the wiki page above for the specific builds).
[david@surprise devel]$ python3-debug
Python 3.1.2 (r312:79147, May 25 2010, 12:21:20)
[GCC 4.4.3 20100422 (Red Hat 4.4.3-18)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys ; from pprint import pprint
[32089 refs]
>>> pprint(sys.getcounts())
[('ImportError', 2, 2, 1),
 ('_Helper', 1, 0, 1),
 ('_Printer', 3, 0, 3),
 [snip]
 ('dict', 3671, 3325, 389),
 ('str', 16194, 12860, 3334),
 ('tuple', 13995, 12300, 1740)]
[32101 refs]
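As a sketch of what can be done with this data, the following ranks types by the number of instances still alive. It assumes each tuple is (type name, allocations, frees, maximum in use), which matches the "alloc'd / freed / max in use" wording of the log that COUNT_ALLOCS prints on exit. The helper name live_objects is hypothetical, and the sample rows are copied from the session above so that the code also runs on a build without sys.getcounts().

```python
def live_objects(counts):
    """Rank types by instances still alive (allocations minus frees).

    Assumes each entry is (name, allocs, frees, max_in_use).
    """
    rows = [(name, allocs - frees) for name, allocs, frees, _max in counts]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Sample rows taken from the sys.getcounts() output shown above.
sample = [
    ("ImportError", 2, 2, 1),
    ("_Helper", 1, 0, 1),
    ("_Printer", 3, 0, 3),
    ("dict", 3671, 3325, 389),
    ("str", 16194, 12860, 3334),
    ("tuple", 13995, 12300, 1740),
]

for name, live in live_objects(sample)[:3]:
    print(name, live)
```

On a COUNT_ALLOCS build the sample list could be replaced by the result of sys.getcounts().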
One issue is that COUNT_ALLOCS seems to unconditionally log debug information to stdout on exit:
memoryview alloc'd: 2, freed: 2, max in use: 1
ImportError alloc'd: 2, freed: 2, max in use: 1
_Helper alloc'd: 1, freed: 1, max in use: 1
_Printer alloc'd: 3, freed: 3, max in use: 3
which is likely to break scripts that capture stdout from python scripts.
It seems useful to have sys.getcounts(), so perhaps we should talk with upstream and emit the counts to stderr instead, or make this only happen if an envvar is set, or simply omit it.
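To illustrate the first two options together, here is a Python-level sketch of the behaviour being suggested: the counts go to stderr, and only when an environment variable is set. The real change would have to be made in the interpreter's C shutdown code; the function dump_counts and the variable name EXAMPLE_DUMP_COUNTS are both invented for this example.

```python
import os
import sys


def dump_counts(counts, environ=os.environ, stream=None):
    """Write allocation counts in the exit-log format shown above, but
    only if the (hypothetical) EXAMPLE_DUMP_COUNTS variable is set, and
    to stderr rather than stdout unless another stream is given.

    Returns True if anything was written.
    """
    if not environ.get("EXAMPLE_DUMP_COUNTS"):
        return False
    if stream is None:
        stream = sys.stderr
    for name, allocs, frees, max_in_use in counts:
        stream.write("%s alloc'd: %d, freed: %d, max in use: %d\n"
                     % (name, allocs, frees, max_in_use))
    return True


# With the variable unset (the usual case) nothing is written at all.
dump_counts([("ImportError", 2, 2, 1)])
```

Either half on its own would already stop the log from polluting captured stdout.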
Thoughts?
Dave