Hi people,
At the top of our Python packaging page (https://fedoraproject.org/wiki/Packaging:Python), there's a note which reads 'In theory /usr/lib/rpm/pythondeps.sh would also automatically generate "Provides" lines'
This is also true for Python modules: distutils and setuptools have a way to specify provides and requires, but those are not reflected in the RPM. This means they must be entered and maintained by hand in the spec file. In my opinion, this is not optimal.
The perl modules have an interesting way of reflecting their dependencies in the rpm: they add a dependency on "perl(Module::Name)". I believe the same system could be applied to Python modules.
Currently, there are two main packaging systems in Python: distutils and setuptools. Distutils declares the dependencies in an egg-info file, which is RFC-822-formatted. Setuptools turns this file into a PKG-INFO file in a subdirectory, and adds a requires.txt file with additional dependencies, in a different format.
The pythondeps.sh script should be able to extract requires from these files. To match the dependencies, pythondeps.sh should create virtual provides like "python(ModuleName) = version".
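As a rough illustration of the Provides side (not the actual code from pythondeps.sh), the sketch below reads the Name: and Version: headers of one PKG-INFO-style file and prints the matching virtual provide. The function name egg_provides is hypothetical, and the sketch assumes the headers sit in the first block of the file, before any blank line.

```shell
# Hypothetical helper, not taken from pythondeps.sh.
# Usage: egg_provides /path/to/PKG-INFO
egg_provides() {
    awk '
        /^$/ { exit }                                   # headers end at the first blank line
        /^Name: /    { name = substr($0, 7) }           # text after "Name: "
        /^Version: / { version = substr($0, 10) }       # text after "Version: "
        END {
            if (name != "" && version != "")
                printf "python(%s) = %s\n", name, version
        }
    ' "$1"
}
```

For a file declaring Name: SQLAlchemy and Version: 0.5.2, this prints python(SQLAlchemy) = 0.5.2.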
The good news is: I've written it already ;-) It's based on the pythondeps.sh script from Git master (which changed a little bit due to python3, see bug 532118). Also, the script does not try to be too smart with versioned dependencies, because the format is a little bit different in Python and in rpm. For those complicated cases, handwritten requirements can still be added to the spec file. The script only covers the usual cases.
For reference, the dependency format in distutils is described here: http://www.python.org/dev/peps/pep-0314/ The dependency format in setuptools is described here: http://peak.telecommunity.com/DevCenter/setuptools#declaring-dependencies
A patch would not make much sense due to the size of the addition, so here's the full script: http://aurelien.bompard.org/projects/divers/pythondeps.sh
I've tested it with quite a few packages, but to make sure I've written a few unit tests (very simple, bash-based): http://aurelien.bompard.org/projects/divers/test-pythondeps.sh
I do believe it would be a valuable addition to RPM (of course, even if accepted shortly, I don't expect it to land in F-13 since it requires recompiling all the Python packages).
Do you think it's a good idea? What about the implementation?
If it looks good to you I'll propose it to rpm.org.
Cheers, Aurélien
In general, +1 to this.
On Sun, Mar 21, 2010 at 11:25:18AM +0100, Aurelien Bompard wrote:
Hi people,
At the top of our Python packaging page (https://fedoraproject.org/wiki/Packaging:Python), there's a note which reads 'In theory /usr/lib/rpm/pythondeps.sh would also automatically generate "Provides" lines'
This is also true for Python modules: distutils and setuptools have a way to specify provides and requires, but those are not reflected in the RPM. This means they must be entered and maintained by hand in the spec file. In my opinion, this is not optimal.
The perl modules have an interesting way of reflecting their dependencies in the rpm: they add a dependency on "perl(Module::Name)". I believe the same system could be applied to Python modules.
Currently, there are two main packaging systems in Python: distutils and setuptools. Distutils declares the dependencies in an egg-info file, which is RFC-822-formatted. Setuptools turns this file into a PKG-INFO file in a subdirectory, and adds a requires.txt file with additional dependencies, in a different format.
Note that in some corner cases, the egg-info might not give you what you need.
For instance, if we were to subpackage pkg_resources separately from setuptools we would only have one upstream egg-info that referenced setuptools, nothing referencing pkg_resources.
Like I say, these are corner cases, though.
The pythondeps.sh script should be able to extract requires from these files. To match the dependencies, pythondeps.sh should create virtual provides like "python(ModuleName) = version".
In describing this, module name is probably not the best word. ProjectName or EggName might be better. (Because there can be multiple modules but they are all described by a single name in the egg-info).
Also, versions are more problematic than the naming corner cases.
* If we backport a bugfix or feature to an old version, the egg-info may require a newer version than we provide even though it would work fine with our version.
* Upstreams frequently put in a minimum version that they have tested with rather than the true minimum version that the package will work with. That means the setup.py file (and thus the egg-info) may say it requires SQLAlchemy-0.5.5 but it really works with the SQLAlchemy-0.5.2 that we ship.
The good news is: I've written it already ;-) It's based on the pythondeps.sh script from Git master (which changed a little bit due to python3, see bug 532118). Also, the script does not try to be too smart with versioned dependencies, because the format is a little bit different in Python and in rpm. For those complicated cases, handwritten requirements can still be added to the spec file. The script only covers the usual cases.
For reference, the dependency format in distutils is described here: http://www.python.org/dev/peps/pep-0314/ The dependency format in setuptools is described here: http://peak.telecommunity.com/DevCenter/setuptools#declaring-dependencies
Note: PEP about changing the version strings in distutils: http://www.python.org/dev/peps/pep-0386/
It's accepted but not yet implemented. Once it is, we will have something that lets us map upstream versions to our version + release format (although we'd probably only use the version portion in autodeps).
Also PEPs that change the metadata format in distutils: http://www.python.org/dev/peps/pep-0376/ http://www.python.org/dev/peps/pep-0390/
A patch would not make much sense due to the size of the addition, so here's the full script: http://aurelien.bompard.org/projects/divers/pythondeps.sh
Some specifics:

    # Handle alpha and rc releases: the version comparator will be rpm, not
    # python, so 1.0rc1 > 1.0. Deal with it by turning 1.0rc1 into 1.0-0.rc1
    # (which is the recommended naming scheme anyway)
    echo $pyver | sed -e 's/([0-9.]+)([a-z].*)/\1-0.\2/g'

Actually, the naming scheme for Fedora would be: 1.0-0.1.rc1, 1.0-0.2.rc1, etc.
If I'm reading the code correctly, these versions just get put in the autodeps, though, so that shouldn't matter. However, what does matter is that this conversion does not preserve the correct ordering of versions:

    0.9      => 0.9
    1.0rc1   => 1.0-0.rc1
    1.0      => 1.0
    1.0post1 => 1.0-0.post1

That would order as 0.9, 1.0, 1.0-0.rc1, 1.0-0.post1
You want something more like this:

    0.9      => 0.9-1
    1.0rc1   => 1.0-0.rc1
    1.0      => 1.0-1
    1.0post1 => 1.0-1.post1

A 0 or 1 is always added to the release: 0 denotes a pre-release, 1 a final or post-release.
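A minimal sketch of that mapping in POSIX shell, purely for illustration (the function name rpm_evr is hypothetical and this is not the code in pythondeps.sh). It assumes the version starts with digits and dots, that any suffix starting with "post" is a post-release, and that every other suffix (rc, a, b, dev, ...) is a pre-release:

```shell
# Hypothetical helper: map an upstream version to an rpm-orderable
# version-release string, following the table above.
rpm_evr() {
    # Leading numeric part, e.g. "1.0" from "1.0rc1"
    base=$(printf '%s\n' "$1" | sed -e 's/^\([0-9.]*[0-9]\).*$/\1/')
    # Whatever follows the numeric part, without a leading dot
    suffix=${1#"$base"}
    suffix=${suffix#.}
    case $suffix in
        '')    printf '%s-1\n' "$base" ;;               # final release
        post*) printf '%s-1.%s\n' "$base" "$suffix" ;;  # post-release
        *)     printf '%s-0.%s\n' "$base" "$suffix" ;;  # assumed pre-release
    esac
}
```

Called as rpm_evr 1.0rc1, it prints 1.0-0.rc1; rpm_evr 1.0 prints 1.0-1. Setuptools-style version strings with free-form tags would need more cases than this sketch handles.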
The next problem is deciding which version strings you're going to target. Current distutils, setuptools, and the distutils PEP have three different algorithms for comparing versions. You can throw out current distutils because no one uses it. Setuptools is something that the Python community is trying to get rid of but it's the current de facto standard. Its version algorithm is a huge mess because it makes a lot of guesses about what you might mean. The distutils PEP is not in a Python release yet but it is much better -- there are a few discrete keywords that mean pre-release or post-release and some methods of combining them.
I've tested it with quite a few packages, but to make sure I've written a few unit tests (very simple, bash-based): http://aurelien.bompard.org/projects/divers/test-pythondeps.sh
I do believe it would be a valuable addition to RPM (of course, even if accepted shortly, I don't expect it to land in F-13 since it requires recompiling all the Python packages).
Do you think it's a good idea? What about the implementation?
Idea for Provides: +1
Idea for Requires: I think the versioning landscape in python is pretty crazy right now. I'd leave off versions in Requires altogether until (hopefully) PEP-386 is implemented and becomes standard. (But including Requires: python(SQLAlchemy) will be an improvement by itself)
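To make the version-less Requires idea concrete, here is a small sketch (again hypothetical, not the script's actual code) that turns a setuptools requires.txt into unversioned python(EggName) requirements. It assumes one requirement per line, treats everything from the first [section] header onward as optional extras and skips it, and keeps only the leading project name of each line:

```shell
# Hypothetical helper, not taken from pythondeps.sh.
# Usage: egg_requires /path/to/requires.txt
egg_requires() {
    awk '
        /^\[/ { exit }    # stop at the first [extras] section
        # Keep only the leading project name, dropping any version specifier
        match($0, /^[A-Za-z0-9._-]+/) {
            printf "python(%s)\n", substr($0, RSTART, RLENGTH)
        }
    ' "$1"
}
```

A line such as SQLAlchemy>=0.5.5 then yields python(SQLAlchemy), which matches a provide of python(SQLAlchemy) = 0.5.2 regardless of the version upstream declared.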
Implementation: The only bug I saw was in how you convert the upstream version to an rpm-orderable version-release string. Beyond that, only the changes needed to remove versions from the Requires generation.
-Toshio
In general, +1 to this.
Thanks for your very detailed reply!
In describing this, module name is probably not the best word. ProjectName or EggName might be better. (Because there can be multiple modules but they are all described by a single name in the egg-info).
Right, I meant EggName (the Name: tag in the egg-info).
Also PEPs that change the metadata format in distutils: http://www.python.org/dev/peps/pep-0376/ http://www.python.org/dev/peps/pep-0390/
Yes, Tarek FTW !! :)
Idea for Provides: +1
OK
Idea for Requires: I think the versioning landscape in python is pretty crazy right now. I'd leave off versions in Requires altogether until (hopefully) PEP-386 is implemented and becomes standard.
Agreed, in light of your explanations, maybe we should leave versions aside for now. We could handle only simple versions (only numbers and dots), but that would not solve the problem of backporting fixes or the ... say, casualness ... with which upstream authors add versioned deps ;-) Actually, I always had in mind to handle the most common cases and leave the corner cases for hand-made Requires, so I wholeheartedly agree.
Thanks for your review. I've updated the script at the previous URLs [1] if you want to have a look at it.
I'm going to open a ticket in RPM's trac instance. If you know of a better way, of course, I'm interested.
Regards,
Aurélien
[1] http://aurelien.bompard.org/projects/divers/pythondeps.sh http://aurelien.bompard.org/projects/divers/test-pythondeps.sh
python-devel@lists.fedoraproject.org