Hello all,
I was hoping for some Python packaging advice as it relates to "porting" to Python 3. I assume this is the right place to ask?
I currently maintain python-pdfminer (https://github.com/euske/pdfminer), which sadly supports only Python 2. Recently I investigated the status of Python 3 support; it seems that pdfminer upstream is uninterested in adding Python 3 compatibility (see https://github.com/euske/pdfminer/pull/71). That pull request was turned into a fork, pushed to PyPI under the name pdfminer.six, and in general it seems to be more actively maintained at the moment than pdfminer.
I investigated packaging pdfminer.six (package name stylized as pdfminer-six). I set python2-pdfminer-six to obsolete pdfminer and made it available through this Copr for testing purposes: https://copr.fedoraproject.org/coprs/tc01/pdfminer.six/
Now, I don't know if anyone is currently using pdfminer in Fedora, and I am very hesitant to just replace a package with a fork.
What's the right thing to do here? Replace pdfminer? Ship python3-pdfminer-six, have it provide python3-pdfminer, and keep using the original package for Python 2? Do nothing, and wait and see what happens upstream?
Thanks for any suggestions in advance, Ben
First, I would suggest checking to see if anything even uses python-pdfminer. I use DNF's repoquery to identify things that use it. Here's an example command you can use to identify if something depends on it:

    sudo dnf repoquery --queryformat "%{sourcerpm}: %{reponame}" --whatrequires "python-pdfminer"
Alternatively, if you want to know the *actual* RPMs that depend on it, you can use "%{name}-%{version}-%{release}.%{arch}" instead of "%{sourcerpm}". I find the source RPMs are more useful to see, though it depends on your case.
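For anyone who wants to script this check, here is a minimal sketch in Python that wraps the repoquery call quoted above and groups the results by source RPM. The function names are hypothetical, and the parser assumes each useful output line has the form "<sourcerpm>: <reponame>" with the source RPM name ending in ".rpm"; anything else is skipped.

```python
import subprocess
from collections import defaultdict


def build_repoquery_cmd(package, queryformat="%{sourcerpm}: %{reponame}"):
    """Return the argv list for the repoquery call (same flags as quoted above)."""
    return ["dnf", "repoquery", "--queryformat", queryformat,
            "--whatrequires", package]


def group_by_source(output):
    """Group 'sourcerpm: reponame' lines into {sourcerpm: sorted repo names}."""
    repos = defaultdict(set)
    for line in output.splitlines():
        source, sep, repo = line.strip().partition(": ")
        # Assumption: real result lines name a source RPM ending in ".rpm";
        # blank lines and status messages are ignored.
        if not sep or not source.endswith(".rpm"):
            continue
        repos[source].add(repo)
    return {source: sorted(names) for source, names in repos.items()}


def find_dependents(package):
    """Run repoquery (requires dnf on the host) and return grouped dependents."""
    result = subprocess.run(build_repoquery_cmd(package),
                            capture_output=True, text=True, check=True)
    return group_by_source(result.stdout)
```

Calling find_dependents("python-pdfminer") on a Fedora host should return an empty dict when nothing requires the package.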
In any case, it looks like nothing depends on python-pdfminer, based on my repoquery. If the API is compatible and it uses the same module namespace, it sounds like it should be fine to replace it and have the appropriate Obsoletes/Provides set up.
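One rough way to check the "same module namespace, compatible API" condition is to compare the public names two module objects expose. The sketch below is only a first-pass check under stated assumptions: the helper names are hypothetical, the demo uses stand-in modules with made-up attributes rather than the real pdfminer or pdfminer.six, and matching names say nothing about matching behaviour or signatures.

```python
import types


def public_names(module):
    """Return a module's exported names: __all__ if set, else non-underscore names."""
    exported = getattr(module, "__all__", None)
    if exported is not None:
        return set(exported)
    return {name for name in vars(module) if not name.startswith("_")}


def api_diff(old, new):
    """Return (missing, added): names lost and gained when `new` replaces `old`."""
    old_names, new_names = public_names(old), public_names(new)
    return sorted(old_names - new_names), sorted(new_names - old_names)


def same_namespace(old, new):
    """True if both modules are imported under the same top-level name."""
    return old.__name__ == new.__name__


# Stand-in modules for illustration only; the attribute names are invented.
old = types.ModuleType("pdfminer")
old.parse = old.render = old.legacy_helper = object()

new = types.ModuleType("pdfminer")
new.parse = new.render = new.py3_helper = object()

missing, added = api_diff(old, new)
print("same namespace:", same_namespace(old, new))
print("missing in replacement:", missing)
print("added in replacement:", added)
```

An empty "missing" list is the minimum a drop-in replacement would need; in practice the two real packages would have to be imported in separate environments, since they install under the same module name.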
python-devel mailing list python-devel@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/python-devel@lists.fedoraproject....
On Sun, Dec 20, 2015 at 01:29:10PM -0500, Ben Rosser wrote:
What's the right thing to do here? Replace pdfminer? Ship python3-pdfminer-six, have it provide python3-pdfminer, and keep using the original package for Python 2? Do nothing, and wait and see what happens upstream?
It's a tough thing to decide as there are many unanswerable questions -- will the pdfminer upstream change their mind about having dual py2 and py3 compat? Will pdfminer.six development tail off and die? You can't predict the future so anything has some risk of backing the wrong horse.
Luckily, if circumstances change with the upstreams we can always change our minds about what the right thing to do is. If you are worried about that, I think implementing a plan that makes it as easy as possible to change which fork we're packaging and promoting later is optimal. So keeping the python-pdfminer package name but packaging the source for pdfminer-six seems to make sense to me. That way, as long as they stay API compatible, packagers of dependent packages don't have to do anything if you have to switch from one upstream fork to another.
-Toshio
On 21 December 2015 at 15:19, Toshio Kuratomi a.badger@gmail.com wrote:
+1 from me for this approach - since you're tackling this as the current maintainer of python-pdfminer, it's your call as to which upstream project that distro package actually represents.
With the pdfminer-six upstream package aiming to be "like PDF miner, only with Python 3 support", originally based on a pull request against the pdfminer code, and everything licensed under MIT/X, it makes a lot more sense to go down that path than it does to either:
* carry the Python 3 support as a downstream patch; or
* expose the upstream complexity to downstream users
As Toshio notes, it's also possible that if the pdfminer.six fork sees sufficient interest, the original pdfminer maintainer may reconsider their willingness to accept the additional code complexity back into the main project.
Cheers, Nick.
On Mon, Dec 21, 2015 at 6:33 AM, Nick Coghlan ncoghlan@gmail.com wrote:
Makes sense to me. I guess I will take that route, then, and push an update to the current pdfminer package for Rawhide that switches over to using the pdfminer-six sources.
Thanks for the feedback, everyone! Ben Rosser