Hello Pythonistas.
I'd like to be able to automatically handle Python "namespace" packages from our packaging macros.
The problem:
Several Python packages share a "namespace", let's take an artificial example with food.spam and food.eggs Python packages.
1. the Python packages both have site-packages/food
2. sometimes such packages also both have site-packages/food/__init__.py (usually empty or mostly empty, but with different mtimes etc.)
On RPM level, this means:
1. %{python3_sitelib}/food can be co-owned OR it can be in an artificial python3-food(-filesystem) package [0] OR it can be in an existing package that is always present [1]
2. %{python3_sitelib}/food/__init__.py and %{python3_sitelib}/food/__pycache__/__init__.*.pyc will conflict if present in multiple packages; they need to be removed or shared from the python3-food(-filesystem) package
I want to solve this once and for all, define the best practice, document it in the packaging guidelines and possibly automate this in %pyproject_save_files [2].
My current idea is:
- sharing directories is safe and easy, let's do that instead of artificial packages (those are hard to automate)
- namespace packages should not need __init__.py with modern Python 3, let's discourage that
- if needed for %check, the __init__.py + .pyc should be %ghosted [3]
And with the %pyproject_save_files automation, let's say that if %pyproject_save_files is used with a dot:
%pyproject_save_files food.spam
The dot separates a namespace and:
- food folder is co-owned
- food/__init__.py + .pyc is %ghosted if found, possibly with a warning
- any other Python files in food/ except spam.py or spam/ are not included
In case of nested namespaces (I have never seen that in reality), this can be applied recursively.
Since %pyproject_save_files takes globs, I propose we split the argument on dot and treat each part as a separate glob.
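To make the dotted form concrete, here is a minimal Python sketch of the classification described above. It is not the actual %pyproject_save_files code: the function name classify_dotted, its helpers, the return shape and the use of fnmatch for the per-part globs are all assumptions made for illustration. Paths are taken relative to %{python3_sitelib}, and nested namespaces are handled by matching one glob per directory level.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def _is_namespace_init(rest):
    """True for __init__.py or its __pycache__/__init__.*.pyc."""
    if rest == ("__init__.py",):
        return True
    return (len(rest) == 2 and rest[0] == "__pycache__"
            and fnmatchcase(rest[1], "__init__.*.pyc"))


def _matches_leaf(rest, leaf):
    """True for leaf.py, anything under leaf/, or __pycache__/leaf.*.pyc."""
    if rest[0] == "__pycache__" and len(rest) == 2:
        return fnmatchcase(rest[1], leaf + ".*.pyc")
    return fnmatchcase(rest[0], leaf) or fnmatchcase(rest[0], leaf + ".py")


def classify_dotted(arg, paths):
    """Hypothetical helper: split 'food.spam' on dots, one glob per part.

    Returns (co_owned_dirs, ghosted_files, included_files); every other
    path is simply left out.
    """
    *namespaces, leaf = arg.split(".")
    co_owned, ghosted, included = set(), [], []
    for path in paths:
        parts = PurePosixPath(path).parts
        if not parts:
            continue
        # Count how many leading directories match the namespace globs.
        level = 0
        while (level < len(namespaces) and level < len(parts) - 1
               and fnmatchcase(parts[level], namespaces[level])):
            level += 1
        # Each matched namespace directory is co-owned.
        co_owned.update("/".join(parts[:i]) for i in range(1, level + 1))
        rest = parts[level:]
        if level > 0 and _is_namespace_init(rest):
            ghosted.append(path)  # would become a %ghost entry
        elif level == len(namespaces) and _matches_leaf(rest, leaf):
            included.append(path)
    return sorted(co_owned), ghosted, included
```

With classify_dotted("food.spam", ...), food/eggs/ and food/other.py fall through both branches and are not listed, which is the "not included" behaviour from the last bullet.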
An alternate proposal, which is less magical and more explicit about the "namespace" situation but less explicit about what to include, requires a special namespace flag:
%pyproject_save_files -N food
This says: Include food subpackages, but food is a namespace package:
- food folder is co-owned
- food/__init__.py + .pyc is %ghosted if found, possibly with a warning
- all other Python files in food/ are included
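A rough Python sketch of what the -N variant would select, for comparison. As before, this is only an illustration under assumptions: classify_namespace is a hypothetical name, paths are relative to %{python3_sitelib}, and the real macro implementation may look quite different.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def classify_namespace(namespace, paths):
    """Hypothetical helper for '-N food': everything under the namespace
    is included, except its own __init__.py and the matching .pyc.

    Returns (co_owned_dirs, ghosted_files, included_files).
    """
    co_owned, ghosted, included = set(), [], []
    for path in paths:
        parts = PurePosixPath(path).parts
        # Only files inside a directory matching the namespace glob count.
        if len(parts) < 2 or not fnmatchcase(parts[0], namespace):
            continue
        co_owned.add(parts[0])
        rest = parts[1:]
        is_init = rest == ("__init__.py",) or (
            len(rest) == 2 and rest[0] == "__pycache__"
            and fnmatchcase(rest[1], "__init__.*.pyc"))
        if is_init:
            ghosted.append(path)  # would become a %ghost entry
        else:
            included.append(path)
    return sorted(co_owned), ghosted, included
```

The only difference from the dotted form is the last branch: nothing under food/ is filtered by name, so the packager no longer states which subpackage is expected.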
Alternatively, this can be combined together somehow:
%pyproject_save_files -N food spam
But I don't like that.
Thoughts?
[0] https://src.fedoraproject.org/rpms/python-jaraco-packaging/blob/rawhide/f/py...
[1] https://src.fedoraproject.org/rpms/python-sphinx/blob/rawhide/f/python-sphin...
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1935266
[3] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
On Wed, Apr 14, 2021 at 5:18 AM Miro Hrončok mhroncok@redhat.com wrote:
My current idea is:
- sharing directories is safe and easy, let's do that instead of artificial packages (those are hard to automate)
- namespace packages should not need __init__.py with modern Python 3, let's discourage that
This, unfortunately, needs to be done upstream, at least some of the time. There are three different ways to do namespace packages in Python. The modern Python 3 version does not require __init__.py files, but the other two predate Python 3.3, which added namespace packages to the core interpreter: one is implemented via pkgutil from the Python stdlib and the other via a setuptools feature. Both have logic in the __init__.py to turn the directory into a namespace. The Python 3.3+ and pkgutil methods of namespace packaging are largely compatible (enough so that I think your idea to convert pkgutil-based packages to Python 3.3+ versions will work), but the setuptools version is incompatible. [1]_
The problem with converting the setuptools-based Python modules that we package to the modern Python 3 style occurs when a user installs a different package in the same namespace from upstream. The user then has two packages which implement the namespace in incompatible ways. My testing shows that this will result in all of our packages failing to be found by Python. [2]_
You could modify your proposal to deal with setuptools-based namespaces in a different manner than the other two kinds. This might cause more mistakes (as packagers will have to figure out whether they are in the special case of a setuptools-based namespace), but it does simplify packaging in the other two cases.
.. [1]_: https://packaging.python.org/guides/packaging-namespace-packages/#creating-a...
.. [2]_: Here's the procedure to test compatibility:
    mkdir -p site-3.3/food/spam site-pkgutil/food/eggs site-setuptools/food/potato
    # Python 3.3+ native namespace: no food/__init__.py
    echo "print('spam')" > site-3.3/food/spam/__init__.py
    # pkgutil-style namespace
    echo "__path__ = __import__('pkgutil').extend_path(__path__, __name__)" > site-pkgutil/food/__init__.py
    echo "print('eggs')" > site-pkgutil/food/eggs/__init__.py
    # setuptools (pkg_resources) style namespace
    echo "__import__('pkg_resources').declare_namespace(__name__)" > site-setuptools/food/__init__.py
    echo "print('potato')" > site-setuptools/food/potato/__init__.py
    # These are both compatible
    PYTHONPATH=site-3.3:site-pkgutil python3 -c 'import food.spam, food.eggs'
    PYTHONPATH=site-pkgutil:site-3.3 python3 -c 'import food.spam, food.eggs'
    # The setuptools namespace makes it so Python does not register the 3.3-style namespace at all
    PYTHONPATH=site-3.3:site-setuptools python3 -c 'import food.spam, food.potato'
    PYTHONPATH=site-setuptools:site-3.3 python3 -c 'import food.spam, food.potato'
    # The setuptools namespace causes the pkgutil namespace to silently fail
    PYTHONPATH=site-setuptools:site-pkgutil python3 -c 'import food.eggs, food.potato'
    PYTHONPATH=site-pkgutil:site-setuptools python3 -c 'import food.eggs, food.potato'
-Toshio
On 14. 04. 21 15:55, Toshio Kuratomi wrote:
Thanks for the additional data, Toshio!
My idea was that if we %ghost the __init__.py, it won't be installed by the RPM package at all and essentially the entire pkg_resources/pkgutil thing will be removed.
However, I had not realized that when users pip-install another namespace package like this to a different location on sys.path, it will blow up :(
I think we can special-case the pkg_resources one and make sure the automation in %pyproject_save_files fails if it encounters the pkg_resources import in the to-be-ghosted __init__.py.
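As a rough illustration of that special case, the check could look something like the following Python sketch. The function names uses_pkg_resources and check_ghostable are hypothetical, and parsing the file with ast is just one possible approach (a plain text search would also work); none of this is existing %pyproject_save_files code.

```python
import ast


def uses_pkg_resources(source):
    """Return True if the source imports pkg_resources in any common way."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import pkg_resources / import pkg_resources.extern
            if any(alias.name.split(".")[0] == "pkg_resources"
                   for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # from pkg_resources import declare_namespace
            if (node.level == 0 and node.module
                    and node.module.split(".")[0] == "pkg_resources"):
                return True
        elif isinstance(node, ast.Call):
            # __import__('pkg_resources').declare_namespace(__name__)
            func = node.func
            if (isinstance(func, ast.Name) and func.id == "__import__"
                    and node.args
                    and isinstance(node.args[0], ast.Constant)
                    and node.args[0].value == "pkg_resources"):
                return True
    return False


def check_ghostable(path, source):
    """Fail instead of ghosting a pkg_resources-style namespace __init__.py."""
    if uses_pkg_resources(source):
        raise ValueError(
            f"{path}: pkg_resources-style namespace, refusing to %ghost it")
```

A pkgutil-style __init__.py passes the check and can still be ghosted, while the pkg_resources one stops the build so the packager has to deal with it explicitly.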
packaging@lists.fedoraproject.org