On Thu, Aug 24, 2023 at 12:15 PM Richard Fontana <rfontana(a)redhat.com> wrote:
When I look randomly at spec files of Fedora packages, I begin to
suspect that most Fedora package maintainers must have always ignored
this directive and have continued to ignore it after the rule was
recast in the post-July-2022 docs. In *most* cases of packages other
than possibly those coming from ecosystems or historical contexts
featuring highly uncomplicated licensing structures, there will be
some differences in the makeup of binary packages from a built source
code licensing standpoint. I only rarely see attempts to reflect this
via multiple License: fields. While in the scheme of things I only
look at a small sample of Fedora packages I suspect they are
representative.
If you only looked at a small sample, it may not have been
representative. The package reviews I have been involved in recently,
both as submitter and reviewer, have tried to faithfully reflect the
license of the binary package. And, really, a single year isn't
enough time for the new docs to have had a big effect. We have a huge
number of packages in Fedora, so changes to packaging guidelines take
quite a while to propagate throughout the collection. I suspect that
if you narrow your sample to packages that have been reviewed in the
last 12 months, you might get a quite different impression.
I can conclude one of two things:
1. The license of the binary rule is too hard for most Fedora package
maintainers to comply with.
2. Fedora package maintainers are unaware of the rule and are
substituting their own intuition, which I think must be something like
"each RPM should have one License: field that reflects the makeup of
all the binary RPMs without attempting to distinguish among them".
Or, as I suggested above:
3. Fedora packagers are (by and large) aware of the binary rule, but
there's a lot of inertia to overcome in the existing corpus of
packages.
This puzzles and disappoints me since, as I have said, the license of
the binary concept was in my view a major advance in the way people
were thinking about appropriate ways of representing licenses of
packages. If you look into SPDX, for example, SPDX doesn't even have
(as far as I can tell) a sophisticated way of distinguishing between
binary and source licensing. I believe this reflects the source
code-centric and non-packaging-centric world view of many of the
people who got involved with SPDX early on, but that may be unfair.
I don't think you should be disappointed. Give it more time. I think
the license of the binary concept is useful.
I'm deliberately ignoring most of the rest of your comments in this
message because I think they raise some additional topics, because I
want to make sure there is some focus on this one. What do we do about
the "license of the binary" rule? If it is really too hard to comply
with, I think we can only conclude that it has to be replaced with
some other approach. Since I'm not a Fedora package maintainer I do
not have good intuition for what's too hard vs. what's merely annoying
or cumbersome. I know why I find it challenging to figure out what
source files map to a given binary RPM, but I don't really directly
understand why this is hard for a Fedora package maintainer who is
theoretically highly familiar with the code they are packaging and
theoretically has some expertise in the language(s) and build tools at
issue. I just see the evidence suggesting that it is.
There are cases that make it tricky. Florian mentioned C/C++ header
files that contain inline functions, for example. Figuring out which
header files have such definitions, and whether or not a given binary
actually uses any such definition is nontrivial.
As others have suggested, to make this tractable, we really need some
automation that can help us out. What I as a packager would really
like to do when working on package P is:
1. List all of the licenses introduced by P itself
2. List all of the packages that might inject something into the final
binary package
3. Use magic nonexistent tooling to automatically construct the final
list of relevant licenses by starting with (1), and then iterating
over the list of packages in (2) and extracting their licenses.
That will probably produce a list that is too big, as some package in
(2) may only inject a single artifact covered by a single license, but
itself be covered by a longer list of licenses. But it would be a
start. Tracking down transitive licenses is most of the work, in my
experience, and the results can be invalidated at any time by a change
in some other package.
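
To make the three steps above concrete, here is a rough Python sketch of
what such tooling might do. Everything in it is an assumption made for
illustration: the function name, the package names, and the idea of a
ready-made mapping from package name to license identifiers are all
hypothetical, and no existing Fedora tool is implied. It simply takes
the union of P's own licenses with those of every package that might
inject something, so it overestimates in exactly the way described
above.

```python
from typing import Dict, Iterable, List


def candidate_licenses(
    own_licenses: Iterable[str],
    injecting_packages: Iterable[str],
    license_db: Dict[str, List[str]],
) -> List[str]:
    """Return a sorted, de-duplicated upper bound on the licenses of P.

    own_licenses:       step (1), licenses introduced by P itself.
    injecting_packages: step (2), packages that might inject content.
    license_db:         hypothetical mapping of package name -> licenses.
    """
    licenses = set(own_licenses)
    missing = []
    for pkg in injecting_packages:
        if pkg not in license_db:
            # Refuse to guess: an unknown package needs manual review.
            missing.append(pkg)
            continue
        licenses.update(license_db[pkg])
    if missing:
        raise KeyError("no license data for: " + ", ".join(sorted(missing)))
    return sorted(licenses)


# Hypothetical data; the package names are made up for the example.
license_db = {
    "libfoo-devel": ["BSD-3-Clause", "MIT"],
    "bar-static": ["LGPL-2.1-or-later"],
}

result = candidate_licenses(["MIT"], ["libfoo-devel", "bar-static"], license_db)

# Joining with AND is an assumption about how the list would be rendered.
license_tag = " AND ".join(result)
print(license_tag)
```

The sketch raises an error for a package with no license data rather
than silently skipping it, since a missing entry is exactly the case
that needs a human to look. It also does nothing about invalidation:
the result is only as current as the mapping it was given.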
--
Jerry James
http://www.jamezone.org/