On Thu, Dec 1, 2016 at 9:58 AM, Don Zickus <dzickus(a)redhat.com> wrote:
On Thu, Dec 01, 2016 at 07:53:06AM -0600, Justin Forbes wrote:
> On Wed, Nov 30, 2016 at 8:03 PM, Don Zickus <dzickus(a)redhat.com> wrote:
> > On Wed, Nov 30, 2016 at 04:25:30PM -0800, Laura Abbott wrote:
> > > > I don't think it would be a bad idea to enable in rawhide and see
> > how it works out; from there it will trickle down as the stable releases get
> > rebased. While turning it on in theory shouldn't create a problem, I
> > honestly don't get warm fuzzies making such a change without at least some
> > time testing in rawhide. We are just a week or two away from 4.9 final now,
> > so it isn't a huge delay. The changes being proposed upstream are not even
> > in next yet, so it has some time to be shaken out before it would ever see
> > a stable release, though the feature would need to be enabled in rawhide
> > for testing as that happens.
> > > >
> > > > Justin
> > > >
> > > >
> > >
> > > I'm not opposed to turning it on but I'm a little bit wary
> > > of this causing unexpected problems for users. From
> > > experience, how likely is it that a module passes the version
> > > checks but something else has changed such that it no longer
> > > works? Even if we can't officially support 3rd party modules,
> > > I'd like to not make it too much worse within reason.
> > Hi Laura,
> > Thanks for the feedback!
> > That can always be the case, static inlines for example. But RHEL has been
> > relying on this since RHEL-5 with many 3rd party drivers. Various fixes
> > have gone in to the genksyms tool to make this interface fairly reliable.
> RHEL relying on this without major issues is very different than Fedora
> relying on it. Fedora 23, which will EOL this month, released with a 4.2
> kernel and is currently using 4.8.10.
In that scenario, I would fully expect lots of symbols to break after each
major kernel version release. As a result a driver would fail to load and
would need to be rebuilt. No different than today.
I don't expect Fedora to change any process or policy here. I was just
trying to point out that the MODVERSIONS technology works well (despite the
upstream thread which broke things when enabling EXPORT_SYMBOL in asm
files due to a bad binutils version). :-)
Based on the upstream thread, it seems there is widespread frustration with
guaranteeing correct module load with different kernel versions.
MODVERSIONS is pretty good today, but folks want better. Red Hat would like
to help promote better technology here as kabi is something of a value add.
It is easier to do that if we can sync up some of RHEL's process with Fedora
to aid in flushing out issues. That is really my main motivation here.
We can then use RH's internal testing and tooling to help verify things.
I believe it should have zero impact on Fedora. The upstream discussion is
now resolved for 4.9 if I understand things correctly. But feel free to
wait until 4.9 is actually released to make sure MODVERSIONS is no longer
depending on BROKEN.
I think Linus removed the dependency on BROKEN, but there are still
some unresolved issues that make MODVERSIONS not functional. Until
those get cleared up, I'm not sure it's safe to enable in 4.9.
Definitely something to keep an eye on as the release gets closer.
In case you want to understand the technology a little bit better: in
Fedora, module loading is based on a version string check. If the module's
version string matches the kernel's string, it will load. No other sanity
check. So if a kernel is modified but the version string isn't updated, bad
things may happen.
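To make that failure mode concrete, here is a minimal Python sketch of a
string-only check. It is a toy model, not the kernel's loader: the Kernel and
Module classes, the layout dictionaries, and the version string are all
hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    version_string: str
    # Hypothetical stand-in for the struct layouts behind exported symbols.
    struct_layouts: dict


@dataclass
class Module:
    built_for: str
    # The layouts the module was compiled against.
    assumed_layouts: dict


def loads_by_version_string(module: Module, kernel: Kernel) -> bool:
    """String-only check: nothing about the actual interfaces is compared."""
    return module.built_for == kernel.version_string


# A kernel whose struct changed but whose version string was not bumped.
patched = Kernel("4.8.10-example", {"foo_dev": ["long id", "char name"]})
driver = Module("4.8.10-example", {"foo_dev": ["int id", "char name"]})

accepted = loads_by_version_string(driver, patched)
layouts_agree = driver.assumed_layouts == patched.struct_layouts
```

Here the module is accepted even though the layouts it was built against no
longer agree with the running kernel.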
MODVERSIONS takes a checksum of the _whole_ path of an EXPORT_SYMBOL and its
parameters, recursively diving through structs until it gets to the very
basic types of int, char, void, etc. Sometimes 100 levels deep. This is
done by preprocessing the .c file into a .i first. Once that extremely long
string is created, it is checksummed and stored in the .ko as the CRC.
Any out-of-tree module compiled has to go through the same steps and uses
the CRC symbols from Module.symvers as its dependency.
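A rough Python sketch of the idea, under loud assumptions: the struct table,
the expand() and symbol_crc() helpers, and the foo_register symbol are
hypothetical, zlib.crc32 is only a stand-in for whatever checksum genksyms
computes, and the sketch ignores pointers and self-referential structs, which
the real tool has to handle.

```python
import zlib

BASIC_TYPES = {"int", "char", "void", "long"}


def expand(type_name: str, structs: dict) -> str:
    """Recursively expand a type until only basic types remain."""
    if type_name in BASIC_TYPES:
        return type_name
    # Assumes no cyclic struct references; a real tool must handle those.
    body = " ".join(
        f"{expand(field_type, structs)} {field_name};"
        for field_type, field_name in structs[type_name]
    )
    return f"struct {type_name} {{ {body} }}"


def symbol_crc(name: str, return_type: str, params: list, structs: dict) -> int:
    """Checksum the fully expanded signature of one exported symbol."""
    expanded_params = ", ".join(expand(p, structs) for p in params)
    signature = f"{expand(return_type, structs)} {name}({expanded_params})"
    return zlib.crc32(signature.encode())


# Hypothetical kernel types: foo_dev embeds foo_stats.
old_structs = {
    "foo_stats": [("int", "rx"), ("int", "tx")],
    "foo_dev": [("int", "id"), ("foo_stats", "stats")],
}
# Same types, except one nested field changed from int to long.
new_structs = {
    "foo_stats": [("long", "rx"), ("int", "tx")],
    "foo_dev": [("int", "id"), ("foo_stats", "stats")],
}

old_crc = symbol_crc("foo_register", "int", ["foo_dev"], old_structs)
new_crc = symbol_crc("foo_register", "int", ["foo_dev"], new_structs)
```

The point of the sketch is that a change two levels down, in a struct the
symbol never names directly, still changes the checksum of the symbol.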
Upon module loading, if the CRCs don't match, the module is blocked. If they
do match, it implies the structs, the offsets, the variable names,
everything has not changed for that EXPORT_SYMBOL (ignoring code use of
those struct elements). That is a pretty decent sanity check that the
driver should work on that kernel version. Nothing is 100%, I get that, but
with our experience on RHEL (and all the hairy rebased subsystems we do), it
has worked pretty well.
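The load-time decision can be sketched the same way. Again this is a toy model
and not the kernel's code: mismatched_symbols(), the symbol names, and the CRC
values are all hypothetical.

```python
def mismatched_symbols(module_crcs: dict, kernel_crcs: dict) -> list:
    """Return the symbols whose CRC in the module differs from the kernel's.

    A symbol the kernel does not export at all also counts as a mismatch.
    """
    return sorted(
        sym for sym, crc in module_crcs.items() if kernel_crcs.get(sym) != crc
    )


def may_load(module_crcs: dict, kernel_crcs: dict) -> bool:
    """The module is blocked as soon as any imported symbol disagrees."""
    return not mismatched_symbols(module_crcs, kernel_crcs)


# Hypothetical CRC tables: what the running kernel exports...
kernel_crcs = {"foo_register": 0x1A2B3C4D, "foo_unregister": 0x55667788}
# ...and what two out-of-tree modules recorded at build time.
good_module = {"foo_register": 0x1A2B3C4D}
stale_module = {"foo_register": 0xDEADBEEF, "foo_unregister": 0x55667788}

good_ok = may_load(good_module, kernel_crcs)
stale_ok = may_load(stale_module, kernel_crcs)
stale_why = mismatched_symbols(stale_module, kernel_crcs)
```

Only the symbols a module actually imports are compared, so a module that uses
a small, stable part of the interface can keep loading across kernels where
other exports have changed.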
If Fedora continues to promote DKMS and akmod, then no one has to worry
about this issue as those drivers will get stuck in extras/ and only be
available to that kernel. :-)
Just to clarify something, Fedora doesn't promote either DKMS or
akmod. In fact, we don't promote anything around 3rd party modules at
all. The packages exist in the repo and that is what people use, but
we don't recommend they do so. It's a minor wording issue, but I
don't want people to get the wrong idea.