Shout out to my fellow Flocker, Matt...
-------- Original Message --------
Subject: RFC: Proposal for a more agile "Fedora.next" (draft of my
Flock talk)
Date: Mon, 22 Jul 2013 09:38:54 -0400
From: Matthew Miller <mattdm@fedoraproject.org>
Reply-To: Development discussions related to Fedora <devel@lists.fedoraproject.org>
To: Fedora Development List <devel@lists.fedoraproject.org>
<snip>
Obviously, no-bundled-libs is a crucial part of the packaging guidelines
today. As a sysadmin, I know why it's important. This is not just a noble
goal, but also something that pragmatically makes systems better. But, it's
also keeping us from having software that people really use in Fedora. Chef
and Hadoop are two big examples. This hurts us more than it helps the world.
So, in some areas, we need a different approach.
The Big Data SIG is trying to bring Hadoop 2.x into Fedora for F20,
and I'll be sharing our insights on this at Flock in a couple of
weeks. In Matt's conceptual architecture I suppose Hadoop Common
would live in the Ring 2-to-3 orbit somewhere. It is a core in its
own right (it provides a distributed, replicated file system) in
that there is an ever-growing software ecosystem that has emerged
around it, and the SIG would like Fedora to be the OS of choice for
that ecosystem: stable enough for deployment, but also a feature-rich,
current and productive environment for the developers in that
evolving ecosystem. The Hadoop runtime is an orchestration of
JVM-based daemons which can be viewed as system-level services, thus
an obvious candidate for well-defined integration with Fedora via
packaging: correct permissions, systemd scripts, logs, etc.
However, the root of that core is a set of older and deprecated Java
dependencies (e.g., Jetty 6, Tomcat 5.5) which are expressed via the
Apache Maven build tool. The "quick and dirty" label that another
poster applied to a VERY popular build tool like Maven does it a
disservice. The fact is that it is exceedingly popular in the Java
development community and has been for some time. Anyway, the
challenge for this project is reconciling its stable dependencies
with the ever-changing bleeding edge that is typically found in the
latest Fedora release. A lot of our effort so far has gone into the
various API and build specification changes necessary to try to make
Hadoop fit into Fedora.
So far, so good...sort of. We can make the basic use case and tests
work with the modified dependencies, but in doing so we risk giving
up parity with the Apache baseline (including the JRE) and
potentially losing out to other so-called "dirty RPMs". Ideally, we
wouldn't be forced into some of these adaptations and compromises if
there were Fedora packaging alternatives that would give us (a SIG
ring?) more control over the bundles needed by Hadoop as opposed to
the ones mandated by the latest Fedora release. Make no mistake:
patches are fed from the SIG to the Hadoop community to try to bump
the versions there. But the upstream project can't and won't chase
an ever-vanishing point in the distance. They view their lower-level
dependencies much like a stable OS such as RHEL, where any change
should be deliberate.
I feel like Matt has at least kick-started the discussion around how
Fedora could evolve to support orthogonal dependency models that
more readily adapt to external projects like Hadoop. Not that our
SIG has any profound answers. :-)
Thus, we are very interested in any packaging architecture
proposals that could help relieve our initiative's pain points, and
look forward to further constructive discussion of the same.
My $0.02,
\Pete
--
Peter MacKinnon
MRG Grid/Big Data
Red Hat Inc.
Raleigh, NC