The Packaging Guidelines require that all binary programs and libraries be built from source code. How should this requirement be interpreted when some of the "source code" is itself automatically generated from other sources?
GTKada is an Ada binding to GTK+. In the latest version, some of the Ada files in the source tarball have been generated by a program that is included in the tarball. Future versions will have even more generated code. The input to the code generator is a GIR file, which as far as I understand is some kind of XML representation of the GTK+ API. The GIR file has in turn been generated from the C source code of GTK+. The GIR file is included in the GTKada tarball, but the GTK+ source is not.
Now I'm trying to figure out whether I can build the GTKada package from the distributed generated Ada code, or whether I have to run the code generation as a part of the build, possibly using the GIR file from the GTK+ package instead of the one in the GTKada tarball.
There are two reasons for the requirement listed in the guidelines:
· "Security: Pre-packaged program binaries and program libraries not built from the source code could contain parts that are malicious, dangerous, or just broken. Also, these are functionally impossible to patch." The generated Ada code is nicely formatted and legible, and no harder to review than hand-written source code. It would be possible to patch it, although such a patch would of course not be upstreamable.
· "Compiler Flags: Pre-packaged program binaries and program libraries not built from the source code were probably not compiled with standard Fedora compiler flags for security and optimization." This obviously doesn't apply to generated code that hasn't yet been through a compiler.
Thus, none of the stated reasons seem to be relevant to this case, and I can see only one thing that could mean that I have to run the code generation as a part of the build, namely the term "source code". My question is: Is it required that all the steps in the process from the actual source code to binary code take place on Fedora's build servers, or is it sufficient that binaries are built from human-readable code even if that code isn't the actual source code?
In other words: Should I read "source code" literally, as "the ultimate source code written by human programmers", or is it OK, for the purpose of this requirement, to read it as "human-readable code in a textual programming language"?
Björn Persson
Björn Persson bjorn@xn--rombobjrn-67a.se writes:
> The Packaging Guidelines require that all binary programs and libraries be built from source code. How should this requirement be interpreted when some of the "source code" is itself automatically generated from other sources?
> [ details snipped ]
> Thus, none of the stated reasons seem to be relevant to this case, and I can see only one thing that could mean that I have to run the code generation as a part of the build, namely the term "source code".
You are overlooking one good reason for running the code generator during package build: it ensures that what you compile actually matches the sources it's claimed to be generated from. I've seen more than a few cases where allegedly-automatically-built derived files shipped in an upstream tarball were not up to date.
Now, whether it's worth doing that during package build is a tradeoff. You have to think about what the odds are that this particular upstream could screw up in that fashion; depending on how much you know about their tarball creation and testing process, you might legitimately conclude that the odds of this scenario are too small to worry about. (Or you might be able to convince yourself that if the files *were* out of sync, you'd get a compile failure; this seems possibly relevant here, depending on how tightly tied these files are to the GTK+ API.) And you have to consider how much time it adds to the package build and whether the code generator's own needs will materially bloat the package's BuildRequires footprint. These costs are probably substantial, else upstream would not have chosen to ship derived files in the first place. It might be worth it, or it might not.
Anyway, this is just to point out that regenerating derived files does sometimes have practical value, quite independent of how narrowly somebody wants to read the "build from source" policy.
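The check described above, confirming that the derived files shipped in a tarball match what the generator actually produces, could be sketched roughly as follows. This is only an illustration under assumptions: it presumes the generator has already been run into a scratch directory by some means not shown here, and the function name `find_stale_files` and both directory arguments are hypothetical, not part of GTKada or any Fedora tooling.

```python
import filecmp
import os


def find_stale_files(shipped_dir, regenerated_dir):
    """List regenerated files that are missing from, or differ from, the shipped tree.

    Both arguments are hypothetical paths: shipped_dir holds the derived files
    from the upstream tarball, regenerated_dir holds fresh generator output.
    """
    stale = []
    for root, _dirs, files in os.walk(regenerated_dir):
        for name in files:
            fresh = os.path.join(root, name)
            rel = os.path.relpath(fresh, regenerated_dir)
            shipped = os.path.join(shipped_dir, rel)
            # shallow=False forces a byte-for-byte comparison instead of
            # trusting file size and modification time.
            if not os.path.isfile(shipped) or not filecmp.cmp(shipped, fresh, shallow=False):
                stale.append(rel)
    return sorted(stale)
```

A non-empty result would mean the shipped files are out of date with respect to the generator input that was used. Whether to fail the build on that, or merely log it, is the same tradeoff discussed above.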
regards, tom lane
Tom Lane wrote:
> You are overlooking one good reason for running the code generator during package build: it ensures that what you compile actually matches the sources it's claimed to be generated from. I've seen more than a few cases where allegedly-automatically-built derived files shipped in an upstream tarball were not up to date.
I'm aware of that. I didn't mention it because it wasn't mentioned in the list of reasons for this policy.
On the one hand the generated files might be outdated. On the other hand they might become *too* up to date if I regenerate them. There has always been version skew between GTKada and GTK+ in Fedora, and it will probably remain that way. It has at times been necessary to patch GTKada to get it to build with a newer GTK+. If I take the GIR file from Fedora's GTK+ package and feed that to the code generator, then there will also be version skew between the generated files and the hand-written parts of GTKada, which might be a problem or not depending on (among other things) how stable the GIR file is.
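One rough way to gauge how much the two GIR files differ, before deciding which one to feed to the generator, might look like the sketch below. It assumes that a GIR file is XML in which each `namespace` element has child elements carrying a plain `name` attribute; the function names are hypothetical, and the comparison only looks at which names exist, so a function whose parameters changed would go unnoticed.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def gir_symbols(gir_xml):
    """Collect (kind, name) pairs for the direct children of every <namespace>.

    Assumption: entries of interest are direct children of <namespace>
    and carry an un-namespaced 'name' attribute.
    """
    root = ET.fromstring(gir_xml)
    symbols = set()
    for element in root.iter():
        if _local(element.tag) != "namespace":
            continue
        for child in element:
            name = child.get("name")
            if name is not None:
                symbols.add((_local(child.tag), name))
    return symbols


def gir_skew(bundled_xml, system_xml):
    """Return (only_in_bundled, only_in_system) as sorted lists of (kind, name)."""
    bundled = gir_symbols(bundled_xml)
    system = gir_symbols(system_xml)
    return sorted(bundled - system), sorted(system - bundled)
```

Passing the contents of the GIR file from the GTKada tarball as the first argument and those of the GIR file from Fedora's GTK+ package as the second would give a first impression of how far apart the two are. Two empty lists would not prove that the generated code is identical, only that no top-level names were added or removed.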
If it turns out that I have a choice, then I'll try to figure out which approach gives a lower risk of problems, but first I want to find out whether the policy allows me a choice at all.
Björn Persson
On 07/30/2012 05:44 AM, Björn Persson wrote:
> If it turns out that I have a choice, then I'll try to figure out which approach gives a lower risk of problems, but first I want to find out whether the policy allows me a choice at all.
It is my interpretation here that you do have a choice, but I would strongly encourage you to generate the source code on each build unless there is a specific (and notable) downside to doing so.
~tom
== Fedora Project
packaging@lists.fedoraproject.org