On Thu, 2018-03-15 at 00:45 -0700, Cary Coutant wrote:
> > To inject explicit out-of-band data into the hash, you
> > could insert an object with nothing but a note section, or even use
> > --defsym to create a symbol table entry with your extra key(s).
> Fedora wants to insert extra data into the build-id of all packages in
> its repository, and it does so right now by post-processing the
> package. This is ugly, and it flat-out doesn't work for the kernel,
> which is apparently breaking a legitimate debugging use case.
> I think Fedora should be able to ask its tool chain to insert the
> extra data rather than hacking it in after the fact. Asking Fedora to
> use --defsym for this purpose is IMO a non-starter, as is asking
> Fedora to come up with some magic .o file and linking it into every
I understand your objection to the magic .o file, but why exactly is
--defsym a "non-starter"? It's pretty close to what you're asking for
(except for the spelling), it's already available, and it has the
advantage of adding the extra data to the binary in a form where it
can be easily extracted (e.g., with "nm").
I don't understand the need for forcing two otherwise-identical
binaries to have different build ids, simply because they're part of
different distributions. Perhaps it would help if you could explain
why you need that.
In theory two different build environments could produce identical
binaries. Even different source files could, if they express the same
algorithm and the compiler optimizes them the same way. But you might
still want to identify which came from which build. The build-id is
used to identify the build (environment) that produced the binary.
If the build environment is identical then in theory it should produce
identical binaries and build-ids. But the converse doesn't necessarily
hold: identical binaries need not come from identical build
environments. For a distro it is nice to have unique build-ids for
different package version builds. Then if you just have e.g. a core
file with a build-id in it, you can map it back to the package build
it came from.
The linker doesn't see the whole build environment, but it often has
enough to differentiate if there is debuginfo involved, which contains
the source file paths, compiler versions, command line options, etc. I
think in the case Andy is interested in, the vdso just isn't "unique"
enough (it contains mostly assembly without debuginfo). The vdso is
also slightly special because it is built and then "inserted" into the
kernel image (so it can be mapped into the process space at runtime),
which makes it difficult to "post-process".
If using new command line flags is out of the question, could the
linker use an environment variable as a seed for the build-id hash
computation? Then a package build could just set e.g.
BUILD_ID_ENV="<package-version>". That could also be used to capture
anything else from the build environment that might make a build
unique.
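
As a rough sketch of the idea (in Python, purely for illustration): this
is not how ld or gold compute the build-id today. BUILD_ID_ENV is only
the variable name suggested above, the build_id() helper is
hypothetical, and the package version strings are made up. The point is
just that the seed is hashed ahead of the image contents, so identical
images built for different packages end up with different ids.

```python
import hashlib
import os


def build_id(image: bytes, env=None) -> str:
    """Return a SHA-1 hex digest of the image, seeded with BUILD_ID_ENV.

    Hypothetical helper: it only illustrates mixing an environment
    variable into the hash and does not mirror any real linker code.
    """
    env = os.environ if env is None else env
    h = hashlib.sha1()
    seed = env.get("BUILD_ID_ENV", "")
    if seed:
        data = seed.encode("utf-8")
        # Length-prefix the seed so seed bytes and image bytes cannot
        # be confused with each other.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    h.update(image)
    return h.hexdigest()


# Stand-in for the bytes the linker would hash (e.g. a vdso image).
image = b"\x7fELF identical vdso contents"

# Same image, no seed: a plain hash of the contents.
id_plain = build_id(image, env={})

# Same image, two made-up package versions as seeds.
id_a = build_id(image, env={"BUILD_ID_ENV": "example-pkg-1.0-1"})
id_b = build_id(image, env={"BUILD_ID_ENV": "example-pkg-1.0-2"})

print(id_plain)
print(id_a)
print(id_b)
```

With no seed set the result is the plain hash of the contents, so builds
that don't opt in behave exactly as before.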