On 03/16/2018 11:37 AM, Cary Coutant wrote:
>>> Then I'm stating my case poorly. I want a way to inject additional
>>> data into the hash computation.
>> At one point, we proposed doing this via a linker- or assembler-oriented
>> extra "salt" parameter, which would be hashed into the buildid. This
>> would most naturally be an n-v-r-a string, so the build is reproducible.
>> Such a salt could be naturally injected via an environment variable set
>> by rpmbuild.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1002341 (salt!)
>> https://bugzilla.redhat.com/show_bug.cgi?id=1550152 (life without salt)
> I don't have a problem with adding a --build-id-salt="some arbitrary
> string" option, and I think "salt" is exactly the right term for this.
> I'd much prefer providing that than having you use a linker script.
> (I'm somewhat puzzled that you find the linker script option less
> objectionable than an object file with a note section.)
> As Nick said earlier, it's not that we don't care about your feature
> request. I simply wanted to explore the options, and I gave you a
> couple of options that require no new features at all.
> My other comments have been about the unnecessary conflating of a new
> option like --build-id-salt with the choice of hashing algorithm.
At least in the kernel we already have the infrastructure for
customizations to linker scripts so it's fairly easy to expand on that.
I have a prototype which should work, I just need to clean it up
for review to see if it's feasible to merge vs. adding a --build-id-salt
Normally when a kernel change causes a problem there is a compile error,
but I'm seeing a make problem. Builds work with
kernel-4.16.0-0.rc6.git0.2.fc28.x86_64, but not with
The error output is:
[bruno@cerberus src]$ make KERNELDIR=/lib/modules/4.16.0-0.rc6.git2.2.fc29.x86_64/build clean all
make: *** No rule to make target '/home/bruno/WireGuard/src/main.o', needed by '/home/bruno/WireGuard/src/wireguard.o'. Stop.
make: *** [Makefile:1556: _module_/home/bruno/WireGuard/src] Error 2
make: *** [Makefile:36: module] Error 2
Is this something Fedora specific? I wouldn't expect an rc6 change to
cause an issue like this.
I'm going to try building an upstream kernel to see if the problem is
upstream, but I probably won't have an answer until tomorrow.
Andy Lutomirski <luto(a)kernel.org> writes:
> Then I'm stating my case poorly. I want a way to inject additional
> data into the hash computation.
At one point, we proposed doing this via a linker- or assembler-oriented
extra "salt" parameter, which would be hashed into the buildid. This
would most naturally be an n-v-r-a string, so the build is reproducible.
Such a salt could be naturally injected via an environment variable set
by rpmbuild.
https://bugzilla.redhat.com/show_bug.cgi?id=1002341 (salt!)
https://bugzilla.redhat.com/show_bug.cgi?id=1550152 (life without salt)
On 03/15/2018 06:32 AM, Nick Clifton wrote:
> Hi Mark,
>> That might be an interesting alternative. Could you use this for e.g.
>> inserting a .comment section fragment with a unique (version) string?
>> That would be stripped away, but should still count for the build-id
>> hash calculation.
> If you know the value you want to store ahead of time, then it is easy:
> % cat comment.t
> SECTIONS
> {
>   .comment (INFO) :
>   {
>     BYTE (0x12);
>     BYTE (0x34);
>     BYTE (0x56);
>     BYTE (0x78);
>   }
> }
> % gcc hello.c -Wl,comment.t
> % readelf -x.comment a.out
> Hex dump of section '.comment':
> 0x00000000 4743433a 2028474e 55292037 2e332e31 GCC: (GNU) 7.3.1
> 0x00000010 20323031 38303133 30202852 65642048 20180130 (Red H
> 0x00000020 61742037 2e332e31 2d322900 4743433a at 7.3.1-2).GCC:
> 0x00000030 2028474e 55292037 2e322e31 20323031 (GNU) 7.2.1 201
> 0x00000040 37303931 35202852 65642048 61742037 70915 (Red Hat 7
> 0x00000050 2e322e31 2d322900 12345678 .2.1-2)..4Vx
> (Note how the value has been appended to the .comment section).
> Unfortunately the linker does not have a STRING() operator to insert
> ascii codes into a section, so you have to construct the bytes by
> hand. Eg:
> % cat comment.t
> SECTIONS
> {
>   .comment (INFO) :
>   {
>     BYTE (0x41);
>     BYTE (0x42);
>     BYTE (0x43);
>     BYTE (0x00);
>   }
> }
> % gcc hello.c -Wl,comment.t
> % readelf -p.comment a.out
> String dump of section '.comment':
> [ 0] GCC: (GNU) 7.3.1 20180130 (Red Hat 7.3.1-2)
> [ 2c] GCC: (GNU) 7.2.1 20170915 (Red Hat 7.2.1-2)
> [ 58] ABC
> A simple perl or python script could be used to create the comment.t
> linker script fragment.
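As a rough illustration of the generator script Nick mentions, here is a
minimal Python sketch. The function name make_comment_fragment is made up
for this example, and it assumes the fragment is wrapped in
SECTIONS { ... } with a braced output section body, as ld's
output-section syntax requires. It only emits BYTE() statements; it has
not been checked against every ld version.

```python
def make_comment_fragment(text: str) -> str:
    """Build a linker script fragment that appends `text` to .comment.

    Hypothetical helper: the string is emitted one BYTE() per character,
    followed by a terminating NUL, because ld has no STRING() operator.
    """
    data = text.encode("ascii") + b"\x00"
    lines = ["SECTIONS", "{", "  .comment (INFO) :", "  {"]
    lines += [f"    BYTE (0x{b:02x});" for b in data]
    lines += ["  }", "}"]
    return "\n".join(lines) + "\n"
```

Writing the result of make_comment_fragment("ABC") to comment.t should
give the same four BYTE lines as the hand-written example above.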
I think this approach looks promising. I'm going to see about prototyping
On Thu, 2018-03-15 at 11:36 +0000, Nick Clifton wrote:
> > > I think Fedora should be able to ask its tool chain to insert the
> > > extra data rather than hacking it in after the fact.
> I'll just note that another way to insert data into a linked binary
> is to use a linker script fragment referenced from the command line.
> I doubt that this would be any better than creating a "magic .o" file,
> but it is worth knowing.
That might be an interesting alternative. Could you use this for e.g.
inserting a .comment section fragment with a unique (version) string?
That would be stripped away, but should still count for the build-id
hash calculation.
On Thu, 2018-03-15 at 00:45 -0700, Cary Coutant wrote:
> > > To inject explicit out-of-band data into the hash computation, you
> > > could insert an object with nothing but a note section, or even use
> > > --defsym to create a symbol table entry with your extra key(s).
> > Fedora wants to insert extra data into the build-id of all packages in
> > its repository, and it does so right now by post-processing the
> > package. This is ugly, and it flat-out doesn't work for the kernel,
> > which is apparently breaking a legitimate debugging use case.
> > I think Fedora should be able to ask its tool chain to insert the
> > extra data rather than hacking it in after the fact. Asking Fedora to
> > use --defsym for this purpose is IMO a non-starter, as is asking
> > Fedora to come up with some magic .o file and linking it into every
> > object.
> I understand your objection to the magic .o file, but why exactly is
> --defsym a "non-starter"? It's pretty close to what you're asking for
> (except for the spelling), it's already available, and it has the
> advantage of adding the extra data to the binary in a form where it
> can be easily extracted (e.g., with "nm").
> I don't understand the need for forcing two otherwise-identical
> binaries to have different build ids, simply because they're part of
> different distributions. Perhaps it would help if you could explain
> why you need that.
In theory two different build environments could produce the same
identical binary. Even different source files could if they express the
same algorithm and the compiler optimizes them the same way. But you
might still want to identify which came from which build. The build-id
is used to identify the build (environment) that produced the binary.
If the build environment is identical then in theory it should produce
identical binaries and build-ids. But the converse doesn't necessarily
hold. For a distro it is nice to have unique build-id identifiers
for different package version builds. Then if you just have e.g. a core
file with a build-id in it, you can map it back to which package build
it came from.
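To make the "map it back" step concrete, here is a small Python sketch,
not taken from any existing tool. It parses the raw bytes of an ELF note
section (4-byte namesz, descsz and type fields, then the name and
descriptor, each padded to 4 bytes) and looks the resulting build-id up
in a table. The function names and the index dictionary are
hypothetical; a real distro would query its package database instead.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for GNU build-id notes


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (note fields are padded)."""
    return (n + 3) & ~3


def parse_build_id(note: bytes, little_endian: bool = True):
    """Return the build-id as a hex string, or None if no such note exists."""
    fmt = "<III" if little_endian else ">III"
    off = 0
    while off + 12 <= len(note):
        namesz, descsz, ntype = struct.unpack_from(fmt, note, off)
        off += 12
        name = note[off:off + namesz]
        off += _align4(namesz)
        desc = note[off:off + descsz]
        off += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None


def package_for(build_id: str, index: dict):
    """Hypothetical lookup: map a build-id to a package n-v-r-a string."""
    return index.get(build_id)
```

The lookup only works if every package build ends up with a distinct
build-id, which is the property being asked for in this thread.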
The linker doesn't see the whole build environment, but it often has
enough to differentiate if there is debuginfo involved, which contains
the source file paths, compiler versions, command line options, etc. I
think in the case Andy is interested in, the vdso just isn't "unique"
enough (it contains mostly assembly without debuginfo). The vdso is
also slightly special because it is built and then "inserted" into the
kernel image (so it can be mapped into the process space at runtime).
Which makes it difficult to "post-process".
If using new command line flags is out of the question, could the
linker use an environment variable as seed for the build-id hash
computation? Then a package build could just set e.g.
BUILD_ID_ENV="<package-version>". That could also be used for other
purposes to capture anything from the build environment that might make
a build unique.
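The effect of such a seed can be sketched in a few lines of Python. This
is only an illustration of the idea, not how any linker actually
computes the build-id: the function names are made up, SHA-1 is used
just because hashlib provides it, and BUILD_ID_ENV is the variable name
suggested above rather than something an existing tool reads.

```python
import hashlib
import os


def salted_build_id(image: bytes, salt: str = "") -> str:
    """Hash the salt together with the image bytes (illustrative only)."""
    h = hashlib.sha1()
    h.update(salt.encode("utf-8"))
    h.update(b"\x00")  # separator so salt and image bytes cannot run together
    h.update(image)
    return h.hexdigest()


def build_id_from_env(image: bytes, env=os.environ) -> str:
    """Take the salt from BUILD_ID_ENV, defaulting to an empty salt."""
    return salted_build_id(image, env.get("BUILD_ID_ENV", ""))
```

With the same image and the same salt the id is reproducible, while two
package versions that set different salts get different ids even when
the binary itself is byte-for-byte identical.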
The kernel still doesn't have 100% parallel debuginfo because we can't update
the vDSO binary embedded in the image. I'd like to see about updating
debugedit to be smart enough to do the recalculation of the buildid for both
the vmlinux and the embedded vDSO.
I'd like to avoid too tight a coupling between the kernel and debugedit
so if we want/need to change how the vDSO is generated it won't break too
many things. My idea is to stick the location of the vDSO in an ELF note
so debugedit knows where to look. As long as the kernel can generate this
section correctly, debugedit can find the embedded build-id and update
Obviously this would need approval from a wider audience but I'm looking
to get some early feedback before I spend too much time prototyping
something that has no chance of going anywhere.
On Wed, Mar 14, 2018 at 6:01 PM, Alan Modra <amodra(a)gmail.com> wrote:
> On Wed, Mar 14, 2018 at 04:40:25PM -0700, Andy Lutomirski wrote:
>> I realize that the security issue here is barely relevant, but git’s use of SHA1 is *not* okay, and git is migrating away for a reason.
> Hmm, that's news to me. Heh, I've always been a bit suspicious of
> git's reliability. ;-)
I'm afraid Andy has listened to a few too many hard-liner security
people - the bad kind that don't know shades of gray, and the kind
that aren't generally worth listening to.
SHA1 with the known attack weakness fixed (aka "Hardened SHA1", the
way git already does) in a non-certificate environment is fine.
The fact is, data identification is different from some kind of
security that depends on the key. I wouldn't use even hardened SHA1
for some security certificate. But for file ID's? Andy is confused.
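For readers unfamiliar with hash-as-file-ID, the following Python sketch
shows how a git blob id is derived from file contents: SHA-1 over a
"blob <length>" header, a NUL byte, and the data. Note that it uses the
plain SHA-1 from hashlib, not the hardened collision-detecting variant
mentioned above, and the function name is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the git object id for a blob with the given contents."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

The id depends only on the contents, so there is no key involved; it
identifies data rather than authenticating it.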