On 1 August 2011 14:35, DJ Delorie <dj(a)redhat.com> wrote:
> What I am trying to understand now is about choice of float abi.
Not much to understand - each project chooses the abi that best meets
their goals. If you want to learn the history, it's all in the mail
I am using some local branches of your scripts to build several
combinations for armv7 chroots, stopping at stage2 and building
a few rpms:
(calling hardfp for easier understanding and using vfpv3-d16 if
From my understanding, neon generates "prettier" objdump
output when looking at libm.so, but runtime of simple benchmarks
does not show any difference.
> git clone git://fedorapeople.org/~djdelorie/bootstrap.git
> Since I am still very "arm noob" :-) and just yesterday did
> the thumb build to learn about thumb, so far, my impression
> is that the best approach should be to use thumb+softfp.
If you want to do that, you don't need my bootstrap scripts. The
whole *point* of a bootstrap was to bring up an *incompatible* abi
from scratch. If you want to use a compatible abi, just keep using
the armv5 version of Fedora instead. It was decided long ago that the
armv7 version of Fedora would use the hardfp abi (hence the project
name "hardfp bootstrap"), but you can't build hardfp binaries on a
softfp platform, so we had to start from scratch to do hardfp.
Actually, I now know that I was also partially confused by
a misunderstanding of the --with-float=hard abi, so I wrote a simple
program to better understand the calling convention being generated.
For some reason I was thinking that it would use only two
vfp registers for arguments, but it can use up to 8.
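
For reference, a minimal sketch of the kind of probe described above
(the names sum8 and call_sum8 are made up for illustration, this is
not the exact program I used). The idea is to build it once with
-mfloat-abi=hard and once with -mfloat-abi=softfp, both with
-mfpu=vfpv3-d16 and -S, and compare where the 8 arguments end up in
the generated assembly:

```c
#include <assert.h>

/* noinline keeps the call sequence visible in the caller's assembly
 * even when optimization is enabled. */
__attribute__((noinline))
double sum8(double a, double b, double c, double d,
            double e, double f, double g, double h)
{
    return a + b + c + d + e + f + g + h;
}

/* Caller to inspect: look at how the 8 doubles are set up before the
 * branch to sum8 (vfp registers versus r0-r3 plus stack). */
double call_sum8(void)
{
    double r = sum8(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0);

    /* Small integers are exactly representable, so this is exact. */
    assert(r == 36.0);
    return r;
}
```

Something like "gcc -O2 -S -mfloat-abi=hard -mfpu=vfpv3-d16 probe.c"
(probe.c being a hypothetical file name) should be enough to see the
difference.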
But using softfp convention for variadic functions may be tough
for some applications; I wrote two "initial state" jits for arm:
and direct links to other, as it is not in a single project...
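
To illustrate the variadic case: as far as I understand the AAPCS,
variadic functions are always marshaled as for the base standard,
even in a hardfp build, so their double arguments travel in core
registers and on the stack. A small sketch of such a function
(sum_doubles is a made-up name):

```c
#include <assert.h>
#include <stdarg.h>

/* Sums 'count' double arguments passed through the variadic part.
 * The caller must pass exactly 'count' values of type double. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

Code that generates calls at runtime, like a jit, has to know which
of the two conventions the callee expects.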
So, today, after better understanding the ABI, I also made a
simple test case that calls, 100 million times, a function receiving
8 double arguments and returning one. It was compiled with -O0, as
otherwise gcc just optimizes out the call sequence and all timings
become identical. I noticed 20-25% faster execution with hardfp, in
what should be the case where it makes the most difference: 8 arguments
in registers and the return value in a register, as opposed to 2
arguments in r0,r1,r2,r3, converted to vfp, and 6 on the stack, and
then again the conversion
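
A rough sketch of such a test case follows. It is a reconstruction
under assumptions, not the exact code I timed: add8 and run_benchmark
are illustrative names, and clock() is just one simple way to measure
cpu time.

```c
#include <assert.h>
#include <time.h>

/* The callee: 8 double arguments in, one double out. */
__attribute__((noinline))
double add8(double a, double b, double c, double d,
            double e, double f, double g, double h)
{
    return a + b + c + d + e + f + g + h;
}

/* Calls add8 'iterations' times, stores the elapsed cpu time in
 * *seconds and returns the accumulated result. The volatile sink
 * keeps the compiler from discarding the results. */
double run_benchmark(unsigned long iterations, double *seconds)
{
    volatile double sink = 0.0;
    clock_t start, stop;
    unsigned long i;

    assert(seconds != NULL);
    start = clock();
    for (i = 0; i < iterations; i++)
        sink += add8(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0);
    stop = clock();

    *seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    return sink;
}
```

Calling run_benchmark(100000000UL, &secs) from main and printing secs,
once in a binary built with -mfloat-abi=hard and once with
-mfloat-abi=softfp (both at -O0), gives the two timings to compare.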
As Loïc Minier said in the other response (thanks!), this
should mostly be an issue when calling functions from
different libraries, where gcc cannot optimize much, and
presuming one is passing 2-8 float/double arguments a lot
in inner loops, and not in vectors...
It's also a fun exercise in bootstrapping, to make sure we still
With that I agree :-)
> I am kind of trying to figure what "The Industry"
> says about it,
If you need someone else's approval, you've missed the point of Free
Software. Each project has their own goals, and there is no "The
Industry" to tell us what to do. If you want to be part of a project,
find the one that has the same goals as you do, and join them.
I did not express myself clearly. To better describe the idea
I tried, but failed, to get across: when building packages for
armv7, and given that I am working for Mandriva, we are better
off sticking to what upstream does and supports
(read "The Industry" -> "upstream"; I personally can hack here
and there, but not much else)
> If I understand correctly, neon will have better support for
> simd instructions right?
There are still some armv7 chips that don't have neon, though, so we
(Fedora) chose to avoid neon for now.
I have not learned much about it yet, but maybe using neon for
integer division could be a "huge win", as otherwise there is
no division instruction (well, not in arm mode)...
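
For context on what "no division instruction" costs: without a
hardware divide, gcc emits a call to a runtime helper (on ARM EABI,
__aeabi_uidiv for unsigned 32-bit division). The following is only a
sketch of the classic shift-and-subtract method such a helper can be
based on, not the actual libgcc code; soft_udiv32 is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned 32-bit division by shift-and-subtract (restoring division).
 * Returns n / d and, if rem is not NULL, stores n % d in *rem. */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint64_t r = 0;   /* 64 bits so that (r << 1) cannot overflow */
    int bit;

    assert(d != 0);
    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> bit) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << bit;
        }
    }
    if (rem != NULL)
        *rem = (uint32_t)r;
    return q;
}
```

That is one loop iteration per result bit, which is why a division in
an inner loop is so much more expensive than an add or a multiply.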