On 03/26/2011 09:10 PM, Matthew Wilson wrote:
> 2. Armv7 / VFP / NEON support to squeeze a bit more performance out
> (where appropriate to the h/w).
I suspect you'll find that NEON won't make a noticeable difference, at
least not until:
1) GCC can do worthwhile vectorization - I've yet to try 4.5.x branch
which supposedly has some work on that contributed from IBM, but given
that it's taken 14 years to get this far since MMX was first introduced,
I'm not too hopeful of a quantum leap overnight.
_AND_
2) Developers start to write code in a way that the compiler can
sensibly apply vectorization. Considering that people haven't really
done this after the best part of a decade of availability of decent
vectorizing compilers (ICC on x86), I suspect this will be a bigger
problem than 1). And very few people are likely to have a great interest
in rewriting something that works, no matter how poorly.
Don't bet on SIMD for anything but niche applications (e.g. scientific
number crunching and games - and for the former, ARM isn't exactly a
popular platform).
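
To illustrate point 2), here is a minimal sketch of a loop written so that an auto-vectorizer has a fair chance with it. The function name add_scaled and its signature are made up for this example. The restrict qualifiers (C99) promise the compiler that the two arrays do not overlap, and the loop is a plain counted loop with unit stride and no early exits. Whether a given GCC version actually emits NEON code for it is something to check, not assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: y[i] += a * x[i] over n elements.
 * 'restrict' tells the compiler that x and y never alias, which removes
 * one of the common reasons a vectorizer gives up on a loop. */
void add_scaled(int32_t *restrict y, const int32_t *restrict x,
                int32_t a, size_t n)
{
    assert(y != NULL && x != NULL);

    /* Simple counted loop: unit stride, no function calls, no early
     * exits, and no dependency between iterations. */
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

With GCC one would typically try something like -O3 -ftree-vectorize -mfpu=neon on ARMv7 and then read the generated assembly to see whether the loop was vectorized at all.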
Other missing things that I would add to your list are:
- Lack of ported applications. The most important one, IMO, is
OpenOffice/LibreOffice. Ubuntu has this on ARM, so there is very little
excuse for not having it, since it has been done. The build systems are,
unfortunately, sufficiently different to make this far from trivial
without the compatibility being worked on upstream. This is, IMO, the
key reason why Ubuntu is so far ahead in terms of shipping pre-installed
on ARM netbooks.
- Legacy bad programming. ARM, like SPARC, is susceptible to issues
arising from dereferencing pointers that aren't word-aligned. This
has been discussed here before, and I find it more than a little
surprising that while the GCC SPARC back-end seems to align all
structures and arrays to a word boundary, the ARM back-end does not, and
this causes bugs to arise. One recent example that shocked me is just
how often this happens in code that is really critical (e2fsprogs),
where buffers get defined as arrays of char, and then the contents get
cast into structs. GCC will align char[] to a byte boundary, not a word
boundary, and thus cause all sorts of issues. The fact that some things
work at all is nothing short of a miracle. But like 2) I mentioned
above, that is a lot of code to rewrite correctly. The alternative is
similar to 1), in that the GCC ARM back end needs a parameter to force
alignment of all structures to a word boundary. Interestingly, Intel's
compiler for x86 has an option for this, despite the fact that x86 has
transparent hardware fix-up for this - because unaligned arrays cannot
be vectorized. I'm guessing that hasn't happened on GCC/ARM because most
ARM development has traditionally been on systems where saving a few
bytes of RAM is of paramount importance and the developers are competent
enough to hand-craft their code to make sure it works (embedded
systems). If ARM is to grow up into a desktop processor, its compiler
has to do so, too. Even so, we're likely to be unpicking unaligned
access violations in existing code for years, right up until bad
programming gets enshrined in transparent hardware alignment fixup (e.g.
Cortex-A / ARMv7).
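
For anyone who hasn't run into this, here is a minimal sketch of the char-buffer pattern and one way to sidestep it. The struct record layout and the read_record() helper are invented for illustration and are not taken from e2fsprogs. The idea is to copy the bytes into a properly aligned object with memcpy() instead of casting the buffer pointer. It assumes the data in the buffer is already in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record, standing in for a real structure. */
struct record {
    uint32_t magic;
    uint32_t block_count;
};

/* The risky pattern looks like this:
 *
 *     const struct record *r = (const struct record *)buf;
 *
 * buf is only guaranteed byte alignment, so r->magic may be a
 * misaligned 32-bit load on CPUs without hardware fix-up. */

/* Safer alternative: copy the bytes into a local struct, which the
 * compiler aligns correctly, and return that by value. */
struct record read_record(const unsigned char *buf)
{
    struct record r;

    assert(buf != NULL);
    memcpy(&r, buf, sizeof r);  /* valid for any alignment of buf */
    return r;
}
```

Compilers generally turn a small fixed-size memcpy() like this into a few loads and stores, so the cost of doing it correctly is usually small.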
</rant> ;)
Gordan