On Mon, Nov 29, 2004 at 01:30:11AM -0800, Nicholas Miell wrote:
On Mon, 2004-11-29 at 10:16 +0100, Arjan van de Ven wrote:
> On Mon, Nov 29, 2004 at 01:02:46AM -0800, Nicholas Miell wrote:
> > CMOVcc will use less space in the instruction cache than the Jcc/MOV
> > pair, though.
>
> only sometimes.... since cmov doesn't work on all register/memory
> combinations, extra code might be needed to glue that together...
>
>
> .... and we're suddenly talking about 0.01% performance ;)
Well, yeah. :)
There are also branch prediction and decode bandwidth issues that I didn't
bother to mention.
Although the P4 has the ds/cs segment prefixes for static branch prediction,
the non-preproduction chips actually don't use them, so they only make code
bigger.
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg02200.html
But, if you're going to optimize for i686 or better for other reasons,
there's no reason not to use CMOVcc instead of Jcc/MOV, where possible.
Well, there is a reason aside from some CPUs not having those insns at all:
on some recent Intel CPUs CMOVcc is actually slower than Jcc/MOV.
Jakub
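
For readers following along, here is a minimal C sketch of the kind of code the thread is arguing about. It is not taken from the kernel or from any poster's patch; the function names (max_branch, max_select, store_if) are made up for illustration. Whether a given compiler emits CMOVcc or a Jcc/MOV pair for either max function depends on the compiler version, optimization level and target flags (for example, GCC only considers CMOVcc when the target is i686 or later), so compare the generated assembly rather than assuming one form.

```c
#include <stdint.h>

/* Hypothetical helper: written with an explicit branch. A compiler
 * targeting pre-i686 CPUs has to emit a compare plus Jcc/MOV here. */
int32_t max_branch(int32_t a, int32_t b)
{
    int32_t r = a;
    if (b > a)
        r = b;
    return r;
}

/* Hypothetical helper: the same logic as a conditional expression.
 * Both operands are plain register values, which is the easy case
 * for CMOVcc. The compiler is still free to emit a branch instead. */
int32_t max_select(int32_t a, int32_t b)
{
    return (b > a) ? b : a;
}

/* Hypothetical helper: a conditional store. CMOVcc can only write to a
 * register, not to memory, so this cannot become a single CMOVcc; the
 * compiler needs a branch or extra load/store glue around it. */
void store_if(int32_t *dst, int32_t value, int cond)
{
    if (cond)
        *dst = value;
}
```

One way to inspect the result, assuming GCC on x86, is to compile with something like "gcc -O2 -march=i686 -S" and again with "-march=i386", then diff the two assembly listings for the functions above.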