Re: Intel compiler & code generation
To generate the absolute fastest code, you can tell the compiler to target a specific processor and not to worry about executing on anything else. The options /QxM, /QxK, and /QxW force code generation for Pentium II, Pentium III, or Pentium 4, respectively (M=MMX, K=Katmai, W=Willamette). For Athlon, I use /QxM.
The Microsoft compiler will not even assume that Pentium Pro instructions are available, so it does not make use of CMOV and FCMOV (conditional move instructions that eliminate short branches). The Intel code benefits from the use of these instructions.
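To illustrate the kind of short branch meant here, the sketch below shows a small C function that a compiler targeting Pentium Pro or later could compile to a compare followed by CMOV instead of a conditional jump. The function name clamp_max is made up for this example, and whether any given compiler actually emits CMOV for it depends on the compiler, its version, and the switches used.

```c
/* Hypothetical helper for illustration: returns value, limited to at most limit.
 * The conditional expression is a short branch with no side effects,
 * which is the pattern a compiler may replace with CMP + CMOV
 * when it is allowed to assume a Pentium Pro or later processor. */
int clamp_max(int value, int limit)
{
    return (value > limit) ? limit : value;
}
```

Calling clamp_max(5, 3) returns 3 and clamp_max(2, 3) returns 2, with or without CMOV; only the generated machine code differs.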
Features like MMX, SIMD, and SIMD II require help from the programmer, in addition to including the compiler switch. I hope in the future the compiler will do more work in this area. For now, you have to tell the compiler what MMX/SIMD data type your variable holds, and then use intrinsic function macros to cause the operation, XOR for example, to be generated.
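As a rough sketch of what that looks like in source code, here is an XOR done with intrinsics. I am assuming that "SIMD II" corresponds to what Intel calls SSE2, and using the 128-bit integer type __m128i and the _mm_xor_si128 intrinsic from emmintrin.h. The function name xor4 is made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_xor_si128, ... */
#include <stdint.h>

/* Hypothetical helper for illustration: XORs four 32-bit values from a
 * with four from b and stores the results in out. */
void xor4(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    /* Tell the compiler these variables hold 128-bit packed integer data. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* The intrinsic causes a single packed XOR to be generated. */
    __m128i vr = _mm_xor_si128(va, vb);

    _mm_storeu_si128((__m128i *)out, vr);              /* unaligned store */
}
```

The unaligned load and store are used here only so the example works with ordinary arrays; with data aligned to 16 bytes, the aligned forms _mm_load_si128 and _mm_store_si128 could be used instead.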
So far, I have not experimented with the more complicated situation of targeted code. You could tell the compiler that the instructions must execute on Pentium II, Pentium III, or Pentium 4, but to consider only Pentium III, for example, when choosing optimizations.
The compiler has all sorts of advanced features I have not yet tried.
With the latest service pack and processor pack, Microsoft's compiler can handle MMX, SIMD, and maybe even SIMD II code generation, but still does not allow you to force optimization to a particular processor. I guess that is because most Windows code is common for all processors.