Politics : Formerly About Advanced Micro Devices


To: Scumbria who wrote (59921)
5/28/1999 1:28:00 PM
From: grok
 
Re: <I agree that it is easier to utilize wide IA-64 hardware than a wide x86 superscalar design, and that it is possible to hand craft code which demonstrates this. My question is whether it is theoretically or practically possible to write a compiler for a Von Neumann architecture which can usefully dispatch more than 2 instructions per cycle.>

The thing is that today's micros only usefully dispatch and retire about 1 instruction per cycle. So getting this up to 2 would represent a massive breakthrough.

Re: <User source code inherently contains short loops, unpredictable branches, etc. As soon as you start bogging down the memory subsystem with speculative or look ahead load misses, you will see the performance degrade rapidly. The compiler people at IBM have been struggling with this problem for well over a decade.>

IA-64 has special support for short loops: rotating registers and modulo-scheduled loop support, which provide the benefit of loop unrolling without the code expansion (they also work for dynamically scheduled loops), and a Loop Count register, which allows perfect prediction of the loop branch since the termination of the loop is known.
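To make the rotating-register idea concrete, here is a rough C simulation of a modulo-scheduled loop. It is only a sketch under my own assumptions: the function names, the three-stage split (load, multiply, add and store) and the three-slot register file are all made up for illustration, and real IA-64 hardware does the register renaming and stage predication itself rather than in software.

```c
#include <assert.h>
#include <stddef.h>

#define STAGES 3  /* hypothetical pipeline depth: load, multiply, add+store */

/* Reference: the plain loop, one iteration at a time. */
void scale_plain(const int *in, int *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2 + 1;
}

/* Software-pipelined form: a single kernel body with three iterations in
 * flight at once. regs[] plus a rotating base stands in for the rotating
 * register file, and p0..p2 stand in for the stage predicates that fill
 * and drain the pipeline without separate prologue/epilogue code. */
void scale_pipelined(const int *in, int *out, size_t n) {
    assert((in != NULL && out != NULL) || n == 0);
    int regs[STAGES] = {0};
    size_t base = 0;  /* logical register j lives in regs[(base + j) % STAGES] */

    for (size_t k = 0; k < n + STAGES - 1; k++) {
        int p0 = k < n;                /* stage 0 works on iteration k     */
        int p1 = k >= 1 && k - 1 < n;  /* stage 1 works on iteration k - 1 */
        int p2 = k >= 2 && k - 2 < n;  /* stage 2 works on iteration k - 2 */

        if (p0) regs[(base + 0) % STAGES] = in[k];            /* load      */
        if (p1) regs[(base + 1) % STAGES] *= 2;               /* multiply  */
        if (p2) out[k - 2] = regs[(base + 2) % STAGES] + 1;   /* add+store */

        /* Rotate: what was logical register 0 becomes logical register 1
         * in the next kernel iteration, so no copies are needed. */
        base = (base + STAGES - 1) % STAGES;
    }
}
```

The kernel body appears once no matter how many iterations overlap, which is the "unrolling without the code expansion" point. The trip count n + STAGES - 1 is known on entry, which is the software analogue of a loop count register.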

Branch prediction rates should be much improved through other features as well, including predication, which may eliminate about half the branches altogether, and branch hints provided by the compiler. I believe that branch hints can be modified dynamically by profiling, but I haven't figured this out yet. Even mispredicted branches should be less painful, since branch registers are used and implementations can prefetch instructions at the targets.
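A source-level picture of what predication buys you, as a hedged sketch in C: the function names and the sum-above-threshold loop are my own illustration, not anything from the IA-64 documentation. On IA-64 the compare would set a predicate register and the add would be guarded by it. Here the predicate is just a 0/1 value multiplied into the result.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: one data-dependent conditional branch per element,
 * which is hard to predict when the data is irregular. */
long sum_above_branchy(const int *data, size_t n, int threshold) {
    assert(data != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] > threshold)
            sum += data[i];
    }
    return sum;
}

/* If-converted form: the compare yields a predicate (1 or 0) and the add
 * always executes but only takes effect when the predicate is 1, so the
 * data-dependent branch is gone. */
long sum_above_predicated(const int *data, size_t n, int threshold) {
    assert(data != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        long p = data[i] > threshold;  /* stands in for a predicate register */
        sum += p * data[i];            /* stands in for a predicated add     */
    }
    return sum;
}
```

The only branch left in the second version is the loop branch itself, which is the easy one to predict. Whether a compiler actually emits branch-free code for either form depends on the target and the optimization settings.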

But there's much more than I can post to you. It is time for you to move beyond this "Merced is traditional VLIW and traditional VLIW sucks" stage you're in and actually learn something about IA-64. Here's the link:
developer.intel.com

After all, you can't do a decent job of bashing it until you learn some more about it.

Best regards, KZNerd