Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: fastpathguru who wrote (239509)
Date: 8/29/2007 2:52:35 AM
From: graphicsguru
 
Sorry fpg, but I think Doug's correct.

The Intel scheme takes longer only when it detects that memory operations
may have happened in the wrong order and could have produced an error.
In that case, it has to unwind memory transactions.

How often does that happen? If each thread is running a different
process, then basically never. No two cores will conflict about
any cache line at all. So for spec_int_rate or spec_fp_rate, it is
a complete non-event.

For multithreaded code that shares a lot of read-only data, it will
also be a non-event.

The only situation where it could be bad is for code that relies on a
large number of mutex locks to ensure correct serialization.
But most good parallel code figures out ways to keep the mutex
locks to a minimum, often by keeping separate writable copies of data in
each thread. Code that relies heavily on mutexes generally performs
very poorly.
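
To make the mutex point concrete, here is a minimal sketch of the two styles. The function names are made up for illustration. Python's interpreter lock means this shows the structure of the code, not the real timing difference you would see in native threads.

```python
import threading


def count_with_shared_lock(n_threads: int, n_iters: int) -> int:
    """Every increment takes the same mutex, so all threads serialize on it."""
    total = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal total
        for _ in range(n_iters):
            with lock:  # contended on every single iteration
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_with_private_copies(n_threads: int, n_iters: int) -> int:
    """Each thread updates its own writable copy; results are merged once."""
    partials = [0] * n_threads

    def worker(idx: int) -> None:
        local = 0  # private to this thread, no lock needed
        for _ in range(n_iters):
            local += 1
        partials[idx] = local  # one write per thread, to its own slot

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)  # merged after all threads have finished
```

Both versions return the same total. The second one simply never asks two threads to fight over the same writable data inside the loop, which is the case where cores would not conflict over a cache line.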

It seems to me that Intel has taken cache coherency to the next level
of complexity -- they're doing speculative coherency. In other words,
predicting that there are no race conditions, and doing a very complicated
unwinding when they're wrong. But actual order-dependent cache
race conditions happen on a very small fraction of execution cycles.
So this is probably a very significant performance improvement.
On the other hand, I'd hate to be responsible for QA, given the
complexity.
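
The "rare conflict, expensive unwind" trade-off above can be put as back-of-envelope arithmetic. This is only a sketch of the reasoning, not a model of Intel's actual hardware: the function names and all cycle counts are hypothetical, chosen just to show the shape of the argument.

```python
def expected_speculative_cost(p_conflict: float, fast: float, unwind: float) -> float:
    """Average cost per access when speculating.

    Assumption: every access pays `fast`, and a conflicting access
    additionally pays `unwind` to roll back and redo the work.
    """
    return fast + p_conflict * unwind


def break_even_conflict_rate(fast: float, conservative: float, unwind: float) -> float:
    """Conflict rate below which speculation beats always being conservative.

    Solves fast + p * unwind = conservative for p.
    """
    return (conservative - fast) / unwind


# Hypothetical numbers, purely illustrative: 10 cycles on the fast path,
# 20 cycles if the hardware always played it safe, 500 cycles to unwind.
threshold = break_even_conflict_rate(fast=10, conservative=20, unwind=500)
average = expected_speculative_cost(p_conflict=0.001, fast=10, unwind=500)
```

With made-up numbers like these, speculation comes out ahead as long as conflicts stay below the break-even rate, even though each unwind is far more expensive than the conservative path. How rare conflicts really are depends entirely on the workload.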