Technology Stocks : All About Sun Microsystems


To: Charles Tutt who wrote (37553) 11/8/2000 10:19:30 PM
From: Tony Viola
 
Charles,

>Beyond that, I think you would have to look at the probability distribution of alpha particle arrival times to reach some conclusion about the efficacy of different levels of ECC.

True. From experience with other large machines on the scale of Sun's high end, the damn things arrive far too often. And it took a while to figure out what was going on back then.
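For anyone who wants to play with the arrival-time question, here is a rough sketch. It assumes hits on a given word arrive as a Poisson process, which is my assumption and not something measured; the rate and interval in the usage lines are made-up placeholders, not data. The function name p_at_least is just for illustration. Single-error-correcting ECC only gets into trouble when two or more hits land in the same word before it is rewritten or corrected, so the interesting comparison is P(at least 1) against P(at least 2).

```python
import math

def p_at_least(k, rate, interval):
    """P(N >= k) where N is Poisson with mean rate * interval.

    rate     : assumed hits per word per unit time (hypothetical)
    interval : time a word sits in the cache between rewrites (hypothetical)
    """
    mean = rate * interval
    # Sum P(N = 0) .. P(N = k-1), then take the complement.
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below

if __name__ == "__main__":
    rate, interval = 1e-3, 1.0   # placeholder numbers, not measurements
    print("P(at least one hit): ", p_at_least(1, rate, interval))
    print("P(at least two hits):", p_at_least(2, rate, interval))
```

With a small mean, the two-hit probability falls off roughly as the square of the one-hit probability, which is why the level of ECC matters so much.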

Tony



To: Charles Tutt who wrote (37553) 11/11/2000 12:32:54 PM
From: rudedog
 
Charles - ECC cache, if correctly designed, can completely eliminate the effects of random bit errors. In the event of an error that cannot be corrected, the cache line is faulted and a read from memory is forced.
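To make the correct-or-fault behavior concrete, here is a toy sketch. It uses a Hamming(7,4) code plus one overall parity bit (a small SECDED code) on a 4-bit value; real caches use much wider codewords, and nothing here is taken from any actual Sun design. The names encode, decode and cache_read are mine, for illustration only.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0 holds the overall parity bit
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes total parity even
    return code

def decode(code):
    """Return (nibble, status): status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    parity_bad = sum(code) % 2 == 1
    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Single-bit error: the syndrome is the flipped position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with good overall parity: two bits flipped.
        return None, "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status

def cache_read(line, memory_value):
    """Read a cached codeword; on an uncorrectable error, fault and refetch."""
    value, status = decode(line)
    if status == "uncorrectable":
        return memory_value, "refetched"
    return value, status

if __name__ == "__main__":
    word = encode(0b1011)
    hit_once = list(word); hit_once[5] ^= 1
    hit_twice = list(hit_once); hit_twice[6] ^= 1
    print(cache_read(hit_once, 0b1011))
    print(cache_read(hit_twice, 0b1011))
```

Note that the refetch only helps if main memory still holds a good copy of the line.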

However, cache serves as a write mechanism as well as a read mechanism - so to be completely safe from these errors, the cache would have to be "write-through" rather than "write-back" - in other words, CPU writes always go to main memory before the CPU releases the write cycle. This is a big performance hit.
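A toy model of the difference, assuming a cache that is just a dictionary in front of a dictionary of "memory". ToyCache and its methods are hypothetical names for illustration; timing is not modeled at all, only where the good copy of the data lives when a line has to be faulted.

```python
class ToyCache:
    def __init__(self, memory, write_through):
        self.memory = memory              # dict: address -> value
        self.write_through = write_through
        self.lines = {}                   # cached address -> value
        self.dirty = set()                # lines newer than main memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_through:
            self.memory[addr] = value     # memory is updated on every write
        else:
            self.dirty.add(addr)          # write-back: memory is now stale

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]   # miss: fill from memory
        return self.lines[addr]

    def fault_line(self, addr):
        """Discard a line after an uncorrectable error.

        Returns True if the discarded line held the only current copy.
        """
        lost = addr in self.dirty
        self.lines.pop(addr, None)
        self.dirty.discard(addr)
        return lost

if __name__ == "__main__":
    for mode in (True, False):
        cache = ToyCache({0x10: 1}, write_through=mode)
        cache.write(0x10, 2)
        lost = cache.fault_line(0x10)
        print("write-through" if mode else "write-back",
              "lost data:", lost, "value after refetch:", cache.read(0x10))
```

In the write-back case the refetch brings back the old value, because the new one only ever existed in the faulted line.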

Since ECC cache is actually less likely to have an error than ECC main memory, an uncorrectable error in a not-yet-written-back cache line is usually viewed as unlikely enough that the performance benefits of write-back win over protecting against an event with a statistical probability of occurring once in several hundred years.

Without ECC, the chances are more like once in a few months, which may be why Sun customers were seeing the problem. Sun's fix - to mirror the cache and force a cache fault on any difference between the cache mirrors - also requires that write-back be disabled, with the resulting performance hit. But that is better than a system crash.
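Here is a rough sketch of the mirror-and-compare idea as described above. It is my reading of the description, not Sun's actual implementation; MirroredCache and its fields are hypothetical names. Writes go straight through to memory, so a mismatch between the two copies can always be repaired by a refetch.

```python
class MirroredCache:
    def __init__(self, memory):
        self.memory = memory      # dict: address -> value
        self.copy_a = {}          # primary copy of each cached line
        self.copy_b = {}          # mirror copy of each cached line
        self.faults = 0           # mismatches detected so far

    def write(self, addr, value):
        # Write-through: memory always holds the current value.
        self.memory[addr] = value
        self.copy_a[addr] = value
        self.copy_b[addr] = value

    def read(self, addr):
        cached = addr in self.copy_a and addr in self.copy_b
        if cached and self.copy_a[addr] != self.copy_b[addr]:
            # The mirrors disagree: one of them took a hit, and without
            # ECC there is no telling which, so fault the line.
            self.faults += 1
            cached = False
        if not cached:
            value = self.memory[addr]
            self.copy_a[addr] = value
            self.copy_b[addr] = value
        return self.copy_a[addr]

if __name__ == "__main__":
    cache = MirroredCache({0x20: 0})
    cache.write(0x20, 0b1010)
    cache.copy_b[0x20] ^= 0b0100          # simulate a bit flip in one mirror
    print(cache.read(0x20), "faults:", cache.faults)
```

If the same sketch used write-back, a mismatch on a dirty line would leave no trustworthy copy anywhere, which is why the mirroring approach forces write-through.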

This is not an academic difference - memory write performance is reduced by a factor of 16 or more with write-back disabled.