Dan, Re: "Stripped of the juvenile ranting, you've basically made my point for me."
It seems you've stripped too much, because you've still missed the point: applications are written so that data isn't continually read along the same evenly divisible boundaries. Any software designer who repeatedly reads data along a 128KB boundary is an idiot. Simply padding the data structure so the reads fall along a 129KB boundary can completely avoid the problem you're describing, and that is exactly what software developers do when they tune their applications for performance.
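Just so we're talking about the same thing, here's a minimal sketch of the kind of padding fix I mean. This is my own toy example, not code from any real application: the row count, the 128KB row stride, and the 1KB of padding are numbers I picked purely for illustration.

```c
/* Toy illustration: striding through memory at an exact power-of-two
 * interval (128KB) can map every access to the same cache sets, while
 * padding the stride slightly (to ~129KB) breaks that alignment. */
#include <stdio.h>

#define ROWS 256
#define COLS (128 * 1024 / sizeof(float))   /* one row = exactly 128KB */
#define PAD  256                            /* 1KB of extra floats per row */

static float tight[ROWS * COLS];            /* rows spaced a power of two apart */
static float padded[ROWS * (COLS + PAD)];   /* rows spaced just past that boundary */

/* Read one element from every row, i.e. stride through memory row by row. */
static float column_sum(const float *base, size_t stride)
{
    float sum = 0.0f;
    for (size_t r = 0; r < ROWS; r++)
        sum += base[r * stride];
    return sum;
}

int main(void)
{
    /* With the 128KB stride every access can land in the same cache sets and
     * evict its predecessor; the padded stride spreads accesses across sets.
     * Time the two calls (or count misses with a profiler) to see the gap. */
    printf("%f %f\n", column_sum(tight, COLS), column_sum(padded, COLS + PAD));
    return 0;
}
```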
In case you haven't noticed, I know a thing or two about cache design. For one thing, the same cache attributes can yield different performance depending on the CPU architecture, so results from a Carnegie Mellon case study can't be applied to just any CPU. Different architectures have different access patterns, different timings, and different techniques, all of which produce different results. Before a processor microarchitecture is built, extensive research is done by simulating actual software binaries to see how different cache parameters affect performance. Those parameters cover size, bandwidth, associativity, and behavior, and by the time the actual design begins, the designers know definitively which settings offer the best performance for the right tradeoffs.
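To give a rough idea of what that kind of pre-silicon study looks like, here's a toy sketch: replay an address trace against a simple set-associative cache model and sweep size and associativity. The synthetic trace, the particular configurations, and the plain LRU policy are all stand-ins I've made up for illustration; a real design flow replays full binaries through far more detailed simulators.

```c
/* Toy parameter sweep: count misses for several (size, associativity)
 * combinations of a set-associative cache with LRU replacement. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define LINE_SIZE 64

/* Simulate one cache configuration over the given address trace. */
static unsigned long simulate(const uint64_t *trace, size_t n,
                              size_t cache_bytes, unsigned ways)
{
    size_t sets = cache_bytes / (LINE_SIZE * ways);
    uint64_t *tags = calloc(sets * ways, sizeof *tags);   /* 0 = empty way */
    uint64_t *age  = calloc(sets * ways, sizeof *age);    /* LRU timestamps */
    unsigned long misses = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t line = trace[i] / LINE_SIZE;
        size_t   set  = line % sets;
        uint64_t tag  = line / sets + 1;        /* +1 so tag 0 always means "empty" */
        uint64_t *t = &tags[set * ways], *a = &age[set * ways];

        unsigned hit = ways, victim = 0;
        for (unsigned w = 0; w < ways; w++) {
            if (t[w] == tag) hit = w;
            if (a[w] < a[victim]) victim = w;   /* oldest (or still empty) way */
        }
        if (hit == ways) {                      /* miss: fill the LRU/empty way */
            misses++;
            t[victim] = tag;
            a[victim] = i + 1;
        } else {
            a[hit] = i + 1;                     /* hit: refresh its LRU age */
        }
    }
    free(tags);
    free(age);
    return misses;
}

int main(void)
{
    /* Stand-in "trace": a strided walk; a real study replays real binaries. */
    enum { N = 1 << 20 };
    static uint64_t trace[N];
    for (size_t i = 0; i < N; i++)
        trace[i] = (i * 4160) % (64u << 20);

    size_t   sizes[] = { 32 * 1024, 256 * 1024, 2 * 1024 * 1024 };
    unsigned assoc[] = { 2, 4, 8, 16 };

    for (size_t s = 0; s < sizeof sizes / sizeof *sizes; s++)
        for (size_t w = 0; w < sizeof assoc / sizeof *assoc; w++)
            printf("%8zuB %2u-way: %lu misses\n",
                   sizes[s], assoc[w],
                   simulate(trace, N, sizes[s], assoc[w]));
    return 0;
}
```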
Therefore, you have no point if you are trying to prove that Intel should have gone with higher associativities in their caches, because one can safely assume they've already tested that scenario and found the cons to outweigh the benefits. Now reread my previous post, and you'll find where I disprove your point.
wanna_bmw