Re: this is the first time you brought this up, so how could you say I "keep ignoring" it?
If our places were reversed, you'd post "liar" in bold italics:
The cache controller uses the least significant bits (LSBs) of the address to determine which cache locations need to be checked. It uses the end of the address, the last few bits, as an index or hash code. So, if a PC had a 4-way cache with 256 locations, the cache would basically be divided into 4 pieces, each of which could store 64 locations. That effectively divides main memory into 64 groups of addresses, one group per index value, and each group can have, at most, 4 of its locations stored in the cache at any one time.
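To make the indexing concrete, here is a rough sketch of that 256-location, 4-way example. It assumes the simplified model described above, where each cache location holds one address and the index is simply the low 6 bits; the function name `set_index` and the two sample addresses are made up for illustration.

```python
# Simplified model of the 4-way, 256-location cache described above.
# Assumption: one address per cache location, index = lowest 6 bits.
NUM_LOCATIONS = 256
WAYS = 4
NUM_SETS = NUM_LOCATIONS // WAYS  # 64 index values


def set_index(address: int) -> int:
    """Return which of the 64 groups an address falls into."""
    # NUM_SETS is a power of two, so this keeps only the low 6 bits.
    return address % NUM_SETS


# Two hypothetical addresses that differ in their upper bits only.
a = 0b01101100_11110000
b = 0b00000001_00110000

print(NUM_SETS, set_index(a), set_index(b))
```

Both sample addresses end in 110000, so they land on the same index and have to share the same 4 locations.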
So for any address ending in ******11 0000, say 01101100 11110000, or 01101100 10110000, there are only 4 cache locations available to store that information. This wouldn't matter all that much, except that the operating system generally allocates memory in blocks that start at the same LSBs, and modern OSs and programs taking advantage of object-oriented design are loaded as many small modules, each of which is allocated a block (or blocks) of memory by the OS, often starting at the same LSBs.
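A toy simulation shows why that matters. This is only a sketch under stated assumptions: least-recently-used replacement (real controllers may use something else), one address per cache location, and five made-up module addresses that all end in 110000. The helper `count_misses` is hypothetical, not any real tool.

```python
from collections import OrderedDict


def count_misses(addresses, num_sets=64, ways=4):
    """Count misses in a set-associative cache, assuming LRU replacement."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        lines = sets[addr % num_sets]  # low bits pick the set
        if addr in lines:
            lines.move_to_end(addr)  # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used
            lines[addr] = True
    return misses


# Five hypothetical addresses sharing the low bits 110000.
hot = [0b110000 + 64 * k for k in range(5)]
trace = hot * 100  # touch each one in turn, 100 rounds

misses_4way = count_misses(trace, ways=4)
misses_8way = count_misses(trace, ways=8)
print(misses_4way, misses_8way)
```

With 4 ways, the five addresses keep evicting each other and every one of the 500 accesses misses. With 8 ways, only the first 5 accesses miss. Note that keeping 64 sets while doubling the ways also doubles the total cache size in this model, so it isolates the effect of associativity on one index rather than comparing equal-sized caches.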
So, particularly on complex or server applications, P4 (and Xeon, and Xeon MP), even with a very large 8-way cache, is going to be thrashing some of its cache locations long before Athlon, even if Athlon's 16-way cache is smaller. A discussion of cache designs is here: pcguide.com
Saturday, Mar 16, 2002 3:39 PM Message 17206923