To: Cirruslvr who wrote (114226 ) 6/5/2000 1:26:00 AM From: milo_morai Respond to of 1577030
Some good benchmarks: 10 to 25% improvements.

SYSMARK 2000 and Business Applications

In Business Applications, the Thunderbird is approximately 16% faster than the K75, a significant increase. At 1GHz, the Thunderbird's cache operates at full CPU frequency, compared to 300MHz on the K75.

Scientific and 3D Benchmarks

In MolDyn, both processors are tied. MolDyn is mainly an FPU-intensive benchmark, so we see that the FPU does not benefit much from the integrated L2 cache. 3DMark 2000 shows a marginal 4-6% increase, depending on whether Software T&L or Hardware T&L mode is used. In the 3DMark CPU Software T&L tests, the K75 stays close to the Thunderbird, but in the Hardware T&L CPU tests the Thunderbird pulls ahead by 7%. In the OpenGL-based Quake3A, the Thunderbird seems to gain 11-13% over the K75, a markedly larger improvement than under Direct3D. The K75 marginally outperforms the Thunderbird in 3D Winbench, although the Thunderbird wins the CPU test by 6%.

Synthetic Tests

The SiSoft Sandra scores are a bit odd because the K75 outperforms the Thunderbird in these tests. In Winbench 99, the integer test shows an 11% lead for the Thunderbird and, as with MolDyn, the FPU test shows essentially no difference between the two. Overall, the GlobalMark, which is the sum of all the relative percentages in our benchmarks, shows that the Thunderbird is approximately 5% faster than the Athlon Classic.

Cachemem

Click here to view Thunderbird Cachemem results and here to view results on the Athlon Classic.

Calibrator (Cache and Memory)

According to AMD, the L2 latency of the Thunderbird is 11 cycles, compared to 21 cycles for the K75. However, in the Cachemem tests (look at the 2nd block of results, find 128KB-256KB on the left-hand side and 64 at the top of the table), the program reports a 20-cycle latency for the Thunderbird and a 30-cycle latency for the Athlon Classic.
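
To put the Cachemem figures above next to AMD's stated latencies, here is a minimal Python sketch that subtracts one from the other. The dictionary and function names are illustrative assumptions, not anything from the article or from Cachemem itself; the only data used are the four cycle counts quoted above.

```python
# Cycle counts quoted in the article; all names here are illustrative only.
AMD_STATED_L2_LATENCY = {"Thunderbird": 11, "K75": 21}
CACHEMEM_REPORTED_L2_LATENCY = {"Thunderbird": 20, "K75": 30}


def latency_gap(stated, reported):
    """Return reported minus stated latency, in cycles, for each processor."""
    return {cpu: reported[cpu] - stated[cpu] for cpu in stated}


gaps = latency_gap(AMD_STATED_L2_LATENCY, CACHEMEM_REPORTED_L2_LATENCY)
for cpu, gap in gaps.items():
    print(f"{cpu}: Cachemem reports {gap} cycles more than AMD states")
```

Both gaps come out to 9 cycles. A constant offset like this would be consistent with the tool counting some fixed overhead on top of the raw L2 access, but that is only a guess; the article does not establish the cause.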
Calibrator, a similar program, showed the same 20-cycle latency for the Thunderbird but, oddly, a 19-cycle latency for the K75 (which we believe is an error). Calibrator calculates the L2 latency by adding the "CPU Loop + L1" and the "L1 miss latency" figures. So we asked AMD what's up with these results.

Our Cachemem and Calibrator results show that the L2 latency of the Thunderbird is 20 cycles, while AMD states in the Cache Whitepaper that it is 11 cycles. How do you explain this?

We designed the large 128KB L1 cache to deliver optimal processor performance by ensuring that the majority of performance-intensive memory requests are serviced by accessing the L1. The 128KB L1 is 2-way set associative, and the L2 cache is 256KB, 16-way set associative. We use an exclusive cache architecture that delivers 384KB of effective cache memory on the CPU. With an exclusive cache architecture, the L2 contains only the copy-back cache blocks that are to be written back to the memory subsystem, so there is no redundancy between the L1 and L2. Because of the exclusive cache architecture, the L2 adds a full 256KB of additional cache memory, for the 384KB total of effective cache. The L2 has a 64-bit data path with an 8-cycle latency between an L1 miss and the first critical word received from the L2. What does it all mean in terms of overall CPU/system performance? Check the AMD website for a comprehensive set of benchmark results that demonstrate the benefits of our architecture in terms of delivering leading-edge performance.

Comment: Drew was not very familiar with either of those programs. So the specifics of why these two programs report a latency of 20 cycles are unknown and will remain unknown until we know more about the L2 cache architecture of the Thunderbird, which is beyond the scope of this article. Hopefully, Drew will update us on this situation in a few days.

fullon3d.com
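
The 384KB figure in AMD's reply follows from the exclusive design: since no block lives in both levels, the two capacities simply add. The Python sketch below spells out that arithmetic and contrasts it with an idealised inclusive design, where the L2 duplicates everything held in the L1. The function name and the inclusive comparison are assumptions added for illustration; AMD's reply only describes the exclusive case.

```python
def effective_cache_kb(l1_kb, l2_kb, exclusive):
    """Unique on-chip cache capacity in KB under an idealised model.

    Exclusive: no block is held in both levels, so the capacities add.
    Inclusive: every L1 block is duplicated in the L2, so the L2 size
    bounds the unique data held (assumes l2_kb >= l1_kb).
    """
    if exclusive:
        return l1_kb + l2_kb
    return l2_kb


# Thunderbird cache sizes as given in AMD's reply.
THUNDERBIRD_L1_KB = 128
THUNDERBIRD_L2_KB = 256

exclusive_kb = effective_cache_kb(THUNDERBIRD_L1_KB, THUNDERBIRD_L2_KB, exclusive=True)
inclusive_kb = effective_cache_kb(THUNDERBIRD_L1_KB, THUNDERBIRD_L2_KB, exclusive=False)
print(exclusive_kb)  # 384
print(inclusive_kb)  # 256
```

Under this simple model the exclusive arrangement holds 128KB more unique data than an inclusive one would with the same L1 and L2 sizes, which is the point AMD makes about the L2 adding "a full 256KB".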