AGP 4X shines as an incremental bandwidth gain. If you think of it as another channel of memory, then it is clear that it adds to the total memory bandwidth available to the chip. In 3D rendering, there are three main clients of memory: the destination color, Z, and textures. Because you both read and write Z, the bandwidth required for Z is actually quite high, so Z is best kept in the primary (local) memory, but textures coming over AGP help when done correctly. That is, if you can bury the latency of the AGP read in the render pipeline, the gain is significant.
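To put a rough number on "burying the latency": to keep a link busy you need about bandwidth times latency worth of data in flight. Here is a minimal sketch of that arithmetic. The 1000 ns latency and the 32-byte request size are made-up values for illustration only, not measured AGP figures, and the function names are my own.

```python
import math

def bytes_in_flight(bandwidth_mb_per_sec, latency_ns):
    """Data that must be outstanding to keep the link busy (bandwidth * latency).

    MB/sec * ns = (1e6 bytes/sec) * (1e-9 sec) = 1e-3 bytes, hence the / 1000.
    """
    return bandwidth_mb_per_sec * latency_ns / 1000.0

def outstanding_requests(bandwidth_mb_per_sec, latency_ns, request_bytes):
    """Number of requests the pipeline has to keep queued to hide the latency."""
    needed = bytes_in_flight(bandwidth_mb_per_sec, latency_ns)
    return math.ceil(needed / request_bytes)

if __name__ == "__main__":
    # ASSUMPTIONS: 1000 ns round-trip latency and 32-byte texture requests
    # are hypothetical numbers chosen only to show the calculation.
    agp4x_peak = 1056  # MB/sec, 66MHz * 32 bit / 8 * 4 transfers per clock
    print(bytes_in_flight(agp4x_peak, 1000))
    print(outstanding_requests(agp4x_peak, 1000, 32))
```

The point is only that the deeper the render pipeline can queue texture fetches, the more of the AGP latency disappears behind useful work.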
As for contention with the CPU, don't forget that CPUs nowadays have really big caches. In many MIPS-intensive operations (e.g. T&L), there are ways to structure the work so that cache misses are rare and actual memory access is low; one way is to do T&L in place per vertex instead of all T and then all L. Hence, even though that 100MHz 64-bit memory looks weak, traffic across the host bus from memory is peaky and bursty but, overall, light.
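The "T&L in place per vertex" idea is just a loop-ordering choice, roughly like the sketch below. The transform() and light() functions are toy stand-ins I made up (a rotation about z and a clamped diffuse term), not anybody's real pipeline. Python won't show the cache effect itself; the sketch only shows the two loop structures, and the benefit would come in compiled code where the vertex is still in cache when it gets lit.

```python
import math

def transform(vertex, angle):
    """Toy 'T' stage (hypothetical): rotate position and normal about z."""
    c, s = math.cos(angle), math.sin(angle)
    (px, py, pz), (nx, ny, nz) = vertex
    return ((c * px - s * py, s * px + c * py, pz),
            (c * nx - s * ny, s * nx + c * ny, nz))

def light(vertex, light_dir):
    """Toy 'L' stage (hypothetical): clamped diffuse term from the normal."""
    _, (nx, ny, nz) = vertex
    lx, ly, lz = light_dir
    return max(0.0, nx * lx + ny * ly + nz * lz)

def tnl_two_pass(vertices, angle, light_dir):
    """All T, then all L: the whole transformed array is written out,
    then walked again from the start for lighting."""
    transformed = [transform(v, angle) for v in vertices]
    return [(t, light(t, light_dir)) for t in transformed]

def tnl_per_vertex(vertices, angle, light_dir):
    """T and L back to back per vertex: each vertex is lit right after it
    is transformed, while it is still hot."""
    out = []
    for v in vertices:
        t = transform(v, angle)
        out.append((t, light(t, light_dir)))
    return out
```

Both versions produce the same output; only the order in which memory is touched differs.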
Just look at the bandwidth:
PCI: 33MHz * 32 bit bus / 8 bits/byte = 132 MB/sec
AGP: 66MHz * 32 bit bus / 8 bits/byte = 264 MB/sec
AGP2x: 66MHz * 32 bit bus / 8 bits/byte * 2 trans/clock = 528 MB/sec
AGP4x: 66MHz * 32 bit bus / 8 bits/byte * 4 trans/clock = 1056 MB/sec
Voodoo3 3500: 183MHz * 128 bit bus / 8 bits/byte = 2928 MB/sec
Sony PS2: 2 channel 800MHz Rambus * 16 bit / 8 bits/byte = 3200 MB/sec main memory (the published 48000 MB/sec figure is the Graphics Synthesizer's embedded DRAM)
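If you want to redo the arithmetic yourself, here is a small sketch that recomputes the peak figures with the same formula (clock * bus width / 8 * transfers per clock). The function and table names are my own, and the "extra channel" ratio at the end simply pairs AGP4x with the Voodoo3 3500 local-memory number for illustration; it is not a claim about any particular card.

```python
def peak_mb_per_sec(clock_mhz, bus_bits, transfers_per_clock=1):
    """Peak theoretical bandwidth in MB/sec."""
    return clock_mhz * bus_bits // 8 * transfers_per_clock

# (name, clock in MHz, bus width in bits, transfers per clock)
BUSES = [
    ("PCI", 33, 32, 1),
    ("AGP", 66, 32, 1),
    ("AGP2x", 66, 32, 2),
    ("AGP4x", 66, 32, 4),
    ("Voodoo3 3500", 183, 128, 1),
]

def bandwidth_table():
    """Map each bus name to its peak MB/sec."""
    return {name: peak_mb_per_sec(c, w, t) for name, c, w, t in BUSES}

def extra_from_agp(local_mb, agp_mb):
    """Fractional peak gain if AGP is treated as one more memory channel."""
    return agp_mb / local_mb

if __name__ == "__main__":
    table = bandwidth_table()
    for name, mb in table.items():
        print(f"{name:<13} {mb:>5} MB/sec")
    gain = extra_from_agp(table["Voodoo3 3500"], table["AGP4x"])
    print(f"AGP4x on top of local memory: +{gain:.0%} peak")
```

These are peak numbers only; sustained throughput depends on how well the latency is hidden.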