"the higher the frequency of cache misses. This means that more DRAM bandwidth is required to keep the CPU from choking off. "
Required by whom? :) The point that you (and many reputable CPU designers) keep missing is that a CPU always was choking, and always will be, big time. Underestimating this fact leads to excessive expectations from business managers. The "higher bandwidth" may be "required" for some future hypothetical CPU architectures, but for a given, known piece of silicon that is already available for testing, the picture is fully determined.
As for a higher frequency of cache misses, you could probably say that, but the picture would be misleading, with no way to correctly forecast performance. Better to think from the other end: a given benchmark has to execute a certain amount of code. For a given CPU and cache architecture, and regardless of core speed, that code produces a fixed number of cache misses, which in turn generates the corresponding memory/bus traffic, which results in a more or less constant amount of wasted run time due to fixed bus protocols and memory latency. During this time the CPU is "choking" big time. In the remaining time the CPU can catch up, proportionally to its internal speed, but this is only a fraction of the overall run time, about 30 or 40% with modern (SPEC) code at modern CPU speeds. Even less than that. That's how it works (or stalls :) :).
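To make the argument concrete, here is a minimal sketch of that two-term model: a memory-stall term fixed by the miss count and memory latency, plus a compute term that shrinks with core speed. All numbers (miss count, penalty, instruction count, CPI) are hypothetical, chosen only to illustrate why raising the clock gives diminishing returns.

```python
def run_time(core_ghz, instructions=1e9, cpi=1.0,
             misses=20e6, miss_penalty_ns=100.0):
    """Split run time into a fixed memory-stall part and a
    core-speed-dependent compute part. The miss count is assumed
    independent of core frequency, as argued above."""
    stall = misses * miss_penalty_ns * 1e-9          # fixed by the memory system
    compute = instructions * cpi / (core_ghz * 1e9)  # scales with core speed
    return stall, compute

for ghz in (1.0, 2.0, 4.0):
    stall, compute = run_time(ghz)
    total = stall + compute
    print(f"{ghz:.0f} GHz: {total:.2f} s total, {stall/total:.0%} stalled")
```

With these made-up numbers, quadrupling the clock cuts total run time only from 3.0 s to 2.25 s, while the fraction spent stalled climbs toward 90%, which is the "choking" effect in a nutshell.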
Regards, - Ali