I was going to give you most of that, but:
However, in a big benchmark (or a sufficiently large real-world system) there are thousands of spindles, so the average latency, assuming the storage subsystem is well set up, is divided by the number of spindles. That puts the speed more nearly at par, unless there is a "hot spot" such as a log file that concentrates much of the I/O on a small portion of the storage.
Er. Thousands of spindles will certainly improve aggregate throughput, and conceivably, if there's good locality, reduce latency down to rotational-only levels, although locality to that degree is pretty tricky. You also get the benefit of the memory buffers in those thousands of disks, though robust dbs may not be using them, for reliability reasons.
I don't think it's correct to say you can divide the average latency by the # of spindles, though. A random access is still a random access.
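To make that distinction concrete, here's a trivial back-of-envelope sketch (mine, not from the thread; the 8 ms average random access time is an assumed figure). Spreading random I/O over more spindles multiplies how many requests you can retire per second, but each individual request still pays its full seek plus rotational delay:

    /*
     * Illustrative only: assumes ~8 ms per random access (seek + half a
     * rotation) and perfectly even load across spindles.  Aggregate IOPS
     * scales with the spindle count; per-request latency does not.
     */
    #include <stdio.h>

    int main(void)
    {
        const double access_ms = 8.0;                 /* assumed avg random access time */
        const double per_disk_iops = 1000.0 / access_ms;
        int spindles[] = { 1, 100, 1000 };

        for (int i = 0; i < 3; i++) {
            int n = spindles[i];
            printf("%4d spindles: ~%8.0f IOPS aggregate, "
                   "but each access still ~%.0f ms\n",
                   n, per_disk_iops * n, access_ms);
        }
        return 0;
    }

So the division by spindle count applies to throughput, not to the latency of any one access.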
This is a somewhat theoretical discussion, of course. There's no doubt that a big, flat memory model that can buffer the working set of the db is a big win. The only question is why the db can't be buffered efficiently enough with the equivalent amount of segmented memory on a 32-bit machine. I wouldn't expect it to be nearly as efficient as a flat space, but a buffer hit in segmented space still ought to be an order of magnitude faster than a disk I/O.
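To put that "order of magnitude" claim in rough numbers, here's a small sketch; all the microsecond figures are assumptions chosen for illustration, not measurements from this discussion. Even charging every buffer hit in a segmented 32-bit space a generous penalty for segment-register juggling, it remains far cheaper than going back to disk:

    /*
     * Illustrative comparison with assumed costs: a buffer hit in flat
     * memory, a buffer hit that also pays segment-reload overhead, and a
     * random disk I/O.
     */
    #include <stdio.h>

    int main(void)
    {
        const double flat_hit_us      = 1.0;     /* assumed flat-space buffer hit      */
        const double segment_extra_us = 5.0;     /* assumed segment-juggling overhead  */
        const double disk_io_us       = 8000.0;  /* assumed ~8 ms random disk I/O      */
        double seg_hit_us = flat_hit_us + segment_extra_us;

        printf("flat buffer hit      : %7.1f us\n", flat_hit_us);
        printf("segmented buffer hit : %7.1f us  (%.0fx a flat hit)\n",
               seg_hit_us, seg_hit_us / flat_hit_us);
        printf("disk I/O             : %7.1f us  (%.0fx a segmented hit)\n",
               disk_io_us, disk_io_us / seg_hit_us);
        return 0;
    }

Under those assumed numbers the segmented hit is several times slower than a flat one, but still a thousand times faster than the disk, which is the point being argued.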