>> Assuming that they had 100 data centers, then we're talking about inundating the 'net with 50 Mbit/s * 100 sites, or 5 Gbit/s of traffic. Hmm. Is this in some way self-defeating? What we've done here effectively is to multiply a single T3 of data into something that would require an OC-96.
Consider how well this would scale if everyone began to follow suit. What am I missing here, that would lead me to come to such a drastic error in my thinking? <<
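(Frank's arithmetic does check out, by the way ... a quick back-of-the-envelope in Python, taking the nominal line rates T3 ~= 45 Mbit/s and OC-96 ~= 5 Gbit/s; the 100 sites and 50 Mbit/s per site are just his assumed figures:)

    # Back-of-the-envelope check of the quoted figures.
    sites = 100                    # assumed number of data centers
    per_site_mbps = 50             # assumed feed per site, Mbit/s
    t3_mbps = 44.736               # nominal T3 / DS3 line rate
    oc96_mbps = 96 * 51.84         # OC-96 = 96 x OC-1 (51.84 Mbit/s), ~4977 Mbit/s

    total_mbps = sites * per_site_mbps
    print(total_mbps)              # 5000 Mbit/s, i.e. ~5 Gbit/s
    print(total_mbps / t3_mbps)    # ~112 T3s' worth of traffic
    print(total_mbps / oc96_mbps)  # ~1.0, i.e. roughly one OC-96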
Frank ... this is a queueing systems problem ... some standard references are:
Queueing Systems, Vol. 1: Theory, by Leonard Kleinrock (amazon.com)
Queueing Systems, Vol. 2: Computer Applications, by Leonard Kleinrock, 1976 (amazon.com)
It's basically the same problem as memory paging ... whether it's more efficient to prefetch the resource into a faster store depends entirely on the traffic pattern.
If 50 people in Kansas want to listen to the Doors, it's better to move the file to Kansas and serve it from there.
If only one person in Kansas cares about the Doors, then the prefetch saves no time globally, but it may make that one Doors fan happier, depending on the relative performance of the fan-to-Kansas and Kansas-to-global layers.
In practice, on the web as we know it, caching will only make sense in some cases ... it does not make sense to cache everything. The specific economics of any case cannot be quantified without knowing the traffic pattern and the relative performance of the layers or domains involved ... and the cost and QoS options available.
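To put a toy number on that: a minimal sketch (Python, with invented cost units ... these are illustrative assumptions, not measurements) of the break-even ... prefetching a file to an edge pays off only when the savings across local readers outweigh the one-time cost of pushing it there.

    # Toy break-even model for prefetching one file to an edge cache.
    # Costs are in arbitrary transfer-cost units.

    def prefetch_wins(readers, push_cost, remote_cost_per_read, local_cost_per_read):
        """True if pushing the file to the edge is cheaper overall."""
        without_prefetch = readers * remote_cost_per_read
        with_prefetch = push_cost + readers * local_cost_per_read
        return with_prefetch < without_prefetch

    # 50 Doors fans in Kansas: the one-time push amortizes nicely.
    print(prefetch_wins(50, push_cost=100, remote_cost_per_read=10, local_cost_per_read=1))   # True

    # One lone fan: cheaper to just serve him from the remote store.
    print(prefetch_wins(1, push_cost=100, remote_cost_per_read=10, local_cost_per_read=1))    # False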
The whole thing is actually quite hilarious ... the web caching guys seem to have completely forgotten the history of memory management.
Imagine that URLs are dynamically paged by layers of cache servers progressively "closer" to the edge based on Least-Recently-Usedness. In this case, the web behaves like the disk, RAM, L2, and L1 layers in a PC. Since the resource in this case (web pages) has many readers and one writer, it makes sense to prefetch the changes to the layer closest to the readers.
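A minimal sketch of one such layer (Python; the capacity and the upstream fetch function are placeholders, not anybody's product):

    # One LRU cache layer: keeps the most recently requested URLs and
    # evicts the least-recently-used one when it overflows -- the same
    # policy a VM pager or a CPU cache approximates.
    from collections import OrderedDict

    class LRUCacheLayer:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()           # url -> page body

        def get(self, url, fetch_from_upstream):
            if url in self.entries:
                self.entries.move_to_end(url)      # hit: mark recently used
                return self.entries[url]
            body = fetch_from_upstream(url)        # miss: go one layer "up"
            self.entries[url] = body
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict the LRU entry
            return body

    # Stack these and you get disk/RAM/L2/L1 behavior for web pages:
    # each layer's fetch_from_upstream is just the next layer's get().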
This whole game is just getting started, and the players are reinventing the same limited solutions that, say, DEC was forced into on the PDP-11 (by the technology of the day).
The REAL money is in the automatic admin of the caching issue ... doubt it? Just look at a mask of a Pentium chip and see how much of the real estate is spent on cache.
Caching servers are cool but obvious ... what is non-obvious is how to get maximum pages near maximum readers at a cost minimum.
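Here's how naive the obvious version is ... a sketch (Python; the popularity and size figures are invented) that ranks pages by expected hits per byte and fills one edge cache greedily:

    # Greedy "maximum pages near maximum readers" for a single edge cache.
    # popularity = expected requests/day from readers behind this edge,
    # size = bytes the object occupies; both are assumed inputs.

    def place_greedily(pages, capacity_bytes):
        """pages: list of (url, popularity, size). Returns urls worth caching."""
        ranked = sorted(pages, key=lambda p: p[1] / p[2], reverse=True)
        chosen, used = [], 0
        for url, popularity, size in ranked:
            if used + size <= capacity_bytes:
                chosen.append(url)
                used += size
        return chosen

    print(place_greedily([("/doors/light-my-fire", 5000, 4000000),
                          ("/news/front", 20000, 200000),
                          ("/obscure/bootleg", 3, 40000000)],
                         capacity_bytes=10000000))
    # -> ['/news/front', '/doors/light-my-fire']

The hard (and valuable) part is doing that continuously, across thousands of edges, against shifting traffic and real bandwidth prices ... that's the automatic admin.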
We'll see 3-5 years of silly solutions before the real deal kicks in.