eCo: Link please. :)
I think you misunderstood what I meant by "stable". It's not so much a matter of not being able to keep your uptime high. Rather, it's the problem of having tons and tons of machines in which things can (and do) go wrong. The time required to repair a "broken" node is, in my experience, fairly high. And it simply happens too d*mned often.
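To put a rough number on the "tons of machines" point, here is a back-of-the-envelope sketch. The 5% yearly per-node failure chance is a made-up figure for illustration, not a measurement, and the function name is hypothetical. It also assumes node failures are independent, which real clusters won't strictly obey.

```python
def p_any_failure(nodes: int, p_node: float) -> float:
    """Chance that at least one of `nodes` machines fails,
    assuming each fails independently with probability p_node."""
    return 1.0 - (1.0 - p_node) ** nodes


# Hypothetical per-node failure chance per year (an assumption, not data).
P_NODE = 0.05

for n in (1, 8, 64, 256):
    print(f"{n:4d} nodes: {p_any_failure(n, P_NODE):.1%} chance of at least one failure per year")
```

Under these assumptions, the chance that some node needs attention climbs quickly with node count, even when each individual box is fairly reliable.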
But, as Tony said, this is certainly a topic of hot debate around IT departments these days. Intel's advantage would seem to be that it is playing both sides of the fence (Itanium and P4).
To avoid many of the issues with clusters (including those you mention and much of what I referred to), you can go with the so-called blade servers. I have exactly zero experience with those, however, so I cannot give a qualified opinion on the merits of that solution.
Another issue with clusters is that, even rack-mounted, they take up serious space. Not always an issue, but certainly something to consider.
"The fact that replacing a "bad machine" within a cluster is trivial compared to fixing a partly faulty SMP yields much higher availability for carefully designed cluster configurations."
I do see what they are driving at, but in my experience, the most likely component to go is either a hard drive or a NIC. Needless to say, clusters are significantly more susceptible to the latter than either of the other solutions.
-fyo