To: pgerassi who wrote (74185) 3/11/2002 1:06:37 PM
From: Ali Chen

Dear Pete, this is my last attempt (#4?) to communicate with you :-)

You wrote:

1) "Take the 2.0 GHz NW and the January and February tests of the last SPECint program twolf. The amount varies by 7 units out of about 627. That's a std. dev. of about 0.8%... One, vortex has a std. dev. of about 4%" and "Vortex has a 13 second range for a test that runs 180 seconds on average. That's over 6%!"

You must be reading some different SPEC submissions. On spec.org, 255.vortex posts 186 sec three times in a row. Looks pretty consistent to me... On spec.org, the same program posts 174±1 sec. I am tired of playing games with you, so here it is: the first set uses build 010727Z of the Intel compiler, the second set uses 010922Z. As I recall, I specifically asked you to check software versions. (A quick sketch of this spread calculation is at the end of this post.)

2) "Take the score using those (Prod(scores)^1/N) where N is the number of programs. Do the same for the highest score for each test. Pretty big range given that to stay within 3 sigma for your estimate you must be below 0.3% variation."

Why should I "take" anything? It is you who has abstract ideas about statistical data processing using 3-sigma deviations based on a data set of 3, and derives theoretical conclusions. Maybe you should just run the SPEC yourself, as I did hundreds of times, and then speak about variances? I gave you a (sort of linear) formula that is based on published scores alone and is accurate to 0.2%. Did you try it? Better, give me your formula. Put it up, or ... (The geometric-mean arithmetic is also sketched at the end of this post.)

3) "You are wrong about using seconds instead of the ratios."

See 2) above.

4) "And you have yet to take into account individual program variance. I looked into the variances and they swamp your projections."

Swamp? See 2) above.

5) "And all you had to do is look at SPECint_rate to see this"

That's all? Why should I have to look into an entirely different arrangement of workloads, with an entirely different task-switching scheme? BTW, it can easily be expected that SPEC-rate scales worse than regular SPEC: the switching between many independent big tasks clearly involves more cache thrashing, which increases the off-chip memory traffic. Moreover, it is still 22.5% according to you, and not 10% as the original model implied.

6) "In scientific work one uses more than one model to validate the method gives reasonable answers. The NOAA uses 7 different models to verify that the forecasts are reasonable. You failed to do that. I can see why."

Yes, I see too. See 2) above. You must be confusing a forecast of an inherently chaotic system on a strongly hyperbolic strange attractor with a straightforward chain of simple binary state machines operating on a fixed clock period.

7) "There is a boundary limit. It exists. SPEC allows such workarounds that it is hard to see it. Normal server type loads find it much faster."

Every workload has its own limit. I don't see how a different workload can refute or confirm results from another workload.

- Ali

P.S. My patience is running out.
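
P.P.S. Since the disagreement in 1) is about what the run-to-run spread actually is, here is a minimal sketch (Python, not SPEC's own tooling) of that calculation. The three 186 sec runs and the "174±1 sec" runs are the numbers quoted above; the concrete values 173/174/175 are only illustrative stand-ins for "174±1 sec", not actual submissions.

from statistics import mean, stdev

# 255.vortex run times in seconds, as quoted above.
vortex_010727Z = [186, 186, 186]   # Intel compiler build 010727Z (from the post)
vortex_010922Z = [173, 174, 175]   # build 010922Z; illustrative stand-in for "174 +/- 1 sec"

def report(label, times):
    m, s = mean(times), stdev(times)
    print(f"{label}: mean {m:.1f} s, std. dev. {s:.1f} s ({100 * s / m:.1f}%)")

report("build 010727Z", vortex_010727Z)
report("build 010922Z", vortex_010922Z)
report("both builds mixed", vortex_010727Z + vortex_010922Z)

Within each compiler build the spread comes out at most about 0.6%; only mixing the two builds produces a ~4% standard deviation and a 13-second range, which is exactly why I asked you to check software versions.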
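
And for the record on the "(Prod(scores)^1/N)" quote in 2), a minimal sketch (Python) of that geometric-mean arithmetic, assuming the twelve CINT2000 programs and a hypothetical flat set of baseline ratios (the 650 values are placeholders, not published scores); only the vortex ratio is perturbed, using the 186 sec vs. 174 sec times quoted above.

from math import prod   # Python 3.8+

def spec_score(ratios):
    # composite score = (product of per-benchmark ratios) ** (1/N)
    return prod(ratios) ** (1.0 / len(ratios))

N = 12                               # CINT2000 has 12 programs
baseline = [650.0] * N               # hypothetical flat set of ratios
perturbed = list(baseline)
perturbed[0] *= 186.0 / 174.0        # vortex ratio scales as 1/time: 186 s -> 174 s

s0, s1 = spec_score(baseline), spec_score(perturbed)
print(f"baseline composite : {s0:.1f}")
print(f"faster vortex      : {s1:.1f}  (+{100 * (s1 / s0 - 1):.2f}%)")

This is only the arithmetic of the published formula: a ~7% swing on one of twelve tests moves the composite by roughly 1/12 of that, about 0.6%. It says nothing by itself about compiler builds or run-to-run noise.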