Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: Mani1 (who started this subject)
Date: 8/23/2002 8:18:34 PM
From: dougSF30
 
On aggregate benchmarks...

It seems to me that a much more reasonable scheme for comparing two chips (A and B) across a series of sub-benchmarks (S_1, S_2, ... S_n) would be as follows:

Decide on the relative weights (w_1, w_2, ..., w_n) you'd like each component to have (e.g., all w_i = 1).

For S_i, compute R_i = (time A takes on S_i) / (time B takes on S_i).

Let R be the weighted *multiplicative* average (i.e., the weighted geometric mean) of the R_i:
R = exp( sum_over_i( w_i * log(R_i)) / sum_over_i( w_i ))

Then you can say: "on average, chip A takes R times as long as chip B on our aggregate benchmark".

To see the nice properties, consider A and B performing identically on 7 out of 8 tests. On the 8th, A takes twice as long. With equal weights, you'd say:

"On average A takes 1.09 times as long as B", i.e.
"A is 9% slower"

Which seems about right, intuitively, based on equal weights. (Note: 1.09... is the 8th root of 2)
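
The scheme and the 7-out-of-8 example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_ratio` and the timings are made up here, not taken from any real benchmark.

```python
import math

def weighted_ratio(times_a, times_b, weights=None):
    """Weighted multiplicative (geometric) average of the per-test ratios
    R_i = (time A takes on S_i) / (time B takes on S_i)."""
    if weights is None:
        weights = [1.0] * len(times_a)  # equal weights, all w_i = 1
    if not (len(times_a) == len(times_b) == len(weights)):
        raise ValueError("need one time per chip and one weight per sub-benchmark")
    # R = exp( sum_i(w_i * log(R_i)) / sum_i(w_i) )
    log_sum = sum(w * math.log(a / b)
                  for a, b, w in zip(times_a, times_b, weights))
    return math.exp(log_sum / sum(weights))

# Hypothetical timings in seconds: identical on 7 tests,
# and A takes twice as long on the 8th.
times_a = [10.0] * 7 + [20.0]
times_b = [10.0] * 8

r = weighted_ratio(times_a, times_b)
print(f"On average A takes {r:.2f} times as long as B")  # 1.09, the 8th root of 2
```

Raising the weight on the 8th test pulls R toward 2; lowering it pulls R toward 1.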

This style of comparison also means you end up comparing 2 chips *directly*, not via a 3rd-chip baseline used to set arbitrary constants (e.g., "SysMark 20 = 700MHz Celeron" or the like, just a made-up example).

And it allows the weights you select to actually have meaning: it doesn't matter how long a particular task takes, since it's the ratio that gets averaged.
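
To illustrate the point about task length, here is a small Python sketch with made-up timings. The total-runtime ratio is not part of the proposal above; it is included here only as a contrast, to show an aggregate in which the longest task dominates regardless of the weights you intended.

```python
import math

def geo_mean_ratio(times_a, times_b):
    """Equal-weight multiplicative average of the per-test time ratios A/B."""
    logs = [math.log(a / b) for a, b in zip(times_a, times_b)]
    return math.exp(sum(logs) / len(logs))

def total_time_ratio(times_a, times_b):
    """Contrast case (an assumption of this sketch): ratio of summed runtimes."""
    return sum(times_a) / sum(times_b)

# Two hypothetical suites with the SAME per-test ratios (2x and 1x).
# Only the length of the test where A is slower differs.
short_a, short_b = [2.0, 10.0], [1.0, 10.0]        # slow test is a short task
long_a, long_b = [2000.0, 10.0], [1000.0, 10.0]    # slow test is a long task

geo_short = geo_mean_ratio(short_a, short_b)
geo_long = geo_mean_ratio(long_a, long_b)
tot_short = total_time_ratio(short_a, short_b)
tot_long = total_time_ratio(long_a, long_b)

print(f"ratio average: {geo_short:.3f} vs {geo_long:.3f}")  # 1.414 vs 1.414
print(f"total runtime: {tot_short:.3f} vs {tot_long:.3f}")  # 1.091 vs 1.990
```

The averaged ratio is the same in both suites, so the weights alone decide how much each sub-benchmark counts.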

In short: The weighted multiplicative average of subtask time ratios would seem to produce fairly transparent, intuitive "benchmarks".

Doug

p.s. The technique can also be used to combine scores across multiple benchmarks, something the review sites seem to lack: just set each R_i to be the ratio of the two chips' scores on benchmark i, choose weights, and voila:
"On average, chip A gets R times the score of B across our benchmark suite of Sysmark2003, Quake3, ScienceBench, and SandraThroughput". (Just make sure each component benchmark is of the same flavor ("higher = faster" or "lower = faster"), or use an inverse to change flavors.)
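
A rough Python sketch of the cross-benchmark version follows. Everything in it is illustrative: the function name `combined_score_ratio`, the numbers, and the "higher = faster" flag assigned to each benchmark are assumptions of this sketch, not real results or real properties of those suites.

```python
import math

def combined_score_ratio(results, weights=None):
    """Weighted multiplicative average of A-vs-B ratios across benchmarks.

    results maps a benchmark name to (a_value, b_value, higher_is_faster).
    Every ratio is put in the "higher = faster" flavor before averaging,
    using an inverse for "lower = faster" benchmarks.
    """
    if weights is None:
        weights = {name: 1.0 for name in results}  # equal weights
    log_sum = 0.0
    for name, (a, b, higher_is_faster) in results.items():
        ratio = a / b if higher_is_faster else b / a  # invert to change flavors
        log_sum += weights[name] * math.log(ratio)
    return math.exp(log_sum / sum(weights[name] for name in results))

# Made-up numbers; the flavor flags are assumed for the sake of the example.
results = {
    "Sysmark2003":      (200.0, 100.0, True),    # score: A gets 2x
    "Quake3":           (150.0, 150.0, True),    # frames/sec: tie
    "ScienceBench":     (50.0, 100.0, False),    # seconds: A takes half as long
    "SandraThroughput": (1000.0, 1000.0, True),  # throughput: tie
}

r = combined_score_ratio(results)
print(f"On average, chip A gets {r:.2f} times the score of B")  # 1.41
```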