Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: Tenchusatsu who wrote (17595), 11/3/2000 10:29:36 PM
From: minnow68
 
Tenchusatsu,

You wrote: "Can you show me a heavy-duty application which crunches the FPU but not bandwidth?"

Yes.

An example is multiple linear regression. The working set grows with the square of the number of variables: with p variables (plus an intercept), the accumulation array holds about (p+1)^2 eight-byte doubles. So for a regression with fewer than 178 variables (very few problems use that many), the entire accumulation array can be held in a 256K L2 cache. This means that for a regression of 178 variables, one would have to perform about 64K floating-point ops for each 1424 bytes of memory read. Another way to look at it: the flops/(bytes of bandwidth) ratio is about 45.
This problem is of extreme practical interest. Regression is used extensively in many fields.
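The arithmetic above can be checked with a short sketch. (Python used purely for illustration; the 8-byte doubles and the (p+1)-squared accumulation array are my assumptions about the layout, not details from the original post.)

```python
# Back-of-the-envelope check of the regression flops/bandwidth claim.
# Assumptions (mine, not the original poster's): 8-byte doubles, and an
# accumulation array of (p + 1) x (p + 1) entries (variables plus intercept).

def regression_intensity(p, bytes_per_value=8):
    """Return (working-set bytes, flops per observation, bytes read per observation)."""
    n = p + 1                               # variables plus the intercept column
    working_set = n * n * bytes_per_value   # the X'X accumulation array
    flops = 2 * n * n                       # one multiply + one add per array entry
    bytes_read = p * bytes_per_value        # one new observation row streamed from memory
    return working_set, flops, bytes_read

ws, flops, br = regression_intensity(178)
print(ws)                  # 256328 bytes -- just fits in a 256K (262144-byte) L2
print(flops)               # 64082 -- about 64K flops per observation
print(br)                  # 1424 bytes of new data per observation
print(round(flops / br))   # flops-to-bandwidth ratio of about 45
```

So each new observation streams in 1424 bytes while the hot accumulation array never leaves cache, which is why the problem stresses the FPU rather than the memory bus.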

Similarly, there are other mathematical problems where complexity grows even faster. The flops needed in regression are proportional to the square of the number of variables, but there are problems where calculation time scales with the cube or worse. For these problems, extra bandwidth does not help much as long as the working set fits in cache.

Mike