Technology Stocks : Rambus (RMBS) - Eagle or Penguin


To: Bilow who wrote (40706), 4/25/2000 7:59:00 AM
From: John Farrell
 
Bilow, your explanation of graphics memory bandwidth usage is rather deceptive. I worked in the PC graphics accelerator industry for seven years (1991-1997), at Radius, Headland Technologies (previously known as Video 7), Sierra Semiconductor and STB Systems, working with graphics ASICs from those companies and from IBM (8514a & XGA), Western Digital (an 8514a clone), S3, Cirrus Logic, Tseng Labs, Nvidia and 3dfx, and I know the history of that industry and its memory usage (DRAM, Fast Page DRAM, EDO DRAM, VRAM, SDRAM, MDRAM, SGRAM, Concurrent Rambus, and DDR DRAM) extremely well.

While there has always been excess offscreen memory in mainstream PC graphics designs, device drivers have mainly used that section of memory as offscreen caches for objects (fonts, device bitmaps, virtual desktops, and more recently textures and Z-buffers) to limit waiting for data over what has always been the slowest part of the graphics subsystem, whether that was the ISA bus, VESA Local Bus, PCI bus or AGP port.

The limiting factor in refreshing the screen has usually been the speed of the RAMDAC and the expensive monitors that higher refresh rates required, not memory bandwidth. In a 2D (Windows GUI) world there was very rarely any page flipping, and until 3D became pervasive what little concern there was about screen "tearing" arose mostly in video playback. The bandwidth not used by screen refresh was "drawing bandwidth," which got so abundant that you could scroll Word faster than you could read the screen.
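To put rough numbers on that split, here is a quick sketch (Python; the resolution, refresh rate, and memory figures are my own illustrative assumptions, typical of a mid-90s 2D card, not numbers from this post):

# Rough split of graphics memory bandwidth between screen refresh
# and "drawing bandwidth" (everything left over for the accelerator).
# All figures below are assumed examples.

res_x, res_y  = 1024, 768     # display resolution
bytes_per_pix = 2             # 16-bit color
refresh_hz    = 85            # monitor refresh rate

refresh_bw = res_x * res_y * bytes_per_pix * refresh_hz  # ~134 MB/sec
total_bw   = 800e6            # e.g. 100 MHz SDRAM on a 64-bit bus

drawing_bw = total_bw - refresh_bw
print(f"refresh bandwidth: {refresh_bw / 1e6:.0f} MB/sec")
print(f"drawing bandwidth: {drawing_bw / 1e6:.0f} MB/sec left over")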

3D is the real driving factor now behind the graphics system's insatiable desire for bandwidth, as multiple textures, light maps, and Z-buffering eat up the "drawing bandwidth" (anything over what is required to refresh the screen) that the graphics chip has available to it.

As I said, in the PC graphics industry only Cirrus Logic (Laguna) and Chromatic Research (mPact, which was founded by the same people who founded Rambus) used Rambus memory, and neither one of them could compete on a price/performance basis with chips using EDO and SDRAM memory. The other graphics chip companies were reluctant to go with Rambus because it was next to impossible to do one design that would support both Rambus memory and EDO/SDRAM, partially because of the licensing fees on parts that were never used with Rambus memory. Rambus killed graphics at Cirrus Logic, and they exited the business after having once been king of the heap (before they did their Rambus designs). ATI ended up buying Chromatic Research for its DVD and system-on-a-chip IP to put into set-top box designs (which use SDRAM).



To: Bilow who wrote (40706), 4/26/2000 4:52:00 PM
From: Plaz
 
A simple way of expressing the graphics bandwidth problem is that each bit of graphics display memory, has to be read from or written to once for each vertical retrace, assuming double buffering of the display. More modern high end graphics controllers can use more memory to describe the image than the completed image actually contains. As an example, the Nvidia DDR system uses 256Mbits = 32MBytes of memory. In order to read and write all that memory 60 times per second, you would need a bandwidth of 32MBytes * 120/sec = 4GBytes/sec, which is about how much bandwidth the Nvidia card actually provides. Thus with more modern technology, the memory bandwidth problem doesn't get better. It still boils down to something like 120 or so reads or writes per bit per second.

Bilow,

Unlike with most of your technical diatribes, here you're treading on an area I know something about. The quoted paragraph makes absolutely no sense. It's so factually inaccurate that I'm not really sure where to begin. So let's begin at the beginning and go sentence by sentence:

A simple way of expressing the graphics bandwidth problem is that each bit of graphics display memory, has to be read from or written to once for each vertical retrace, assuming double buffering of the display

Inaccuracies:
1. No, graphics bandwidth is completely dictated by the requirements of 3D rendering. You're thinking 2D, which is a trivial case for today's video cards' bandwidth.

2. To render one frame of a 3D animation, there's a lot more going on than just the RAMDAC reading the final result once per vertical retrace, as you state. Texture data needs to be read, texture data may be written for incoming or generated textures, the Z-buffer is read and written, and the final framebuffer value is written.

3. What in the world would double buffering have to do with it anyway? You're just throwing out technical mumbo-jumbo you don't understand. Double buffering means the RAMDAC displays one frame buffer while the scene is being rendered into another; when the scene is finished, the buffers are switched. It is used to keep image updates in sync, but it has no effect on memory access frequency in any event. It's not even relevant to your argument.
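To spell out what double buffering is, a minimal sketch (Python, with stub functions standing in for the hardware; this is not any real driver API):

# Double buffering: the RAMDAC scans out the front buffer while the
# chip renders the next frame into the back buffer; a pointer swap at
# vertical retrace keeps updates in sync. Note the flip itself moves
# no pixel data, so it adds nothing to memory access frequency.

WIDTH, HEIGHT = 640, 480

def render_scene(buf, frame):        # stub: pretend to draw a frame
    buf[0] = frame & 0xFF

def wait_for_vertical_retrace():     # stub: real hardware blocks here
    pass

front = bytearray(WIDTH * HEIGHT * 4)   # buffer the RAMDAC displays
back  = bytearray(WIDTH * HEIGHT * 4)   # buffer being rendered into

for frame in range(3):
    render_scene(back, frame)
    wait_for_vertical_retrace()
    front, back = back, front           # flip: no copying, no extra traffic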

In order to read and write all that memory 60 times per second, you would need a bandwidth of 32MBytes * 120/sec = 4GBytes/sec, which is about how much bandwidth the Nvidia card actually provides

Inaccuracies:
1. Again, 3D rendering is why the memory bandwidth is there. Assuming that "all memory needs to be read and written 60/sec" is not even an approximation. I don't know what hat you pulled that out of; it must have just fit your result, so you used it. A better calculation is the bandwidth x required to sustain 60fps at 1024x768x32, where x is:

read/write 32-bit Z-buffer: 60 x 1024 x 768 x 4 bytes x 2 (read and write) = 377 MB/sec

read/write 32-bit framebuffer: 60 x 1024 x 768 x 4 bytes x 2 = 377 MB/sec

The GeForce2 GTS has 166MHz 128-bit DDR memory, giving 5312 MB/sec of bandwidth. So there's still most of the bandwidth unaccounted for! Where is it going? It's going to reading texture information to be applied to the pixels. This is where, unless you work for a 3D chip manufacturer, analysis becomes difficult. The 3D chip has an internal texture cache, so the texture cache hit rate is significant here. There are also questions like "how many MB of textures are there in the entire rendered world?", "how many MB of textures need to be displayed right now?", "how many textures are coming in over the AGP bus?", "when a texture is displayed, how much of it is actually displayed (vs. occluded)?", etc. This is where I'll switch to benchmarks, because an analysis of this is really hard without making a lot of assumptions. But any analysis of the benchmark results shows that even the GeForce2 GTS is bandwidth limited at 1024x768x32 and above, yet CPU limited at 640x480x16. See http://www.firingsquad.com/hardware/nv15preview/page9.asp. Look how the numbers drop drastically at 1024x768 and above. This is mostly due to memory bandwidth limitations.
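Here is the same arithmetic as a sketch (Python; the 166MHz x 128-bit DDR figures come from above, and the leftover is simply attributed to texture fetches, glossing over cache hit rates and AGP traffic):

# Fixed per-frame costs at 1024x768x32, 60 fps, versus the GeForce2
# GTS peak of 166 MHz x 2 (DDR) x 128 bits = 5312 MB/sec.

x, y, fps = 1024, 768, 60
bpp = 4                                  # 32-bit color and 32-bit Z

z_bw     = x * y * bpp * 2 * fps         # Z read + write:     ~377 MB/sec
color_bw = x * y * bpp * 2 * fps         # color read + write: ~377 MB/sec

peak_bw    = 166e6 * 2 * (128 // 8)      # ~5312 MB/sec
texture_bw = peak_bw - z_bw - color_bw   # what's left for texel reads

print(f"Z traffic:         {z_bw / 1e6:5.0f} MB/sec")
print(f"color traffic:     {color_bw / 1e6:5.0f} MB/sec")
print(f"peak bandwidth:    {peak_bw / 1e6:5.0f} MB/sec")
print(f"left for textures: {texture_bw / 1e6:5.0f} MB/sec")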

Saying you "read and write all that memory 60 times per second" has nothing to do with the real situation and is not even a good approximation; a reasonable approximation would at least depend on resolution and depth.
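To see how strongly the requirement depends on resolution and depth, one more quick sketch (Python; same 60 fps and read-plus-write assumptions as above, textures still excluded):

# Fixed Z + framebuffer traffic at 60 fps for several display modes.
# Only resolution and depth vary; texture fetches would add more.

for x, y, bits in [(640, 480, 16), (1024, 768, 32), (1600, 1200, 32)]:
    bpp = bits // 8
    bw = x * y * bpp * 2 * 2 * 60        # (Z r/w + color r/w) x 60 fps
    print(f"{x}x{y}x{bits}: {bw / 1e6:5.0f} MB/sec before any textures")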

Thus with more modern technology, the memory bandwidth problem doesn't get better. It still boils down to something like 120 or so reads or writes per bit per second.

Inaccuracies:
1. The "moderness" of the technology has nothing, absolutely nothing, to do with the bandwidth limitations. We're bandwidth limited because we are expecting so much more from video cards these days and, at high resolutions and pixel depths, most any 3d chip (GeForce, VSA-100, TNT2U), V3, etc)is able to outperform current, cutting edge memory technology.

2. Again, "120 or so reads or writes per bit per second" has nothing to do with it.

I challenge you to find anything in my post that is inaccurate. Are all of your technical articles filled with such nonsense? I'm not qualified to judge most of your technical posts as I'm not a DRAM engineer, but this one leaves a lot to be desired.

Plaz



To: Bilow who wrote (40706), 5/2/2000 1:40:00 PM
From: Bilow
 
Re: Graphics memory and DDR: A test of "Carl's graphics memory rule of thumb"...

Nvidia released a bit of information about their next generation graphics controller:
Nvidia claims GeForce 2 provides visual realism
While the earlier GeForce's internal clock operated at 120 MHz and connected via a 300-MHz interface to double-data-rate SDRAM, its successor's clock runs at 200 MHz and contains a 333-MHz interface to 96 Mbytes of DDR SDRAM.
techweb.com

Given 96MBytes of memory, I would expect a bandwidth of around 96MB * 120/sec = 11.5GB/sec. But the above article doesn't give the bus width. On the other hand, they do give a memory size of 96MB, so the chip count, and hence the bus width, is likely to be divisible by three. Possible candidates are 96 bits and 192 bits. At 333MHz, the two candidates give bandwidths of 4GB/sec and 8GB/sec. The next size up would be a 384-bit bus with a BW of 16GB/sec, but that is clearly not possible (too many chips). So I expect that the board will have a BW of 8GB/sec on a 192-bit bus. This is 64 bits wider than the current 128-bit bus. Supposing that they are using x32 memory chips, this would require six chips, each with 16MB = 128Mb of storage.

So I predict, on the basis of the above information (i.e. assuming the EE Times article is true), that they are using six 128Mb x32 DDR SDRAM chips. But this assumes that they are getting 333Mbits/sec out of each of their DDR data pins.

There is another possibility: when they talk about "333MHz" DDR chips, they may be talking about the clock rate, and so have data transfer rates of 666MHz. (This is fast, but it is technically much easier than the 800MHz data rate that Rambus uses. The reason it is easier is similar to the reason AMD went with the LDT instead of a Rambus-style interchip communication scheme: it is a lot easier to get a high-speed link going between just two chips on the same board than it is to design one between 33 chips on four boards, a la RIMMs.) In that case, my prediction is that they have a 96-bit wide bus, with three x32 256Mb memory chips, or perhaps six x16 128Mb chips.
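Both readings can be checked mechanically; a sketch of the same arithmetic (Python; the candidate widths follow from the 96MB size forcing a chip count divisible by three):

# Two readings of "333MHz DDR" applied to a 96 MB board.
# Bandwidth = (bus width in bits / 8) x per-pin data rate.

MB = 96

# Reading 1: 333 MHz is already the per-pin data rate.
for width in (96, 192):                  # widths giving 3 or 6 x32 chips
    chips = width // 32
    bw = width / 8 * 333e6 / 1e9
    print(f"{width:3}-bit bus @ 333 MT/s: {bw:.0f} GB/sec "
          f"({chips} x32 chips of {MB // chips} MB each)")

# Reading 2: 333 MHz is the clock, so DDR data moves at 666 MT/s.
bw = 96 / 8 * 666e6 / 1e9
print(f" 96-bit bus @ 666 MT/s: {bw:.0f} GB/sec")
# Reading 1's plausible pick is the 192-bit bus (8 GB/sec);
# reading 2's 96-bit bus also lands on 8 GB/sec.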

In either case, my prediction is for a BW of 8GB/sec.

-- Carl