Technology Stocks : Rambus (RMBS) - Eagle or Penguin


To: Alex Fleming who wrote (34907)11/20/1999 8:45:00 AM
From: visionthing
 
Alex, can you re-post the link? It doesn't seem to work.

thanks

VT



To: Alex Fleming who wrote (34907)11/20/1999 8:10:00 PM
From: Jdaasoc
 
Alex:
Did the link move? Is this the same article?

toshiba.com

Choosing High-Performance DRAM for Tomorrow's Applications

Introduction
The purpose of this paper is to give engineers and systems designers an overview of the options available to them for high-performance DRAM, examine the cost/performance tradeoffs of various DRAM solutions, and discuss the suitability of each type for specific applications. Additionally, a new DRAM architecture called Fast Cycle RAM (FCRAM) will be introduced.

The system designers of today know very well that they require high-performance, high-density memory solutions to satisfy the increased processor frequencies and growing complexity of end-user applications. What is not a simple choice is which DRAM type to use. There are Synchronous DRAMs (SDRAMs) with 100MHz, 133MHz and faster clock rates, Double Data Rate (DDR) SDRAMs and Rambus DRAMs (RDRAMs). System and memory controller designers are faced with pressure to adopt mainstream solutions and try to avoid low-volume, niche products. The DRAM supplier has to determine which one(s) to prioritize.

An even larger concern for the DRAM supplier is having the appropriate process technology necessary to offer these solutions, and being able to transition to these fine geometries with minimal investment and technical barriers.

So which of these device types should a system designer use and DRAM manufacturer produce? Looking at chipset and memory controller roadmaps across a wide range of applications, it is clear that 100MHz, 133MHz and faster SDRAMs, DDR and RDRAM will all co-exist. To solve this mystery, we really need to determine in what timeframe and in which applications these solutions will exist, which in turn determines their relative demand and production volume. Let's first look at the cost/performance tradeoffs of each of these DRAM solutions, and based on this, determine the suitability of each type for various applications.

Performance Comparisons
PC100 vs. PC133
Faster speed versions of today's 100MHz (PC100) SDRAMs are a logical and evolutionary progression. Chipsets and memory controllers already exist which support 133MHz (PC133) and faster memory busses. The key factor in determining their success is the cost/performance tradeoff. A PC133 SDRAM may or may not outperform a PC100 SDRAM depending on three critical parameters commonly referred to as CAS latency (CL), RAS-to-CAS delay time (tRCD) and RAS pre-charge time (tRP). These parameters are measured in terms of the number of clock cycles. For example, a device with CL = 2 cycles, tRCD = 2 cycles and tRP = 2 cycles, is commonly referred to as a 2-2-2 device.

TABLE 1: The table below shows a comparison of a PC100 CL2 device to PC133 CL3 and CL2 devices.

Memory Bus Speed | CAS Latency (CL)  | RAS Pre-charge Time (tRP) | RAS-to-CAS Delay Time (tRCD) | CL+tRP+tRCD (total time) | Performance (normalized)
-----------------|-------------------|---------------------------|------------------------------|--------------------------|-------------------------
100MHz (PC100)   | 20ns (2 cycles)   | 20ns (2 cycles)           | 20ns (2 cycles)              | 60ns                     | 1.00
133MHz (PC133)   | 22.5ns (3 cycles) | 20ns (2.67 cycles)        | 20ns (2.67 cycles)           | 62.5ns                   | 0.96
133MHz (PC133)   | 15ns (2 cycles)   | 15ns (2 cycles)           | 15ns (2 cycles)              | 45ns                     | 1.25


The above values were taken from Toshiba's 128M SDRAM datasheet.

Compared with a PC100 CL2 device, which is considered today's baseline for memory performance, the PC133 CL3 device is about 4% slower, while the PC133 CL2 device is 17% faster. Of course, these calculations are based solely on the above three critical parameters, and actual system performance will depend on the application and other factors, many of which will be discussed below.

It is also worthwhile to note that two of the three parameters, tRP and tRCD, are actually shown as fixed values in nanoseconds and are not necessarily an integer number of clock cycles. If the memory controller only comprehends these parameters as an integer number of clock cycles, then they must be rounded up to the next highest value. For example, in the above table, tRP is 20ns for all three types. In the case of PC100, 20ns is exactly two clock cycles, however for the PC133 device 20ns is 2.67 clock cycles, which has to be rounded up to three cycles. Therefore, in the above example, the PC100 CL2 device is referred to as 2-2-2, the PC133 CL3 device as 3-3-3, and the PC133 CL2 device as 2-2-2.
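As a quick illustration (not part of the original paper), the following sketch recomputes the Table 1 figures: it takes the datasheet nanosecond values for tRP and tRCD, rounds them up to whole clock cycles where the memory controller only understands integers, and reports the resulting x-y-z label alongside the raw CL+tRP+tRCD total. A 133.33MHz clock is assumed so that one PC133 cycle is exactly 7.5ns.

```python
import math

def sdram_timing(bus_mhz, cl_cycles, trp_ns, trcd_ns):
    """Return the CL-tRCD-tRP label (cycles rounded up) and the raw CL+tRP+tRCD time in ns."""
    cycle_ns = 1000.0 / bus_mhz
    trcd_cycles = math.ceil(round(trcd_ns / cycle_ns, 2))   # 20ns at 7.5ns/cycle -> 2.67 -> 3
    trp_cycles = math.ceil(round(trp_ns / cycle_ns, 2))
    total_ns = cl_cycles * cycle_ns + trp_ns + trcd_ns       # datasheet values, as in Table 1
    return f"{cl_cycles}-{trcd_cycles}-{trp_cycles}", total_ns

for name, args in [("PC100 CL2", (100.0, 2, 20, 20)),
                   ("PC133 CL3", (133.33, 3, 20, 20)),
                   ("PC133 CL2", (133.33, 2, 15, 15))]:
    label, total = sdram_timing(*args)
    print(f"{name}: {label} device, CL+tRP+tRCD = {total:.1f}ns")
# PC100 CL2: 2-2-2, 60.0ns; PC133 CL3: 3-3-3, 62.5ns; PC133 CL2: 2-2-2, 45.0ns
```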

DDR vs. RDRAM
The performance benefits of DDR vs. RDRAM are commonly debated in the industry today and wide ranges of performance numbers are shown, especially in peak bandwidth comparisons. While peak bandwidth is important, it is really only the "top line" of actual system performance. The bottom line is sustained, or effective, bandwidth, which is also a function of other memory parameters and features, such as latency, the number of internal banks and the read-to-write/write-to-read bus turnaround time. Effective bandwidth is also a function of certain system or application-dependent parameters, such as burst length.

TABLE 2: In the table below, we compare the peak bandwidth for PC100, DDR and RDRAM for various memory bus widths.

DRAM Type | Clock/Data Rate | Memory Bus Width    | Peak Bandwidth
----------|-----------------|---------------------|---------------
PC100     | 100MHz/100MHz   | 64-bit              | 800MB/sec
DDR       | 100MHz/200MHz   | 64-bit              | 1.6GB/sec
DDR-II    | 200MHz/400MHz   | 64-bit              | 3.2GB/sec
DDR-II    | 200MHz/400MHz   | 128-bit             | 6.4GB/sec
RDRAM     | 400MHz/800MHz   | 16-bit (1 channel)  | 1.6GB/sec
RDRAM     | 400MHz/800MHz   | 32-bit (2 channels) | 3.2GB/sec
RDRAM     | 400MHz/800MHz   | 64-bit (4 channels) | 6.4GB/sec


It should be noted that DDR in the above table is based on today's industry specification, which basically includes 100MHz and 133MHz clock rates. DDR-II is currently being defined by JEDEC, and is expected to offer much higher clock rates and features to improve effective bandwidth, some of which will be discussed below.

Based on the above analysis, DDR can match RDRAM in terms of peak bandwidth. However, the system designer must make the determination of which device to use based upon the advantages/disadvantages of widening the bus from 64 to 128 bits for DDR vs. adding multiple channels for RDRAM. Additionally, peak bandwidth is only one factor in determining effective bandwidth as was mentioned above and will be discussed further below.
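As a rough sketch (not from the paper itself), peak bandwidth is simply the data rate multiplied by the bus width in bytes; the figures in Table 2 fall out directly:

```python
def peak_bandwidth_mb_s(data_rate_mhz, bus_width_bits):
    """Peak bandwidth in MB/sec = million transfers per second x bytes per transfer."""
    return data_rate_mhz * bus_width_bits / 8

print(peak_bandwidth_mb_s(100, 64))    # PC100, 64-bit bus:           800 MB/sec
print(peak_bandwidth_mb_s(200, 64))    # DDR (200MHz data rate):     1600 MB/sec
print(peak_bandwidth_mb_s(400, 128))   # DDR-II, 128-bit bus:        6400 MB/sec
print(peak_bandwidth_mb_s(800, 16))    # RDRAM, one 16-bit channel:  1600 MB/sec
print(peak_bandwidth_mb_s(800, 64))    # RDRAM, four channels:       6400 MB/sec
```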

FCRAM - A Faster Memory Core
All of the DRAM types commonly discussed in the industry today, such as EDO, SDRAM, DDR and RDRAM, have one major thing in common: their memory cores are the same. What is different about each type is the peripheral logic circuitry, not the memory cell array. What this complex new peripheral logic circuitry does is attempt to hide the inherently slow memory core.

FCRAM is a novel concept that finally recognizes and fixes the slow memory core by segmenting it into smaller arrays, so that data can be accessed much faster and latency is greatly improved. How this is done is beyond the scope of this paper. If the reader is interested, both Toshiba and Fujitsu can provide more detailed information on FCRAM functionality.

The key measure of how FCRAM improves latency and can improve system performance is the read/write cycle time (tRC), which measures how long the DRAM takes to complete a read or write cycle before it can start another one. In the case of conventional DRAM types, including SDRAM, DDR and RDRAM, tRC is typically on the order of 70ns. With FCRAM, tRC of 20 or 30ns is possible. For this reason, this new device is referred to as a fast cycle RAM.

Besides faster tRC, FCRAM also improves latency with several new features that will be discussed below.
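To see why tRC matters so much for random accesses, consider a single bank handling back-to-back accesses to different rows: it can complete at most one access per tRC. The numbers below are an illustrative sketch only (the 8-byte access size and the specific tRC values are assumptions, not figures from the paper):

```python
def random_access_throughput_mb_s(trc_ns, bytes_per_access=8):
    """Worst-case throughput when every access hits a new row in the same bank:
    one access of bytes_per_access completes per tRC."""
    return (1e9 / trc_ns) * bytes_per_access / 1e6

print(random_access_throughput_mb_s(70))   # ~70ns conventional core: ~114 MB/sec
print(random_access_throughput_mb_s(25))   # ~25ns FCRAM-class core:  ~320 MB/sec
```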

DRAM Features/Parameters Which Affect Actual System Performance
Latency
Very simply, latency is how long it takes a DRAM to begin outputting data in response to a command from the memory controller. There are many different measures of DRAM latency. For example, the time it takes the DRAM to access the data from when the row address is provided by the memory controller is called the row address access time (tRAC), which is typically on the order of 50 to 60ns. The RAS pre-charge time (tRP), typically 20ns, is another measure of DRAM latency. Most of the measures of DRAM latency are a function of the memory core design and wafer process technology used. Therefore, it is a reasonable assumption that DRAM internal latency is the same for SDRAM, DDR and RDRAM for a given design and process. The features/parameters mentioned below have a wider variance between device types and therefore a wider-ranging impact on system latency and performance. The memory concept called FCRAM mentioned above demonstrates how a new memory core architecture can truly improve inherent DRAM latency.

Number of Banks
The number of internal banks a DRAM has is perhaps the biggest factor in determining actual system latency. This is because a DRAM can access data much faster if the data is located in a bank that has been activated; by activated, we mean a bank that has been pre-charged. The requested data can either be in the same page (row) that is currently being accessed, or in a pre-charged bank that is not currently being accessed. If the data is located in a pre-charged bank, we often call this a page hit, meaning the data can be accessed very quickly without the delay penalty of having to close the current page and pre-charge another bank. On the other hand, if the data is in a bank that has not been pre-charged, or in a different row within the bank currently being accessed, a page miss occurs and performance is degraded due to the additional latency of having to pre-charge a bank.

The memory controller designer can minimize latency by keeping all unused banks pre-charged. Therefore, having more internal DRAM banks increases the probability that the next data access will be to an active bank, which minimizes latency.
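A minimal sketch of the model behind Table 3, assuming accesses are spread evenly across banks and all unused banks are kept pre-charged, so that only an access to a new row in the currently open bank misses:

```python
def page_hit_miss(num_banks):
    """Miss rate = 1/num_banks under the simple 'all unused banks pre-charged' model."""
    miss = 1.0 / num_banks
    return 1.0 - miss, miss

for device, banks in [("SDRAM (16M bit)", 2), ("SDRAM/DDR (64M bit+)", 4), ("RDRAM", 16)]:
    hit, miss = page_hit_miss(banks)
    print(f"{device}: {banks} banks -> hit {hit:.0%}, miss {miss:.0%}")
```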

TABLE 3: The table below shows how the number of banks affects the hit and miss rates, assuming all unused banks are always pre-charged.

DRAM Type                      | # of Banks | Miss Rate  | Hit Rate
-------------------------------|------------|------------|------------
SDRAM (16M bit)                | 2          | 50% (1/2)  | 50% (1/2)
SDRAM/DDR (64M bit and higher) | 4          | 25% (1/4)  | 75% (3/4)
RDRAM                          | 16         | 6% (1/16)  | 94% (15/16)


Clearly, adding more banks increases the hit rate and reduces latency; however, adding banks increases the die size and cost of the DRAM. Therefore, a cost/performance comparison is necessary when determining how critical it is to reduce latency by increasing the number of banks.

Bus Turnaround Time
Because of the increasing functionality required of today's main memory subsystems, the time that it takes a DRAM to switch between a read and a write cycle, or between a write and a read cycle, is becoming a critical factor. This time is commonly referred to as the bus turnaround time. Delays in turning the bus around can result in costly dead bus cycles and reduced performance. In order to minimize dead bus cycles, fast (preferably zero) read-to-write or write-to-read bus turnaround time is required.

Traditional DRAM types, including EDO and SDRAM, use a scheme called command decoding to determine whether the cycle is a read or a write. This means that at the same time the address is provided to the DRAM, a read or write command is also provided, and the DRAM then has to decode the address and command together. This results in dead bus cycles. At a relatively slow clock frequency, such as 66MHz, these dead clock cycles do not result in a prohibitive performance loss. However, as clock frequencies increase to 100/133MHz and beyond, the bus turnaround time becomes an increasingly critical factor in determining actual system performance.

The bus turnaround time is even more critical for DDR, as data is transferred on both the rising and falling edges of the clock. In other words, for every dead clock cycle there are two dead data cycles and twice as much bandwidth "opportunity loss." The emerging DDR-II standard is attempting to address the issue of bus turnaround time with new features, such as write cycle latency being a function of read cycle latency, and the posted CAS or late write feature.

In the case of RDRAM, the bus turnaround time is less of an issue because the device has separate address and control busses, such that simultaneous decoding is not required.
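The following toy model (our own illustration, with made-up burst and gap values) shows why the same one-clock turnaround gap costs DDR twice as much as a single-data-rate SDRAM, and why a near-zero turnaround keeps the data bus busy:

```python
def data_bus_utilization(burst_data_cycles, dead_data_cycles):
    """Fraction of data-cycle slots actually carrying data across one burst plus one turnaround gap."""
    return burst_data_cycles / (burst_data_cycles + dead_data_cycles)

# Hypothetical case: a burst of 4 data transfers followed by one read->write turnaround.
print(data_bus_utilization(4, 1))   # SDR SDRAM: 1 dead clock = 1 lost data slot  -> 0.80
print(data_bus_utilization(4, 2))   # DDR:       1 dead clock = 2 lost data slots -> 0.67
print(data_bus_utilization(4, 0))   # near gap-less turnaround (RDRAM/FCRAM)      -> 1.00
```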

The above three parameters and features, latency, the number of banks and bus turnaround time, are really a function of how the DRAM operates. The factors mentioned below, burst length and randomness, are application-dependent.

Application Dependent Parameters - Burst Length/Randomness
Burst length is defined as the number of successive accesses (column addresses) within a row or pre-charged bank. In other words, burst length is the number of successive read/write cycles without having to provide a new address. DRAMs can access data very quickly if the next data is located in the same row as the current data or in a pre-charged bank. Therefore, as the burst length becomes longer, initial latency is minimized and the effective bandwidth approaches the peak bandwidth. Graphics is a good example of an application with a relatively long burst length. On the other hand, applications such as network switches and routers tend to have very short burst lengths (sometimes the burst length is one, meaning no successive accesses within a row), and initial latency becomes more critical in determining effective bandwidth. Applications with very short burst lengths are often called "random access" applications, as it is not easy for the memory controller to predict where the next data bits are located.

TABLE 4: The table below attempts to quantify short vs. long burst lengths and shows the typical applications which have burst lengths of this order.

Burst Length    | Typical Application
----------------|-------------------------
1 or 2 (short)  | Network switches/routers
4 to 8 (medium) | PC main memory
8 to 256 (long) | Graphics


Comparing today's typical DRAM timing specifications, the magic number for burst length appears to be around four. Burst lengths of less than four do not take much advantage of the DRAM's peak bandwidth capability and are better served by a low-latency solution such as FCRAM. For burst lengths of four and longer, the system will be able to take advantage of the DRAM's peak bandwidth, making very high data rate devices, such as RDRAM, the ideal solution.
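A crude single-stream model (our own sketch, with an assumed 50ns initial latency and a 200MHz data rate) shows how effective bandwidth climbs toward the peak as burst length grows, and why burst lengths of one or two leave most of the peak bandwidth unused:

```python
def burst_effective_bandwidth_mb_s(peak_mb_s, data_rate_mhz, burst_length, initial_latency_ns):
    """One initial latency, then burst_length back-to-back transfers; repeat."""
    burst_ns = burst_length * 1000.0 / data_rate_mhz
    return peak_mb_s * burst_ns / (initial_latency_ns + burst_ns)

for bl in (1, 2, 4, 8, 64, 256):
    bw = burst_effective_bandwidth_mb_s(1600, 200, bl, 50)
    print(f"burst length {bl:3d}: ~{bw:.0f} MB/sec of a 1600 MB/sec peak")
```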

Summing it All Up - Bus Utilization
Now that we have defined the DRAM parameters/features and system factors that determine effective bandwidth, we need some way to measure the combined effect of all of these items. In reality, what determines the effective bandwidth of the system is the bus utilization, which is the percentage of the time the memory bus is actively reading/writing data. Once this factor is known, it is easy to determine effective bandwidth by multiplying the bus utilization factor by the peak bandwidth. For example, if the bus utilization is 50% (meaning the DRAM bus is reading/writing data at most 50% of the time), and the peak bandwidth is 1G byte per second, the maximum effective bandwidth is 500M bytes per second.
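As a sanity check (not part of the paper's own calculation), multiplying each peak bandwidth by its bus utilization reproduces the bottom row of Table 5 below to within rounding of the quoted utilization percentages:

```python
def effective_bandwidth_mb_s(peak_mb_s, bus_utilization):
    """Effective bandwidth = bus utilization x peak bandwidth."""
    return peak_mb_s * bus_utilization

# Peak bandwidth and utilization values taken from Table 5.
for name, peak, util in [("PC100", 800, 0.62), ("PC133", 1067, 0.59),
                         ("DDR", 2133, 0.42), ("RDRAM", 1600, 0.74), ("FCRAM", 2133, 0.55)]:
    print(f"{name}: ~{effective_bandwidth_mb_s(peak, util):.0f} MB/sec")
```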

TABLE 5: In the table below, we have estimated the maximum effective bandwidth for PC100/133 SDRAMs, DDR, RDRAM and FCRAM.

DRAM Type                         | PC100  | PC133  | DDR    | RDRAM  | FCRAM
----------------------------------|--------|--------|--------|--------|-------
Clock speed (MHz)                 | 100    | 133    | 133    | 400    | 133
Data rate (MHz)                   | 100    | 133    | 266    | 800    | 266
System data bus width             | 64-bit | 64-bit | 64-bit | 16-bit | 64-bit
Peak Bandwidth (MB/sec)           | 800    | 1067   | 2133   | 1600   | 2133
Bus Utilization                   | 62%    | 59%    | 42%    | 74%    | 55%
Max. Effective Bandwidth (MB/sec) | 494    | 631    | 897    | 1190   | 1165


The detailed calculations used to compute the bus utilization are beyond the scope of this paper. All of the above mentioned factors which determine bus utilization and ultimately effective bandwidth were used in these calculations, and the values are based solely on data sheet parameters and timing diagrams (i.e., no marketing hype). Here are some of the methodologies and assumptions worth mentioning:

A write-read-read (W-R-R) access with burst length of 4 was chosen for comparison purposes to represent a page hit. It represents a typical main memory access to perform a cache fill, and also demonstrates the bus turnaround capability of the devices.
After a page miss, a precharge cycle followed by a W-R-R is performed.
The page hit/miss rates are determined solely by the number of banks.
The DRAM refresh rate is 5% (meaning 5% performance loss to perform refresh) and is the same for each DRAM type.
Of course, every application has different access cycles and burst lengths, however in order to perform this analysis, a fixed set of assumptions is necessary. We believe this analysis is fairly representative of the typical computer main memory conditions.

The preceding analysis leads to some interesting observations:

With SDRAM and DDR, the bus utilization decreases as the clock frequency increases. This is due to the fact that dead bus cycles (which are close to being constant for PC100/133/DDR in terms of clock cycles) have a greater impact on performance as data rates increase.

RDRAM and FCRAM can perform nearly gap-less R-W and W-R bursts. In other words, there are almost never any dead bus cycles. In the case of RDRAM, this is because of the separate address and control decoding, as previously mentioned. FCRAM adopts many of the DDR-II features that improve bus efficiency. This gives us a fairly good indication of the improvement in bus utilization we can expect for DDR-II; however, DDR-II will still not match FCRAM, which has lower initial latency.

FCRAM can match RDRAM in terms of effective bandwidth. It is not surprising that RDRAM wins the effective bandwidth battle, due to its high peak bandwidth and an architecture specifically designed for PC main memory. It is somewhat surprising that FCRAM can keep up. For applications with more randomness and shorter burst lengths, FCRAM cannot be matched. Therefore, FCRAM should be recognized as the overall performance winner.

Granularity
Before we can pick the winners for each application, we must discuss the concept of granularity and how it ultimately will determine system cost. Granularity is defined as the minimum system density (in megabytes) that is possible for a given DRAM configuration and system bus width.

TABLE 6: The table below shows the granularity and peak bandwidth for a variety of DRAM types and system implementations.

DRAM Type            | DRAM Density | DRAM Data Bus Width | System Bus Width    | Granularity | Peak Bandwidth
---------------------|--------------|---------------------|---------------------|-------------|---------------
SDRAM (100MHz clock) | 64M bit      | 16 bit              | 64 bit              | 32MB        | 800MB/sec
SDRAM (100MHz clock) | 128M bit     | 16 bit              | 64 bit              | 64MB        | 800MB/sec
SDRAM (100MHz clock) | 256M bit     | 16 bit              | 64 bit              | 128MB       | 800MB/sec
SDRAM (100MHz clock) | 512M bit     | 16 bit              | 64 bit              | 256MB       | 800MB/sec
DDR (133MHz clock)   | 64M bit      | 16 bit              | 64 bit              | 32MB        | 2.13GB/sec
DDR (133MHz clock)   | 128M bit     | 16 bit              | 64 bit              | 64MB        | 2.13GB/sec
DDR (133MHz clock)   | 256M bit     | 16 bit              | 64 bit              | 128MB       | 2.13GB/sec
DDR (133MHz clock)   | 512M bit     | 16 bit              | 64 bit              | 256MB       | 2.13GB/sec
RDRAM (400MHz clock) | 128M bit     | 16 bit              | 16 bit (1 channel)  | 16MB        | 1.6GB/sec
RDRAM (400MHz clock) | 128M bit     | 16 bit              | 32 bit (2 channels) | 32MB        | 3.2GB/sec
RDRAM (400MHz clock) | 128M bit     | 16 bit              | 64 bit (4 channels) | 64MB        | 6.4GB/sec
RDRAM (400MHz clock) | 256M bit     | 16 bit              | 16 bit (1 channel)  | 32MB        | 1.6GB/sec
RDRAM (400MHz clock) | 256M bit     | 16 bit              | 32 bit (2 channels) | 64MB        | 3.2GB/sec
RDRAM (400MHz clock) | 256M bit     | 16 bit              | 64 bit (4 channels) | 128MB       | 6.4GB/sec
RDRAM (400MHz clock) | 512M bit     | 16 bit              | 16 bit (1 channel)  | 64MB        | 1.6GB/sec
RDRAM (400MHz clock) | 512M bit     | 16 bit              | 32 bit (2 channels) | 128MB       | 3.2GB/sec
RDRAM (400MHz clock) | 512M bit     | 16 bit              | 64 bit (4 channels) | 256MB       | 6.4GB/sec


One of the key points in the above table is the difference in system architectures for SDRAMs (including DDR) vs. RDRAMs. SDRAMs must be used in parallel, which increases granularity. In the above example, four 16-bit devices must be connected in parallel to match the 64-bit system bus width. Therefore, the system granularity is four times the granularity of the device. For RDRAM, since the system bus (Rambus channel) width is the same as the device bus width, the granularity of the system is equal to that of the RDRAM multiplied by the number of channels. This has very compelling cost/performance implications.
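A minimal sketch of the granularity arithmetic described above (our own illustration; the device widths and densities are taken from Table 6):

```python
def sdram_granularity_mb(device_mbit, device_width_bits, system_bus_bits):
    """SDRAM/DDR: devices must sit in parallel to fill the system bus, so the minimum
    system size is the device density times (system bus width / device width)."""
    devices_in_parallel = system_bus_bits // device_width_bits
    return (device_mbit // 8) * devices_in_parallel

def rdram_granularity_mb(device_mbit, channels):
    """RDRAM: one device can fill a 16-bit channel, so minimum size scales with channel count."""
    return (device_mbit // 8) * channels

print(sdram_granularity_mb(256, 16, 64))   # 256Mb x16 SDRAM/DDR, 64-bit bus -> 128MB minimum
print(rdram_granularity_mb(256, 1))        # 256Mb RDRAM, 1 channel          -> 32MB minimum
print(rdram_granularity_mb(256, 2))        # 256Mb RDRAM, 2 channels         -> 64MB minimum
```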

In terms of cost, since a single RDRAM can be used, smaller (lower cost) memory systems are possible with RDRAM than with SDRAMs. For example, using 256M (x16) SDRAMs, a system with 64MB is not possible. However, with 256M RDRAMs, 32MB or 64MB systems are possible. This benefit has not been realized today, as the most cost-effective (lowest cost per bit) DRAM density is the 64M, which allows low-cost systems to be built with less than 64MB of memory. However, PC main memory system manufacturers utilize the lowest cost per bit DRAM solution, and that solution will be the 256M density (regardless of SDRAM, DDR or RDRAM) in the not-too-distant future. Within the next few years, 512M and 1Gb DRAMs will be in volume production from Toshiba and possibly other suppliers, making low-density, high-performance, low-cost SDRAM/DDR solutions even less feasible.

One can argue that increasing the DRAM bus width from 16 to 32 bits helps resolve this problem, however x32 DRAMs are more costly to produce and historically have not been used in main memory applications.

On the performance side of the equation, the case for RDRAM is even more compelling. Not only can we build an RDRAM system with less memory than an SDRAM/DDR system, but that same RDRAM system can have significantly better performance. For example, using 256M RDRAMs, a 64MB, 2-channel RDRAM system can be built providing 3.2GB/sec of peak (2380MB/sec effective) bandwidth. This equates to almost five times the effective bandwidth of the SDRAM system and over 2.5 times the effective bandwidth of the DDR system, yet the RDRAM system will be lower cost because 64MB is not possible with SDRAM/DDR at the 256Mb density.


In Table 2, in the DDR vs. RDRAM section, we show peak bandwidth comparisons for a 128-bit system bus for DDR-II, yet that configuration is not included in the above analysis. While increasing the bus width from 64-bit to 128-bit is possible (but not trivial), it is not considered above because it actually makes the granularity problem with SDRAM/DDR worse. However, this may not be an issue in some applications. This last point will be discussed further in the following section.

The Winners in Each Application
Now that we have done detailed performance comparisons and a granularity analysis for each DRAM type, we can make some fairly reasonable conclusions on which DRAM type is appropriate for which application in what timeframe.

TABLE 7: The table below shows this.

Application                  | Timeframe | Ideal DRAM Solution
-----------------------------|-----------|-----------------------
Low-end Desktop PC           | 2000      | PC100/PC133
Low-end Desktop PC           | 2001      | RDRAM
High-end Desktop/Workstation | 2000      | RDRAM
PC Server                    | 2000      | PC100/PC133/DDR/RDRAM
PC Server                    | 2001      | DDR/RDRAM/FCRAM
High-end Server/Mainframe    | 2000      | PC100/PC133/DDR
High-end Server/Mainframe    | 2001      | DDR/FCRAM
Graphics                     | 2000      | SDRAM/DDR
Graphics                     | 2001      | DDR/RDRAM
Network Router/Switch        | 2000      | FCRAM
Hand-held/PDA                | 2000      | FCRAM
Digital TV/Set-top Box       | 2000      | SDRAM
Digital TV/Set-top Box       | 2001      | DDR/RDRAM


The following explains in more detail why the above DRAM types were chosen as the ideal solution for each application.

Low-end Desktop PC
This market is very cost-sensitive and will be best served by the lowest-cost DRAM solution in 2000, which will be PC100 and possibly PC133 if it is offered for no premium and yields for the 2-2-2 version improve. In 2001, the following three factors will drive RDRAM as the ideal solution in this segment.

RDRAM will become lower cost as production volumes increase and DRAM suppliers come down the learning curve.
The 256M DRAM will become the most cost-effective solution, making low-cost SDRAM/DDR implementations less feasible due to the granularity issue previously discussed.
This market segment will also demand performance in 2001.

High-end Desktop/Workstation
The end-users of these systems demand performance for applications such as 3D graphics and office productivity enhancements, and they will pay for performance, making RDRAM the ideal solution. Considering microprocessor and chipset roadmaps, the year 2000 is clearly the year for RDRAM in this segment.

PC Server
We define PC servers as systems with one or more CPUs, generally of the CISC/x86 variety, which use third-party chipsets rather than designing their own memory controllers. This market segment has many solutions in 2000. The primary reason is that in servers, main memory performance is derived more from system design techniques, such as interleaving and large L2 caches, than from DRAM performance. Additionally, there are many chipset options for 2000. Therefore, we expect the year 2000 to include systems using all of these solutions. In 2001, this market segment will settle somewhat and utilize primarily DDR or RDRAM. Since the companies manufacturing PC servers are typically the same companies in the desktop PC business, this market segment will follow the desktop main memory trend, which is towards RDRAM. These systems also do not suffer from the granularity problem seen in desktops, making the evolution of SDRAM, i.e. DDR, a feasible and likely long-term solution.

High-end Server/Mainframe
The story for these large systems is basically the same as for the PC server segment, with one notable exception. Companies who manufacture these systems also design their own memory controllers (ASICs) and memory subsystems. This has a significant impact on the device type chosen. Because these companies possess a relatively large staff of skilled memory controller designers, they do not need a "cookbook" solution such as RDRAM. Additionally, they can design systems with 128-bit and wider main memory busses, hence DDR can match RDRAM in terms of performance, especially for the emerging DDR-II standard with its much-improved feature set and performance capability.

In both of the above market segments, we also show FCRAM as a feasible solution in 2001. From a system perspective, we believe servers and other large systems will become more interested in reducing memory latency as the randomness of the data increases, which will happen as a result of more multimedia (video, audio, text, etc.) traffic occurring over the internet and within the corporation. From a DRAM perspective, FCRAM can be designed in as a superset of DDR, which means it can be easily adopted in these types of systems. Additionally, as 256M and higher density DRAMs become prevalent, the added FCRAM features become negligible in terms of additional die cost, almost guaranteeing FCRAM's adoption into this segment.

Graphics and Digital TV/Set-top Box
The mainstream graphics market tends to follow the main memory DRAM trends, with the exception of using lower-density, wider devices. For example, today the 1Mx16 (16M bit) and 2Mx32 (64M bit) are the most common devices. It should also be noted that because of the small number of DRAMs in graphics systems, the DRAM loading test specification can be reduced, resulting in faster speed versions with the same yield as the PC100/PC133 main memory test specification. For example, Toshiba is currently offering 167MHz 2Mx32 SDRAMs built on the same process as our PC100/PC133 64M/128M-bit SDRAMs. We expect both x32 DDR and RDRAM to emerge as the preferred graphics solutions as these applications become more performance-driven. Digital TV and set-top boxes are an emerging market segment with much the same system criteria and likely DRAM solutions as graphics.

Network Router/Switch
The networking market segment ranges from modems and interface cards up to very large routers and switches that service local and wide-area networks. It is the latter type of application that is of interest for this discussion, as these systems are memory performance driven. Since the data in these types of systems is very random in nature and the data packets are small (short burst length), memory latency is the most critical parameter. Because of this, FCRAM is the ideal solution.

Hand-held/PDA
This segment is differentiated from sub notebooks based on the fact that disposable (non-rechargeable) batteries are most commonly used, hence battery life is critical. Due to the fact that DRAM memory cells are composed of capacitors, which lose their charge over time and must be refreshed, DRAMs in general are not the ideal solution for these applications. However, with the system memory density increasing, DRAMs are already being adopted. As mentioned previously, one of the key design methodologies for the FCRAM that allows the latency to be greatly reduced is the segmented memory core. An additional by-product of this segmentation is a reduction in power consumption of up to 50% compared with SDRAMs of a given process and density. Therefore, FCRAM will emerge as the ideal low-power solution. In terms of system density and DRAM configuration, the hand-held/PDA closely resembles the graphics market, although much less performance driven.

DRAM Design/Process Technology
Hopefully by now, selecting the ideal DRAM solution has become a simpler process for the system designer. The process of actually producing these winning solutions may not be so simple for the DRAM supplier.

In the case of the PC133 SDRAM, we have already mentioned that the 2-2-2 specification is critical for PC133 to show a measurable performance increase over PC100 and hence for PC133's success. At present, the DRAM industry is utilizing primarily 0.20um and wider processes, and yields for PC133 2-2-2 are poor. Therefore, 0.18um and finer process geometries are mandatory.

For DDR, the timing margins are very tight, making fine process geometries also critical to this product's success. Another key point is that the systems that are most likely to use DDR are large systems, which means that 128M, 256M and higher density DRAMs will be required. It is not cost effective to volume produce 256M DRAMs using 0.20um and wider processes.

RDRAM also has very tight timing margins. Additionally, the yield for the 800MHz version is best achieved at 0.18um and finer processes. Reducing the cost through higher density products, such as the 256M, is also critical for RDRAM's success.

The conclusion is fairly clear. The only way for these next generation DRAM solutions to succeed is to build them using very aggressive process geometries. It is not so clear that this is going to be easy for all DRAM suppliers to accomplish. The industry is filled with stories about suppliers having yield problems using these aggressive geometries or not being able to cost-effectively produce a higher density solution. While we cannot speak for other suppliers, Toshiba believes it has overcome these concerns.

Toshiba has implemented a program called Scalability by Design, which not only assures that we can smoothly migrate to finer processes and higher densities, but also at a lower cost than standard in the industry. This program started when Toshiba, together with our design partners IBM and Siemens, introduced a new DRAM trench memory cell. Due to the inherent advantages of the trench cell, it has proven to be much easier to scale (i.e. perform die shrinks) than the alternate stacked cell implementations.

Besides this new trench cell, Toshiba also decided to introduce new wafer process equipment and technologies, starting at 0.35um, which allow us to produce five generations of product, all the way down to 0.15um, in one clean room with minimal investment. For example, at 0.35um we introduced Krypton Fluoride (KrF) steppers, shallow trench isolation (STI) and chemical mechanical polishing (CMP). Our estimate is that the introduction and production ramp of each new generation requires a 10% incremental investment on our part vs. 50% as an industry benchmark.

This cost factor is the most likely reason that some of our competitors continue to keep a portion of their production lines running older processes, while we have already migrated to 0.20um at all of our production facilities. Toshiba is recognized as having the best yield for CL2 PC100/133 SDRAMs today, clearly indicating that implementing finer process geometries across all production lines is critical for success in the high speed DRAM game.

We believe that our technical advantages will become even more apparent as we migrate all of our production facilities to 0.18um starting in Q4 this year and then to 0.15um next year. Con