[ Continued ] A PCI-to-DSP Bridge for High-Performance DSP Systems Barry Field, Spectrum Signal Processing Inc., Canada: icspat.com
account for the demands placed on it by both buses with the goal of maximizing system performance. Most bus interface chip makers meet this goal by simply targeting their design at a specific processor's external bus interface. However, because of the diversity among commercially available DSPs, a PDSP bridge cannot be designed to optimize data transactions for a specific processor. Rather, it must contain features that enable it to meet performance benchmarks regardless of the processor used in the DSP system. Thus, the main objective of the PDSP bridge designer must be to create a bridge capable of operating efficiently within a "generic" DSP system.

The simplified multiprocessor architecture in Figure 1 is representative of a typical DSP card topology, and serves to illustrate one of the most common methods of interfacing to a bridge chip: the shared memory bus. Optimizing data transfers between this bus and the PCI bus should be the main goal of the PDSP bridge designer.

[Figure 1: Multiprocessor Architecture. Blocks: DSP I to DSP IV, Shared Memory, Peripheral I and II, Bus Arbiter, PDSP Bridge. Buses: Shared DSP Bus, PCI Local Bus.]

In order to optimize the performance of the architecture shown in Figure 1, the PDSP bridge must adhere to several essential design principles.

Bus Decoupling

To achieve maximum performance on both the PCI bus and the shared DSP bus, an effective PDSP bridge should have large internal FIFOs. This affords a high degree of decoupling between the buses and absorbs bus latency, so that arbitration delays have minimal impact on data throughput. To further boost performance, separate data pipelines should be included for each data transaction type, as shown in Figure 2. The configuration shown in Figure 2 ensures that data flow is not impeded by arbitration delays within the PDSP bridge itself.
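The benefit of one pipeline per transaction type can be illustrated with a small behavioural model. This is only a sketch: the class name, the pipeline names, and the assumption that all four FIFOs have the same depth are illustrative choices, not details of any actual bridge.

```python
from collections import deque

# Hypothetical names for the four transaction types, one FIFO each.
PIPELINES = ("slave_read", "slave_write", "master_read", "master_write")


class BridgeFifos:
    """Behavioural model: one bounded FIFO per transaction type."""

    def __init__(self, depth):
        self.depth = depth  # capacity of each FIFO in words (assumed equal)
        self.fifos = {name: deque() for name in PIPELINES}

    def push(self, pipeline, word):
        """Producer side: returns False (stall) only if this FIFO is full."""
        fifo = self.fifos[pipeline]
        if len(fifo) >= self.depth:
            return False
        fifo.append(word)
        return True

    def pop(self, pipeline):
        """Consumer side: returns None if this FIFO is empty."""
        fifo = self.fifos[pipeline]
        return fifo.popleft() if fifo else None
```

In this model, filling the `master_write` FIFO stalls only further master writes; a push to `slave_read` still succeeds, which is the decoupling property the separate pipelines are meant to provide.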
[Figure 2: Data Pipelines. Four FIFOs between the DSP bus and the PCI bus: PCI 'Slave Read', PCI 'Master Read', PCI 'Master Write', and PCI 'Slave Write'.]

Determining the correct size for each of the FIFOs shown in Figure 2 requires analysis of the bus behaviours on both sides of the PDSP bridge. Because of the transient nature of the architecture on the DSP bus, it is difficult to gauge accurately what constitutes an adequate buffer to compensate for arbitration delay, transaction overhead, bus collisions, and so on. Additionally, given the high data rates that today's DSPs are capable of sustaining, it is fair to assume that the PCI bus will be the rate-determining factor for data throughput in most DSP systems. From the definition of the PCI protocol, and from numerous benchmarks, it is clear that sustained bus bandwidth increases with burst size, since the fixed overhead of each transaction is spread over more data. It follows that, to maximize PCI efficiency, the FIFOs should be made as large as die size and available gates permit.

Programmability

The widely available DSPs have few commonalities. Most provide interfaces to external asynchronous memory, but the timing requirements and functionality of these interfaces vary so widely among manufacturers that designing to a common protocol is almost impossible. Add to that the necessity of interfacing to the high-speed synchronous memory interfaces on next-generation DSPs, and the hardware designer's task is compounded further. To alleviate the burden of hardware redesign and to maintain performance across DSP architectures, a PDSP bridge should have completely programmable DSP bus interface cycles. Instead of using costly programmable logic to design a custom interface between the bridge chip and the various devices on the shared DSP bus, the designer simply adapts the PDSP bridge cycle, via programmable registers, to meet the requirements of the target peripheral.
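As a rough illustration of adapting a bus cycle through registers, the sketch below converts a peripheral's timing requirements into whole bridge clocks. The three-phase split (setup, active, hold), the function and field names, and the timing figures in the usage note are assumptions made for the example; they do not describe the register map of any particular device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteCycle:
    """Hypothetical programmable write-cycle parameters, in bridge clocks."""
    setup: int   # address and chip select valid before the write strobe
    active: int  # write strobe asserted
    hold: int    # address and data held after the write strobe is released

    @property
    def total_clocks(self):
        return self.setup + self.active + self.hold


def cycle_for(setup_ns, active_ns, hold_ns, clock_ns):
    """Round each timing requirement up to a whole number of bridge clocks.

    All arguments are integers in nanoseconds; clock_ns must be positive.
    """
    def to_clocks(ns):
        return (ns + clock_ns - 1) // clock_ns  # integer ceiling division

    return WriteCycle(to_clocks(setup_ns), to_clocks(active_ns), to_clocks(hold_ns))
```

For an assumed peripheral needing 5 ns setup, 35 ns strobe width, and 10 ns hold on a 20 ns bridge clock, `cycle_for(5, 35, 10, 20)` yields a cycle of 1 + 2 + 1 = 4 clocks. Retargeting a different peripheral then means recomputing these values rather than redesigning glue logic.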
The timing diagram of Figure 3 is a simple illustration of a flexible write cycle with programmable cycle parameters.

[Figure 3: Programmable Write Cycle. Signals: Address, CS# (E#), WR#, Data. Programmable parameters: cycle size, offset, setup, active, inactive, hold.]

If the target peripheral happens to be synchronous memory, then the PDSP bridge must be capable of generating synchronous programmable bus cycles. It is therefore essential that a PDSP bridge have an internal PLL capable of synchronizing to a DSP's synchronous memory interface. Coincident with the ability to customize DSP bus cycles is the ability to apply different cycles (using the same physical signals) to different portions of the DSP bus memory map. A PDSP bridge should be able to partition its memory map into multiple regions that are programmable in both size and location. The result is that the PDSP bridge will be able to transfer data between peripherals with widely varying cycle types using the same address, data, and control buses.

DMA

In order to save valuable MIPS, DSPs come with powerful DMA engines that transfer data between the DSP and external memory. A similar engine, residing on the PDSP bridge, should be used to transfer data between the PCI and DSP buses. Once configured, the PDSP bridge's DMA controller should be able to perform data transfers with little or no intervention from the DSPs or the central resource on the PCI bus. This independence implies support for DMA chaining: the DMA controller should have the ability to auto-configure itself at the end of each block transfer via linked lists residing on the DSP shared memory bus. It also implies an efficient mechanism to notify the DSPs when transfers have been completed. Consequently, interrupts should be coupled with a rich DMA status and control register set to allow the flow of data to be easily managed by the DSP system when necessary.
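The chaining mechanism can be modelled in a few lines. The sketch below is a software analogy only: the descriptor fields, the use of Python lists as stand-ins for the two memories, and the per-descriptor interrupt flag are assumptions chosen for illustration, not the layout used by any specific controller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    """Hypothetical chain descriptor; the field layout is an assumption."""
    src: int                 # start index in the source buffer
    dst: int                 # start index in the destination buffer
    length: int              # number of words to move
    next: Optional[int]      # key of the next descriptor, None ends the chain
    interrupt: bool = False  # raise a completion interrupt after this block


def run_chain(descriptors, head, src_mem, dst_mem):
    """Walk the linked list, copying one block per descriptor.

    Returns the keys of the descriptors that raised a completion interrupt.
    Assumes every block lies within both buffers and the chain is acyclic.
    """
    interrupts = []
    current = head
    while current is not None:
        d = descriptors[current]
        dst_mem[d.dst:d.dst + d.length] = src_mem[d.src:d.src + d.length]
        if d.interrupt:
            interrupts.append(current)
        current = d.next  # the engine reloads itself from the next descriptor
    return interrupts
```

Once the head descriptor is handed over, the loop needs no further input until the chain ends, which mirrors the "little or no intervention" goal. Setting the interrupt flag only on the last descriptor gives a single notification per chain rather than one per block.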
In addition to being able to operate independently of processor control, the PDSP bridge's DMA engine should also be amenable to processor intervention. Mechanisms must be in place that allow DMA transfers to be paused, modified, or cancelled altogether.

Self-Configurability

As stated earlier, ease of integration must be one of the features of a PDSP bridge. One of the drawbacks of an extremely flexible device is that its many programmable features rarely default to the values required by a particular DSP board design. Hence, a PDSP bridge should have the ability to self-configure from read-only memory upon reset. Most PCI bridges provide a serial EPROM interface which loads internal register sets upon reset. This is a minimal requirement for a PDSP bridge. It should also have the ability to configure itself from any byte-wide ROM that may reside on the shared DSP bus, thus obviating the need for a serial EPROM. Finally, the PDSP bridge should be able to accept configuration cycles from the DSPs themselves in the case where board area is extremely limited and there is simply not enough room for ROM ICs.

Universal Signaling

The PCI bus can be upgraded not only in terms of bus width and frequency, but also in terms of signaling levels. The architects of the PCI bus recognized that many signaling environments would eventually shift from 5 V to 3.3 V. Hence, they provided a means by which cards designed for 5 V systems could gracefully migrate to 3.3 V environments: the universal PCI buffer. PCI bridges that contain universal buffers can operate in either 5 V or 3.3 V signaling environments. This is extremely important to DSP board manufacturers that offer their products as modular components (such as PMC modules), not just point solutions. A PDSP bridge designed with universal PCI buffers will ensure that cards used in existing 5 V systems will not become obsolete with the arrival of 3.3 V systems.
Packaging

A common problem among currently available PCI bridge chips is the packaging. Manufacturers of these ICs are trying to pack as many bells and whistles into their chips as possible, overlooking the importance of chip area to board designers. This is especially true in the DSP board business, where maximizing MIPS per unit area provides a competitive edge. Thus a well-designed PDSP bridge will take advantage of the low pin count requirements of PCI and minimize chip area without sacrificing functionality.

More important than chip area is chip height. Most popular form factors, including PCI, severely restrict the height of components on the back side of the board. Examples include VME, most mezzanine cards (such as PMC and sPCI), and CompactPCI. This restriction means that there is a premium on component-side board space. A PDSP bridge should be in a low-profile TQFP package capable of being placed on the back side of high-density DSP boards, freeing up valuable component-side board space.

Conclusion

As the demands of real-time signal processing steadily rise, so too does the need for faster, more efficient methods of data transfer. The high bandwidth, interoperability, and flexibility of the PCI bus have made it an attractive data-movement alternative for DSP system designers. As such, a PCI-to-DSP bridge that is designed with the specific needs of DSP architectures in mind will facilitate the incorporation of PCI into a wide range of DSP hardware products. Recognizing that such a bridge does not exist, Spectrum Signal Processing has developed a PCI bridge optimized for DSP architectures. During the development process, special consideration was given to meeting the criteria outlined in this paper in order to provide an optimal PCI-to-DSP bridging solution.