Silicon Investor (SI) -- The First Internet Community

Technology Stocks : Spectrum Signal Processing (SSPI)

To: Galirayo who wrote (441)	1/4/1998 8:35:00 PM
From: Galirayo

Sorry, there were more pages. I forgot them. Continuation of .....

Multi-Channel Data Communications on Fixed-Point DSP Architectures
Stuart Harker, Arda Erol, Spectrum Signal Processing Inc., Canada: icspat.com

[Figure 1 - Multi-Channel TDM Data Flow. Blocks shown: T1 Line Interface Card(s), Other SCbus Devices, SCbus TDM Data Lines [0-15], Switch Fabric IC, and four TI TMS320C6x DSPs on local TDM streams.]

The SCSA Hardware Model defines the specifications for the SCbus. The SCbus is a time/space domain multiplexed bus comprising frames of octets transmitted over 16 serial lines. Each octet is referred to as a time slice and is synchronized to a frame sync signal. Bit transmission is synchronized to a system clock which is generated by a system clock manager. Control signals and inter-card communications can be performed using a separate 17th serial line referred to as the message channel (MC) bus.

The bus can transmit and receive data at 3 different clock rates. As the frame sync period is constant at 125 µs, higher bus speeds translate into more time slices per frame. Currently the SCbus can handle N = 32, 64, or 128 time slices per frame. An SCbus frame is given in Figure 2.

[Figure 2 - SCbus Frame Structure. One frame = 125 µs, carrying channel 0 through channel N-1 along the time axis; each channel is an 8-bit data packet.]

The SCbus is connected to the DSP system by a Switch Fabric Controller such as the SC2000 or SC4000. The SC4000 provides time slot interchange between multiple serial TDM streams on the local bus (DSP system) and an expansion bus (SCbus, MVIP, etc.). On the local bus side, the SC4000 accepts and outputs data from four TDM serial data streams at 2.048 Mbps, two streams at 4.096 Mbps, or one stream at 8.192 Mbps. On the expansion bus side, it outputs and accepts data from sixteen serial TDM data lines operating at either 2.048, 4.096, or 8.192 Mbps.

The SCSA Hardware Model is under software control via the SCSA TAO Frameworks.
This is a software API model that allows hardware and platform independence for CT applications. Host interaction with the DSP system would be through the API and SPI's device drivers for that device.

TMS320C6xx Overview

In 1997, Texas Instruments introduced the TMS320C6201 based on their VelociTI architecture. This fixed-point processor, with a peak performance of 1600 MIPS at 200 MHz, is well suited to multi-channel data communication systems. What gives the C6x its performance is a Very Long Instruction Word (VLIW) core with eight functional units. This core, with two multipliers and six arithmetic logic units, can execute up to eight RISC-like instructions per cycle when the pipeline is full.

The C6201 contains 1 Mbit of on-chip memory. This internal memory is equally split between program memory and data memory. The program memory is equivalent to 32k of 32-bit instruction memory, as used on other TI processors, and can be enabled as a cache. This 32k instruction memory size is somewhat misleading, as benchmark tests (footnote 3) show that code size on the C6x is approximately 5-6 times as large as on a C54xx. This is due to the RISC-like instruction set. For example, the C6x does not have the traditional DSP MAC instruction. Instead the MAC instruction becomes a multiply followed by an add. Even getting the coefficient from memory requires an operation. This alone increases the code size by a factor of three. These figures lead to "traditional" program memory capacities of around 6-7k of 32-bit instructions.

The data memory is 128 kbytes in size, and can be accessed as bytes, 16-bit half-words or 32-bit words. This memory space is also divided into four 16k banks, each with a 16-bit bus to the execution units.

Critical to the multi-channel communications system are the 2 serial ports on the C6x. These two ports have support for multi-channel TDM, and can enable as many as 32 channels per frame.
On-the-fly switching of channels is supported, but only in groups of 16 channels.

[Footnote 3: radix-2, 256-point FFT. See reference [2].]

The combination of the SCSA and TMS320C6x architectures covers only the hardware considerations for multi-channel data communications. The cornerstone of a functional system is the Real-time Operating System (RTOS) that controls data and program flow between different modem sessions.

RTOS Considerations

The handling of multi-channel data and program context switching need not be supplied by a formal RTOS. Size and/or overhead restrictions may lead designers to consider an application-specific RTOS to meet the exact needs of the environment. The requirements that must be considered by the RTOS can be broken down into four main categories: Spawning; Task Switching; Synchronization; and MIPS and Memory Management. Figure 3 shows a sample modem data flow with generalized modem task routines and data handling.

[Figure 3 - Generalized Multi-Channel Modem Data and Task Flow. For each modem n (1 or 2) the figure shows Task na host_handler() and Task nb data_pump(), host Tx and Rx buffers, TDM Tx and Rx buffers, and the routines tx_host(n), rx_host(n), tx_stream(n), rx_stream(n), tx_tdm(n), rx_tdm(n), tx_channel(n), and rx_channel(n). Also shown: tdm_isr(), host_isr(n), the TDM Serial Port, Host n, and the control functions spawn(), add_channel(), and add_stream().]

Spawning

In the above multi-tasking system example, spawn(), add_stream(), and add_channel() are executed when requested by the system controller (external to this system). The function spawn() deals with the creation of a new modem task. The RTOS must deal with creating and initializing the required data spaces and task information. Since data memory configurations for data communications applications are, by and large, static, the task information could be allocated at compile time.
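As a rough illustration of compile-time allocation, the sketch below reserves a fixed pool of task records and has a spawn routine claim a free slot. It is not taken from the paper: modem_spawn() stands in for spawn(), and MAX_MODEM_TASKS, IO_BUF_WORDS, modem_task_t, and modem_release() are hypothetical names and sizes chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODEM_TASKS 4   /* assumed compile-time maximum, illustrative */
#define IO_BUF_WORDS    64  /* assumed buffer size, illustrative */

/* Hypothetical per-task record; field names are not from the paper. */
typedef struct {
    int   in_use;
    int   task_id;
    short tx_buf[IO_BUF_WORDS];
    short rx_buf[IO_BUF_WORDS];
} modem_task_t;

/* All task storage is reserved at compile time, so no heap is needed. */
static modem_task_t task_pool[MAX_MODEM_TASKS];

/* Claim and initialize a free slot; returns NULL when the pool is full. */
modem_task_t *modem_spawn(void)
{
    int i;
    for (i = 0; i < MAX_MODEM_TASKS; i++) {
        if (!task_pool[i].in_use) {
            memset(&task_pool[i], 0, sizeof task_pool[i]);
            task_pool[i].in_use  = 1;
            task_pool[i].task_id = i;
            return &task_pool[i];
        }
    }
    return NULL;
}

/* Release a slot so that a later modem_spawn() can reuse it. */
void modem_release(modem_task_t *t)
{
    assert(t != NULL && t->in_use);
    t->in_use = 0;
}
```

With this layout a spawn request fails cleanly once the maximum number of modem tasks is running, which is the case the system controller would have to handle.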
In fact, all required data resources for all instances could be reserved at compile time, given the space. This would have no effect on system performance, and would not require RTOS utilities such as memory allocation. This approach assumes that the maximum number of modem tasks that can be run on the system is known, and would reduce the complexity of the spawning process and the request messaging to the system controller. If the number of modems running is less than the maximum, then another can be initialized.

Another approach is to allocate all resources dynamically. This would include all RTOS-related structures and I/O buffers. This approach would allow for an optimal number of modem tasks to be spawned (based on current task MIPS values), but would require a number of different heaps to be present in the system. Both aligned and unaligned heaps would be required in internal and external memory.

The function add_channel() would set up the correct channel streaming on the TDM (Time Division Multiplexing) serial port. The TDM channel handling tables for the tdm_isr() would also be initialized. Like spawn(), it could allocate heap memory or rely on static variables.

The function add_stream() allocates and initializes the required host interfaces and buffers. Again, like spawn(), add_stream() could allocate heap memory or rely on static variables.

Task Switching

Task switching in this environment requires all the usual elements of a standard multi-tasking RTOS: task switch calls, task sleep calls, and process control blocks. The main elements of task switching that we are concerned with here are code re-entrance and load balancing.

Re-entrance relates to static and global variables used by the function. If all global and static variables used by a function (or application) are addressed as fixed offsets from a task-relative base pointer, then the code will be re-entrant. The task variable pointer will simply be defined in the process control block for the task.
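A minimal sketch of this re-entrancy scheme follows, assuming the per-task "globals" are gathered into one structure reached through a pointer held in the process control block. The names task_vars_t, pcb_t, and pump_sample() are hypothetical and only illustrate the idea; the paper does not give a layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block of per-task variables; each field sits at a
 * fixed offset from the task variable pointer. */
typedef struct {
    long accumulator;
    int  samples_seen;
} task_vars_t;

/* Minimal process control block holding the task variable pointer. */
typedef struct {
    int          task_id;
    task_vars_t *vars;
} pcb_t;

/* Re-entrant routine: it touches no file-scope state, only fields
 * reached through the task variable pointer stored in the PCB. */
long pump_sample(pcb_t *pcb, short sample)
{
    assert(pcb != NULL && pcb->vars != NULL);
    pcb->vars->accumulator += sample;
    pcb->vars->samples_seen++;
    return pcb->vars->accumulator;
}
```

Because the same code can be entered on behalf of any task, a context switch only has to change which process control block is current; calls made for one task leave the variables of every other task untouched.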
The function tdm_isr() is not required to be re-entrant. Since all TDM data is sent and retrieved in a single frame, all of the I/O can be handled at once, in a task-independent routine. A TDM resource list would be used to cycle through all the task channels. This differs from the host I/O processing, which requires host_isr() to be re-entrant and task dependent. Therefore not all routines are required to be re-entrant.

Synchronization

The synchronization of data and task resources is determined by interrupt handling and task load balancing. Due to buffering, only the ISRs need to be synchronized to the data flow. Interrupts from the TDM serial port are both synchronous and task independent. This means that no task I/O gets preference in the system, and data synchronization is uniform across tasks. The host interrupt handling is performed as separate interrupt sources for each task, all of which are asynchronous.

To achieve synchronization and isolation of the data streams, careful consideration must be given to the assigned interrupt priority levels and the associated buffer levels. The TDM interrupt is of a higher priority than the host interrupts.

Load balancing within the multi-tasking environment can be handled by basing the context switching on a cooperative multi-tasking model, a pre-emptive model, or a combination of the two. Given that communication tasks only need to process enough data to keep a connection open, a pre-emptive system with cooperation between distinct tasks would allow tasks to "trade MIPS" when required, but still offer protection from renegade or infinite-loop tasks that would monopolize the CPU. Each data pump would process a fixed amount of data, then yield to other tasks.

MIPS and Memory Management

As modem modulation schemes continue to grow in speed and complexity, designers require more 0-wait state memory to achieve their MIPS targets.
This becomes an even greater problem for multi-channel implementations, when different channels are operating under different modulation configurations and hence have different MIPS and memory requirements. One proposed solution is to configure critical program routines into segments and DMA the blocks into internal memory when required. This leads to the program having a number of internal memory states. Different blocks would be required at various locations in the program flow and would need to be DMA'ed into internal instruction RAM. Problems with this approach have to do with task switching. The current memory state would have to be saved during context switching, and new states loaded for the next task.

Configuring the internal program memory as a cache is a simpler method for reducing MIPS in critical code sections, but it is highly dependent on code structure, and it is an all-or-none proposition, meaning that MIPS-critical RTOS operations such as task switchers, ISR routines, etc. would have varying MIPS performance due to cache hits or misses.

Optimization of the internal data space will lead to fewer pipeline stalls due to external data accesses. Data memory usage can be reduced by overlaying global and static variables for mutually exclusive applications. Fax modem data pumps, for example, will never run concurrently with V.42/V.42bis.

Careful distribution of data can lead to MIPS savings when accessing the internal data space. The C6x divides the data memory into four 16-kbyte blocks, each with its own data bus. The execution units can fetch two 16-bit data words in one cycle, as long as the data does not reside in the same bank. An example of this is a data buffer and filter coefficients in different 16k banks.

Conclusions

Some of the RTOS considerations and structures required for an implementation of a multi-channel communications system were explored.
An example RTOS environment was discussed to consider issues such as data memory partitioning, context switching, program memory DMAs, and MIPS usage.

From this investigation, it appears that some considerations, such as dynamic memory allocation and pre-emptive multi-tasking, are more suited to multi-channel applications. Other RTOS components will require much more investigation and analysis when considered in a real environment. These issues and all of the components of the multi-channel communications system will ultimately determine the number of communication application tasks that can run on the TMS320C62xx, and the efficiency of these algorithms.

References

[1] TMS320C62xx Technical Brief, Texas Instruments, 1997
[2] Jim Turley, Harri Hakkarainen, "TI's New 'C6x DSP Screams at 1,600 MIPS", Microprocessor Report, pp. 14-17, February 17, 1997
[3] Mansoor A. Chishtie, Jay B. Reimer, "Code Reentrancy and Efficient Memory Management on TI DSPs", ICSPAT'96, pp. 796-800, Boston USA, 1996
[4] "Anatomy of a True CT Server", Computer Telephony, pp. 78-98, February 1997
[5] SCSA Hardware Model, Version 3.0, Dialogic Corporation, 1994
[6] Bob Frankel, "DSP/BIOS Technical Overview", dspbios.com ew_main.html, 1997
[7] Helen Dorbolo, "Efficient Resource Management and MIPS Breakdown of a V.34 Modem Software Implementation", ICSPAT'97, San Diego USA, 1997

To: Galirayo who wrote (441)	1/4/1998 10:10:00 PM
From: pat mudge

[Spectrum papers. . .] Ray -- Thanks for posting all the papers. Now, will you please translate???? Just kidding. I should have known they'd be this technical. Anyway, I hope someone can understand the importance of what's being said. I do notice one says, "High-performance DSP systems require a PCI-to-DSP bridge designed to accommodate the unique requirements of digital signal processors and the algorithms that they implement," and I believe that's what Spectrum's Hurricane chip does. My left brain is getting an unusual work-out tonight. :) Cheers! pat