Technology Stocks : Spectrum Signal Processing (SSPI)

To: pat mudge who wrote (440)   1/4/1998 8:41:00 PM
From: Galirayo
 
Efficient Resource Management and MIPS Breakdown of a V.34 Modem Software Implementation
Helen Dorbolo, Spectrum Signal Processing Inc., Canada:
icspat.com

Pat, All of these files have Cool Diagrams that I can't copy over to SI.
You may want to get copies from SSPIF.

Ray

Introduction
Utilization of digital signal processors to implement fax and modem datapumps in software has been an instrumental development in reducing cost and providing a flexible platform for future upgrades.
To remain competitive in the modem market, designers must implement high performance communication algorithms in a cost effective manner. The accommodation of the V.34 standard (and V.34+ after it) necessitates obtaining the most out of conventional DSPs. To this end, designers must aim at reducing the MIPS usage and the overall memory requirements of the system.
Because of the limited amount of internal RAM on most low-cost DSP chips, and the increasing complexity of modem algorithms, it is becoming difficult to fit all the code for intensive real-time applications within the internal memory of the DSP itself. Also, in the interest of code maintenance, upgradability, and shortening the design cycle, it is more effective to have parts of the code in a high-level language. However, the resulting assembly code (using a cross-compiler) would not be ideally optimized, and therefore may not fit in internal memory. To run this code from external memory would require expensive external RAM (with low access time) in order to keep the MIPS usage low.
An alternative cost effective solution is to use the DMA capabilities of the DSP to load segments of the code into internal RAM before they are required to be executed. In this way it is possible to store part of the code on [low speed] external RAM or ROM.
This paper presents the implementation of a V.34 modem algorithm organized to accommodate efficient use of memory overlay. The modem has fax capability and also implements the older standards to retain compatibility with older modems, but we will discuss only the V.34 case since it is the bottleneck with respect to MIPS consumption and code size.
Hardware Overview
The block diagram of the hardware is shown in Figure 1. The DSP itself includes instruction and data RAM, a UART core, voice and telephony CODEC interfaces, PCMCIA and ISA interfaces, and DMA controllers. The external memory comprises 128k x 16 ROM and 32k x 16 SRAM. The core is a 16 bit, fixed-point DSP, running at 50 MHz, with a full Harvard architecture.
Of special interest to the DSP programmer is the instruction DMA controller on the DSP. It is used to copy blocks of instruction code from external into internal RAM. Until the copy has been completed the instructions will execute from external RAM. Once the copy has been completed, any instruction fetches within the address range of the given block will be executed from internal RAM instead of external. This process effectively converts the copied code block from external multiple wait state memory to internal zero wait state memory.
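The fetch redirection described above can be modelled in a few lines of C. This is only an illustrative simulation, not the DSP's actual hardware interface: the type imem_t, the functions imem_init, dma_start, dma_step, dma_done, dma_wait and fetch, and the memory sizes are all hypothetical names and values chosen for the sketch. A blocking dma_wait helper is included because the core may have to stall until a copy finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXT_WORDS 1024u  /* hypothetical external instruction memory size */
#define INT_WORDS 256u   /* hypothetical internal instruction RAM size */

typedef struct {
    uint32_t ext[EXT_WORDS];   /* slow external memory, one 32 bit instruction per entry */
    uint32_t iram[INT_WORDS];  /* zero wait state internal RAM */
    uint32_t base;             /* external address of the block being copied */
    uint32_t len;              /* block length in instructions */
    uint32_t copied;           /* instructions copied so far */
    uint32_t ext_fetches;      /* count of slow external fetches */
} imem_t;

/* Clear the model and fill external memory with a recognisable pattern. */
void imem_init(imem_t *m)
{
    memset(m, 0, sizeof *m);
    for (uint32_t i = 0; i < EXT_WORDS; i++)
        m->ext[i] = 0xC0DE0000u + i;
}

/* Begin copying len instructions starting at external address base. */
void dma_start(imem_t *m, uint32_t base, uint32_t len)
{
    assert(len <= INT_WORDS && base <= EXT_WORDS - len);
    m->base = base;
    m->len = len;
    m->copied = 0;
}

/* Copy one instruction; stands in for one DMA bus cycle. */
void dma_step(imem_t *m)
{
    if (m->copied < m->len) {
        m->iram[m->copied] = m->ext[m->base + m->copied];
        m->copied++;
    }
}

bool dma_done(const imem_t *m)
{
    return m->copied == m->len;
}

/* Stall until the transfer is complete. */
void dma_wait(imem_t *m)
{
    while (!dma_done(m))
        dma_step(m);
}

/* Instruction fetch: served from internal RAM only once the whole
   block has been copied and the address lies inside the block. */
uint32_t fetch(imem_t *m, uint32_t addr)
{
    assert(addr < EXT_WORDS);
    if (dma_done(m) && addr >= m->base && addr - m->base < m->len)
        return m->iram[addr - m->base];
    m->ext_fetches++;
    return m->ext[addr];
}
```

In this model a fetch made between dma_start and the end of the copy is counted as an external access, while the same address fetched after dma_wait returns is served from internal RAM.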
Although the DMA controller operates transparently to the software, if the DSP core and the DMA controller are accessing external memory simultaneously both processes will be slowed down, possibly resulting in a MIPS overflow. In this case it becomes necessary to make the DSP wait until the DMA transfer is complete.
Helen Dorbolo Spectrum Signal Processing Inc. Burnaby, British Columbia, Canada email: helen_dorbolo@spectrumsignal.com

Figure 1. DSP Architecture (block diagram: DSP core, IRAM, cache, data RAM, 2-port RAM, DMA controller, config registers, 16550 UART logic, two CODEC interfaces for the telephony and voice CODECs, PCMCIA and ISA interfaces to the host bus, and an external memory interface to 32k x 16 SRAM and 128k x 16 ROM)
Software Architecture
To reduce system costs, slow external ROM devices were chosen to store all of the modem/fax code. The complication that arises is that the DSP requires excessive wait states to access 16 bit data from the ROM. In addition, the DSP core instructions are 32 bits wide, and approximately three instructions are required to piece together the two 16 bit sections. Therefore the total access time per instruction from ROM is even greater than for data. This latency makes executing code from the ROM impossible for all but the most non-time-critical modules.
The bulk of the datapump code is written in assembly language. For the purposes of the algorithm flow the datapump code is divided into two distinct segments: Analog (V.34 core), and Digital.
The analog code block performs the following functions on the transmit side: modulation, near and far end echo estimation, non-linear encoding, and adaptive pre-emphasis. It performs the following functions on the receive side: demodulation, equalization, timing recovery routines, retrain and rate re-negotiation detection. Servicing the interrupt routine is considered part of the analog datapump. The interrupt routine reads from and writes data to the codec, removes the estimated echo, and performs the Hilbert transform on the received data. It also estimates the echo canceller coefficients during handshaking.
The digital code block performs the following functions on the transmit side: scrambling, trellis encoding, shell mapping, framing, constellation shaping, and pre-coding. It performs the following functions on the receive side: descrambling, Viterbi decoding, de-shell mapping, and inverse pre-coding. For an in-depth discussion of the details of the V.34 algorithm the reader is referred to [1].
The datapump MIPS usage was measured during a 4800 b/s connection at 3200 baud. The breakdown is as follows:
Code Block                   MIPS Usage
Digital Receive                 11.7
Digital Transmit                 6.5
Analog Receive                   5.6
Analog Transmit                  4.3
Interrupt Service Routine        1.6
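As a quick check on the figures above, the five entries add up to 29.7 MIPS. The sketch below does that sum in C using integer tenths of a MIPS to avoid floating-point rounding. The names block_cost_t, DATAPUMP, total_mips_x10 and headroom_x10 are hypothetical. The 50 MIPS peak used in the usage note is an assumption (one instruction per cycle at the 50 MHz clock), not a figure stated in the paper.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the breakdown; MIPS stored in tenths (11.7 MIPS -> 117). */
typedef struct {
    const char *name;
    int mips_x10;
} block_cost_t;

/* Measured datapump figures from the table above. */
const block_cost_t DATAPUMP[] = {
    { "Digital Receive",           117 },
    { "Digital Transmit",           65 },
    { "Analog Receive",             56 },
    { "Analog Transmit",            43 },
    { "Interrupt Service Routine",  16 },
};

#define DATAPUMP_COUNT (sizeof DATAPUMP / sizeof DATAPUMP[0])

/* Sum the cost of n code blocks, in tenths of a MIPS. */
int total_mips_x10(const block_cost_t *blocks, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += blocks[i].mips_x10;
    return total;
}

/* Budget left for everything else, given an assumed peak rate. */
int headroom_x10(int peak_x10, int used_x10)
{
    assert(peak_x10 >= used_x10);
    return peak_x10 - used_x10;
}
```

Calling total_mips_x10(DATAPUMP, DATAPUMP_COUNT) gives 297, that is 29.7 MIPS. Under the assumed 50 MIPS peak, headroom_x10(500, 297) gives 203, so roughly 20.3 MIPS would remain for the protocol code and any wait-state or DMA overhead.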
The modem uses the V.42bis and MNP5¹ protocols for data compression; for error correction the V.42 and MNP2-4 standards are used. The bulk of this code is written in C and cross-compiled to run on the DSP.
As mentioned earlier, running the code from external ROM is not feasible because of excessive wait states. Because the internal zero wait state RAM is far smaller than the ROM, blocks of code must be transferred to internal RAM when they need to be executed.
To make efficient use of the internal instruction memory the code must be structured into fixed-size blocks of related functions. The components of each block are determined beforehand by carefully studying the different states of the algorithm. Each block is given a unique block number. The DMA controller will transfer in one block per call.
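One way to picture the block scheme is a small table that records which numbered block currently occupies each internal RAM slot. The following C sketch is an assumption-laden model rather than the paper's code: overlay_t, overlay_init, overlay_load, overlay_find and all of the sizes are hypothetical, and a memcpy stands in for the DMA transfer of one block per call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 8u    /* hypothetical fixed block size, in instructions */
#define ROM_BLOCKS  6u    /* hypothetical number of blocks stored in ROM */
#define IRAM_SLOTS  3u    /* hypothetical number of block slots in internal RAM */
#define NO_BLOCK    0xFFu /* marks an empty slot */

typedef struct {
    uint32_t rom[ROM_BLOCKS][BLOCK_WORDS];   /* all code, in slow ROM */
    uint32_t iram[IRAM_SLOTS][BLOCK_WORDS];  /* fast internal RAM slots */
    uint8_t  resident[IRAM_SLOTS];           /* block number held by each slot */
} overlay_t;

/* Fill ROM with a pattern (block * 100 + offset) and empty all slots. */
void overlay_init(overlay_t *o)
{
    memset(o->iram, 0, sizeof o->iram);
    for (uint32_t b = 0; b < ROM_BLOCKS; b++)
        for (uint32_t w = 0; w < BLOCK_WORDS; w++)
            o->rom[b][w] = b * 100u + w;
    for (uint32_t s = 0; s < IRAM_SLOTS; s++)
        o->resident[s] = NO_BLOCK;
}

/* Transfer exactly one block into one slot (memcpy models the DMA). */
void overlay_load(overlay_t *o, unsigned slot, uint8_t block)
{
    assert(slot < IRAM_SLOTS);
    assert(block < ROM_BLOCKS);
    memcpy(o->iram[slot], o->rom[block], sizeof o->iram[slot]);
    o->resident[slot] = block;
}

/* Return the slot holding the given block, or -1 if it is not resident. */
int overlay_find(const overlay_t *o, uint8_t block)
{
    for (unsigned s = 0; s < IRAM_SLOTS; s++)
        if (o->resident[s] == block)
            return (int)s;
    return -1;
}
```

Keeping the blocks a fixed size means any block can be placed in any slot, so a state change only needs a list of block numbers to load.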
The algorithm is structured into unique states where each state of the modem is characterized by the blocks of code which must reside in in-
1 MNP is a trademark of Microcom
datapump handshake has been completed if either modem detects a significant change in the line conditions. The response time must be within tens of milliseconds.
The difficulty arises because these states require the handshaking code to run from internal memory. Therefore the handshaking code must be loaded in, overwriting the data compression/error correction code. After the retrain or rate re-negotiation is complete, the data compression/error correction code must be reloaded. The procedure will be explained in more detail below.
V.34 Retrain:
Before starting the retrain, the block numbers of the last three blocks of internal RAM are saved in a buffer. The retrain preamble includes a silence period of 75±5 ms. During this time the handshaking code (for the V.34 Initialization state) is loaded into internal memory.
After the retrain is complete, just as at the end of handshaking, code for the last three blocks of internal RAM must be loaded according to the block numbers that were saved. The problem in this situation is that the datapump will start to make calls to the error correction/data compression code immediately after the retrain is complete. This could cause a MIPS overflow, since this code is still in external ROM and must first be transferred into internal memory. To avoid this situation, after completion of the retrain the function pointers for the error correction/data compression code are set to alias functions which do nothing. After the DMA is complete the function pointers are restored; this process is transparent to the datapump. The data packets sent and received during the DMA time will have to be resent by both sides, which lowers the throughput; however, this is not significant compared with the total loss of throughput resulting from the retrain itself.
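The pointer swap can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: modem_t, proto_fn, ec_process, ec_noop, retrain_begin, retrain_end, reload_complete and datapump_deliver are all hypothetical names, and ec_process merely returns the number of bytes it was given so that the effect of the swap is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SAVED_BLOCKS 3  /* the last three internal RAM blocks are saved */

/* Signature shared by the real protocol entry point and its alias. */
typedef int (*proto_fn)(const unsigned char *buf, size_t len);

/* Stand-in for the error correction/data compression code, which is
   only safe to call while it is resident in internal RAM. */
int ec_process(const unsigned char *buf, size_t len)
{
    (void)buf;
    return (int)len;  /* pretend every byte was consumed */
}

/* Alias that does nothing while the real code is being reloaded. */
int ec_noop(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

typedef struct {
    proto_fn ec;                           /* pointer the datapump calls */
    unsigned char resident[SAVED_BLOCKS];  /* blocks now in the last slots */
    unsigned char saved[SAVED_BLOCKS];     /* block numbers saved at retrain */
    bool reload_pending;
} modem_t;

/* Save the resident block numbers, then bring in the handshake blocks. */
void retrain_begin(modem_t *m, const unsigned char handshake[SAVED_BLOCKS])
{
    memcpy(m->saved, m->resident, sizeof m->saved);
    memcpy(m->resident, handshake, sizeof m->resident);
}

/* Retrain finished: point at the alias and start reloading. */
void retrain_end(modem_t *m)
{
    m->ec = ec_noop;
    m->reload_pending = true;
}

/* DMA finished: restore block numbers and the real function pointer. */
void reload_complete(modem_t *m)
{
    memcpy(m->resident, m->saved, sizeof m->resident);
    m->ec = ec_process;
    m->reload_pending = false;
}

/* The datapump always calls through the pointer and never needs to
   know which function it currently targets. */
int datapump_deliver(const modem_t *m, const unsigned char *buf, size_t len)
{
    return m->ec(buf, len);
}
```

Because the datapump only ever calls through the pointer, data delivered while the reload is pending is silently dropped by the alias, which corresponds to the packets that both sides must resend.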
V.34 Rate Re-negotiation
This state is similar to the retrain state above, except that instead of starting from the first handshaking state, the modem will enter the last handshaking state, V.34 QAM Handshake.
Summary
In this paper we have discussed an implementation of the V.34 modem which reduces the amount of internal RAM required on the DSP. This is done by using low speed external ROM to store the fax/modem code, and taking advantage of the DMA capability of the DSP to load blocks of code into internal memory as they are required by the algorithm. This requires a thorough understanding of the modem algorithm flow and careful planning for the modem state transitions.
Another advantage of this implementation is that it reduces code optimization time, since only the most resource intensive state (in this case V.34 + V.42 + V.42bis) needs to be optimized to fit in the internal RAM. The other cases (which include older modem standards) do not have to be optimized as much because, while they are not in use, they do not take up valuable internal RAM.
References
[1] ITU-T Recommendation V.34, September 1994.