Anyone interested in taking a trip to Portland, OR?
Friday, March 20, 1998, 2:00-3:30 pm
Main Seminar Room, Administration Building, OGI Campus
Hardware, Software and Parallel Algorithms for Speech Recognition
Nelson Morgan, ICSI and UC Berkeley
ABSTRACT:
Commercial speech recognition has reached a level of performance that can be successfully used for many tasks. In addition, commercial workstations and PCs appear to be sufficient to support the algorithms used in these products. However, for many purposes, speech recognition is insufficiently robust to unanticipated variability in speaking style and environmental acoustics. Some of the classes of algorithms that have been proposed to alleviate these difficulties may require orders of magnitude more computational capability. Fortunately, it is anticipated that some approximation to Moore's Law will continue to be valid for years to come, and so the capabilities should be available in the not-so-distant future.
Given this perspective, two primary questions arise: How will the anticipated hardware affect the algorithms, and vice versa? What are our current best guesses at how we should implement each? In this talk Nelson will discuss trends in approaches to robust speech recognition, computational hardware, and how they relate. As examples, he will discuss two potentially related developments: the Intelligent RAM (IRAM) project at Berkeley, and the multi-stream processing approach under joint development at OGI and ICSI. These developments will be discussed in the context of previous hardware projects at ICSI that have been used for the training of neural networks for speech probability estimation. The most recent such project yielded a system that has trained a neural network with 3.2 million free parameters for the recognition of conversational speech, using about 1e15 arithmetic operations.
BIOGRAPHY: Nelson Morgan is the leader of the Realization Group at the International Computer Science Institute, and an Adjunct Professor at the University of California at Berkeley. He currently advises 10 Berkeley Ph.D. students and is the co-editor-in-chief of Speech Communication. His interests have typically focused on signal processing and pattern recognition, particularly for automatic speech recognition, but also for biomedical applications. He is also interested in hardware to permit the use of computationally expensive algorithms. (See also icsi.berkeley.edu)
======================================================================
See our website at: ece.ogi.edu for seminar schedule and map to OGI