Now you can be on speaking terms with your computer
Wednesday, July 14, 1999
By Bob Weinstein, Globe Correspondent
Not too long ago the only places you would find talking computers were in sci-fi novels and films. Well, the future is here and speech recognition software (computer programs that understand spoken language) has progressed from the drawing board to a cutting-edge product available to a marketplace that is hungry to gobble up the latest technological toys.
Bill Gates, the founder of Microsoft, recently said, "Speech is not just the future of Windows, but the future of computing itself."
Others before him have recognized the significance of speech in the technological future. The first speech synthesizer was developed in the 1920s. Voice recognition software, which enables you to talk into the computer so that it flawlessly types what you say, has been around since the 1950s. Back in 1952, Bell Laboratories (now part of Lucent Technologies) created the first crude speech-recognition system.
Scientists have been trying to teach computers to understand human speech for the last few decades. It's been an uphill trek. The technology's major players include IBM, Dragon Systems Inc., Lernout & Hauspie Speech Products, Nuance Communications, AlTech, Voice Control Systems (VCS) and fonix Corporation. Companies such as these have taken speech recognition software to a new plateau.
Although the technology is still not at the point where computers are capable of responding to our verbal commands, it's getting closer every day, according to Caroline Henton, vice president of technology at fonix Corporation in Salt Lake City.
"Speech will enable people to do computing in smaller and smaller packages because you don't have to worry about keyboard size," says Roger Matus, Dragon Systems' vice president of marketing. "Computers are getting smaller and so are keyboards. The question is: How can you get full power computing in something that is small and light? The answer is speech is going to make that happen."
Henton offers a quick summation of the complex voice recognition field.
"Speech technology provides the user with the opportunity to interact with computers and other machines such as cellular phones and PDAs (personal digital assistants) in the most natural form via the human voice," she says. "It involves three components: speech recognition, speech synthesis and, to some extent, speech compression."
But the Holy Grail is speech recognition, according to Henton, which means real time recognition of a human voice by a computer. That is, you speak to a computer in plain English and it understands what you're saying and responds.
"Don't think that anyone speaking any sort of dialect can walk up to a PC and expect words to appear on the screen," she says. "This is still beyond our reach."
In February, Business Week published a special report, "Let's Talk! Speech Technology Is The Next Big Thing In Computing," which offered an impressive review of past accomplishments and, more important, what's ahead. It predicted "voice power" wouldn't be an overnight megabillion dollar software market and that the few players involved wouldn't be turning the information technology world upside down. But beyond 2000, Business Week reports the story will get more interesting. "The market for products that use speech could be astronomic."
Henton contends that a handful of players in this tough and fiercely competitive voice recognition business have made enormous strides.
At the same time, innovative thinkers at MIT and other think tanks are working feverishly to perfect this technology. Victor Zue, an associate director of the Laboratory for Computer Science at MIT and a respected pioneer in the speech recognition field, has built a system called Jupiter, an 800-number people can dial for weather forecasts for 500 cities worldwide. Ask Jupiter any question in plain English about the weather and it will answer. If you ask "Is it raining on the Cape?" or "Is it hot in Hong Kong?" the system will provide the right answer. But stray to another topic and you'll get silence.
Jupiter isn't on the market yet, but Newton-based Dragon Systems, run by husband-and-wife team Jim and Janet Baker, has a slew of voice recognition products already available and more are in the works. Last year, they introduced NaturallySpeaking, which is touted as the first dictation software able to process continuous speech.
This year, Dragon's hot new product is the NaturallyMobile pocket-sized recorder, which sells for $39.95. Just speak into the four-ounce hand-held device and it records your voice. The device can then be plugged into the serial port of your PC so text can be transcribed faster than real time. "It's ideal for mobile professionals who can enjoy the luxury of dictating their work away from the office while leaving their PC behind," says Matus.
For the upcoming holiday season, Dragon just released NaturallySpeaking for users 10 and older. Specs say the product, which sells for $59.95, can bypass tedious typing and allow students to focus on the content of their work. Users simply speak directly into virtually any Windows application. It boasts a vocabulary of 230,000 words and can take dictation at up to 160 words per minute.
The fonix Corporation people are also deep in the throes of developing new speech recognition applications. Want to pare staff and eliminate human error in one fell swoop? Try out Portico, fonix software that's being sold by General Magic of Sunnyvale, Calif. Portico is a virtual assistant with a voice that sounds human. Designed for the busy executive, this software answers the phone if you're busy, routes calls when you're on the road, and prioritizes and reads e-mail messages out loud. You can even interrupt it in mid-sentence with a new request, bypassing the frustration of wading through menus.
Next month fonix will debut Power-Scribe EM for use in hospital emergency rooms. The software eliminates time-consuming paperwork and clerical chores by allowing doctors and nurses to simply speak into a microphone and instantly have the words recorded and spit back in text form. The annual market for dictation and transcription systems for emergency medicine departments is estimated at more than $500 million.
This winter, skiers in Utah will be using fonix technology when they call a ski conditions phone number and converse in natural speech with a robot that responds to whatever questions they ask about ski resorts in Utah.
The primary message on this technology is to stay tuned, advises Henton and other researchers crafting tomorrow's cutting-edge technology. Beyond the fact that it's fascinating gee-whiz technology, practically speaking, voice recognition will make our lives easier and more efficient, Henton contends. For that reason alone, it will only get hotter.
"New companies are springing up all the time," says Henton. "The new ones are now combining technologies. For example, some companies devote themselves solely to providing e-mail, Web browsers or car navigation controls."
Business Week projected that as speech recognition software becomes cheaper and more pervasive, it will be built into hundreds of products. Companies that stand to benefit most are those selling the speech-enhanced products rather than those that created the software.
What's ahead? "Speech is going to be the primary interface with machines," says Matus. "It won't be just computers, but a whole range of appliances and devices. Virtually any machine on which you now have to punch a button is a candidate for speech."
Just imagine, says Matus, one day you'll be able to say to your microwave oven, 'Warm up the soup!' and it will set time and power levels automatically.
It won't be long before you'll be able to walk into your office, turn on your computer and ask it to tell you if you have any messages.
Adds Matus, "The power of being able to not just transcribe but to decipher speech is going to transform computing from top to bottom."
© Copyright 1999 Boston Globe Electronic Publishing, Inc.