A few articles of interest:
STMicro chips will include biometrics
Saint Genis, France - STMicroelectronics is to use biometric verification technology from Keyware Technologies NV (Zaventem, Belgium) in forthcoming video and audio DSP chips intended for automotive, computer and telephony applications. Keyware combines several biometric techniques, each based on a different physical characteristic of a person, in a control system called the Layered Biometric Verification (LBV) security server.
The audio DSP, a voice signal processing chip, will also include voice-recognition software based on technology developed by Lernout & Hauspie Speech Products NV (Ieper, Belgium), licensed to STM in 1997 and used for in-car applications.
STM expects to offer evaluation samples of the first circuits with LBV technology in the fourth quarter of 1998, with volume production due to start in the second quarter of 1999.
Copyright © 1998 CMP Media Inc.
Business Is Talking To Its PCs
Paula Rooney
Waltham, Mass. - With the exception of 3-D Computer-Aided Design (CAD) engineering titles, real-time data-gathering applications and high-end games, few software applications harness the white-hot computational power of 400MHz Pentium II PCs and other high-end systems.
This is particularly true for business software applications, and even leading software publishers, such as Symantec, concede the point.
"There are very few mission-critical apps that require that kind of speed," said Jeff Cable, director of worldwide sales development for the Cupertino, Calif.-based software company.
Not that it hurts. Cable said faster speed will mean quicker searches on Symantec's Act! database, for instance, and faster screen redraw for any application, but those performance improvements also depend on other factors, such as video transfer rates and Internet access speeds.
There is, however, one exception to the no-need-for-speed rule in the business category: voice-recognition software.
Analysts said major improvements over the last two years in voice-recognition titles like IBM's ViaVoice, Lernout & Hauspie's Voice Xpress and Dragon Systems' NaturallySpeaking are directly attributable to the explosion in PC megahertz.
"Voice-recognition software really uses this, because the hardware requirements are staggering," said Jeff Tarter, editor of Soft-letter, an industry newsletter based in Watertown, Mass. "The more powerful the PC, the better it works. They all ride the power curve."
Over the past year, for example, developers of voice-recognition software have made dramatic improvements in accuracy and dictation speeds. Users once balked at "discrete" speech-recognition offerings, which required them to pause unnaturally between words, but most titles now offer reasonably accurate continuous-speech features.
With such improvements, the category has become a "hundred-million-dollar business," Tarter said, noting that such titles once had high return rates because they didn't work as well as advertised. "Everyone thought this was 'Star Trek' quality," he said.
But that's changing as more powerful PCs hit retail shelves. Microsoft and other software companies are preparing voice interfaces for Windows and all applications, with the goal of eventually replacing the ubiquitous graphical user interface.
Symantec's Cable noted that for now, software developers are targeting application releases for the larger installed base of consumers with 486- or Pentium-based machines. "You have to develop applications that take advantage of those speeds, but you can't limit market opportunities," he said. "[The 400MHz PC base] is the bleeding-edge crowd."
Copyright © 1998 CMP Media Inc.
At Your Command -- Interactive Agents can work animation magic, if you have enough bandwidth.
Martin Heller
Remember this classic exchange from the science-fiction masterpiece 2001: A Space Odyssey: "Open the pod bay doors, HAL."
"I can't do that, Dave."
Okay, let's stop right there; that was a bad example. But I wanted you to understand what it would be like working with a Microsoft interactive software Agent. Perhaps my mind is wandering to a different type of agent.
Let's try again. You'll need a decent Web connection for this, plus a Windows 95 or NT system with a sound card, speakers and (optionally) a microphone. Using IE 3.0 or later with security set to medium, browse to my Agent demo at winmag.com and walk through the demo.
By the way, if you don't have enough bandwidth to download the megabytes of required Agent software data, you can access Microsoft's Agent demo and Software Development Kit (SDK) from the Site Builder Network Web Snapshot CD-ROM (for more information on obtaining the CD, see microsoft.com). Once you've installed the software and run the Microsoft demo from the CD, my demo will load relatively quickly over the Web.
Agents' abilities
So what's it all about? Basically, the Agent provides a conversational user interface, which is normally employed to enhance, rather than replace, the usual Windows graphical interface. An Agent animation runs in its own window, typically a borderless one with a transparent background so that the Agent appears to hover over the Desktop. In many ways, the Microsoft Agent is a logical extension of Office 97's animated Assistant.
What can an Agent do? Each of the three characters - Genie (pictured in the sidebar "The Magic Agent"), Merlin and Robbie - has a palette of animations, a synthesized voice, the ability to synchronize its movements with recorded voices, and the ability to respond to mouse clicks and voice commands.
An Agent's functionality comes partly from the Microsoft Agent service provider, partly from additional services accessed through the provider and partly from the Agent's character definition file(s). The service provider includes an OLE automation server application (AGENTSVR.EXE), as well as a file and docfile provider (AGENTDPV.DLL). It also relies on Speech API-compliant engines for its text-to-speech and voice recognition capabilities. An Agent control (AGENTCTL.DLL) simplifies the programming interface and lets you use Agents from scripting languages.
Microsoft has made its Command and Control speech recognition engine, the Agent software and Lernout & Hauspie Speech Products' TruVoice Text-To-Speech engine easy and free to use on a Web page; simply include tags that point to URLs on Microsoft's site. Microsoft has also made the three stock characters available for use over the Web by pointing the Agent control's Load method to the appropriate URL. If you want to redistribute Agents with a non-Web application, however, you'll need a license from Microsoft.
Every Agent character has a documented list of stock methods and states, with an animation assigned to each state and accompanying return state. In addition, each has a documented list of animations it can play, some of which are not assigned to states, and all of which can be given custom names. For instance, every character has a state for its Show method, called Showing, but the animation used for this state might be called Show or something else altogether.
You can program Agents through either a COM interface or an ActiveX control. I wrote my sample programs as Web pages using VBScript, so the ActiveX control was my only choice.
I started with some of Microsoft's samples and altered them to suit my tastes. I found that the actual scripting of Agents is pretty trivial. For instance, in my first sample (http://www.winmag.com/people/mheller/agent1.htm) I decided to use Genie, so I loaded him from the Agent Web site: AgentControl.Characters.Load "Genie", "http://agent.microsoft.com/characters/genie/genie.acf". Then I saved his character object to a variable to simplify the rest of the code: Set Genie = AgentControl.Characters("Genie").
I wanted him to perform a series of actions: Show himself in a puff of smoke (the "showing" state animation), bow to you (the "greet" animation), say "Hello," give you a thumbs up (the "congratulate" animation), tell you everything's okay, wave to you, say goodbye and finally vanish in another puff of smoke. That script is shown in the sidebar "Speak, Genie!"
One thing I learned when developing this was to be careful with my Agent tag codebase fields. Several of the more prominent third-party Agent pages on the Web have errors that result in multiple downloads and installations of the Microsoft Agent ActiveX control. My Agent tags were current when I posted my demo pages; I based them on the tags Microsoft uses in its own demos, rather than on the information in its online documentation.
Speak for yourself
As you've already found out if you've run my three demos, Agents have two ways to speak: with a synthesized voice and with a recorded voice. Either way, they can display what they're saying in a balloon and synchronize their lip movements fairly well with what they're saying.
Using the Text-To-Speech engine to make your Agents talk is easy and efficient, but the results tend to be rather mechanical. You can vastly improve the quality of the Agent's speech by using recorded voices, at the cost of some extra effort for you and some additional download time for your users.
If you want the character's lips to move in synchronization with the recorded voice, you need to add some extra information to the data found in an ordinary WAV file recording. To do this, use the Microsoft Linguistic Information Sound Editing Tool; you can download it, along with all the software you need to run and develop Agents, from Microsoft's Web site (http://www.microsoft.com/workshop/prog/agent). You'll also find tips for using the sound editor at my Agent demo page.
Once you've tweaked the recording, save the combined sound and linguistic information in an LWV file and upload the file to your Web site. You can now play the LWV file from the Agent control using the character's Speak method.
When I used the Speak method for synthesized speech (as shown in "Speak, Genie!"), the text was the first argument to the method. The second argument to the Speak method is the URL of the WAV or LWV file (for example, Genie.Speak , "http://www.winmag.com/people/mheller/hellowrl.lwv"). You'll notice that I left out the first argument, so the Speak method uses the text in the LWV file for Genie's balloon.
Linguistics lessons
Getting Agents to understand spoken commands is a little more complicated than getting them to speak, though it's not as complicated as you might expect. Essentially, you need to add items to the Agent's Commands collection, setting the voice recognition string and caption for each command. Then you need to add a case for each command to the Agent control's Command event handler and implement a script for each one.
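The register-then-dispatch pattern described above can be modeled outside of VBScript. The following is a minimal sketch in Python, not the Agent control's API: the names Command, CommandSet, add and on_command are hypothetical stand-ins for the Commands collection and the Command event handler, and the replies are placeholders for whatever script each command would run.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Command:
    caption: str                # text shown to the user for this command
    voice: str                  # voice recognition string for this command
    action: Callable[[], str]   # script to run when the command is recognized


class CommandSet:
    """Hypothetical stand-in for a character's Commands collection."""

    def __init__(self) -> None:
        self._commands: Dict[str, Command] = {}

    def add(self, name: str, caption: str, voice: str,
            action: Callable[[], str]) -> None:
        # Register one command under a unique name.
        self._commands[name] = Command(caption, voice, action)

    def on_command(self, name: str) -> str:
        # Stand-in for the Command event handler: look up the recognized
        # command by name and run its script, or fall back to a default.
        command = self._commands.get(name)
        if command is None:
            return "Sorry, I didn't understand that."
        return command.action()


commands = CommandSet()
commands.add("hello", "Say hello", "say hello", lambda: "Hello!")
commands.add("bye", "Go away", "(go away | [say] goodbye | scram)",
             lambda: "Goodbye")
```

Keeping the caption, the voice string and the script together under one name means the handler only has to switch on the name it is given, which is the same shape as one case per command in an event handler.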
Voice recognition strings use a variation on Backus-Naur Form (BNF) for their syntax. For instance, "(go away | [say] goodbye | scram)" matches "go away" or "say goodbye" or "goodbye" or "scram." The only caveat I found is that the voice recognition engine has trouble making fine distinctions. For instance, it can easily confuse "home" with "hello," if you speak quickly. It had no problem distinguishing "go home" from "say hello," however. To help define unambiguous command sets, keep the vowel patterns of all the recognized phrases unique.
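To see which phrases a voice recognition string covers, it can help to expand it by hand or by program. The following Python sketch is an illustration only: the function name expand is hypothetical, and it handles just the three operators used in the example above (parentheses for grouping, square brackets for optional words and the vertical bar for alternatives). Any other syntax the speech engine accepts is out of scope.

```python
import re

# A token is a bracket, a bar, or a run of characters that is neither.
TOKEN = re.compile(r"[()\[\]|]|[^\s()\[\]|]+")


def expand(grammar):
    """Return a sorted list of every phrase the grammar string matches."""
    tokens = TOKEN.findall(grammar)
    pos = 0

    def alternatives():
        # alternatives := sequence ('|' sequence)*
        nonlocal pos
        phrases = sequence()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            phrases |= sequence()
        return phrases

    def sequence():
        # sequence := (word | '(' alternatives ')' | '[' alternatives ']')*
        nonlocal pos
        phrases = {""}
        while pos < len(tokens) and tokens[pos] not in ("|", ")", "]"):
            tok = tokens[pos]
            pos += 1
            if tok in ("(", "["):
                closer = ")" if tok == "(" else "]"
                part = alternatives()
                if pos >= len(tokens) or tokens[pos] != closer:
                    raise ValueError(f"expected {closer!r}")
                pos += 1
                if tok == "[":
                    part = part | {""}   # an optional group may be skipped
            else:
                part = {tok}
            # Append each choice for this element to every phrase so far.
            phrases = {" ".join(w for w in (a, b) if w)
                       for a in phrases for b in part}
        return phrases

    result = alternatives()
    if pos != len(tokens):
        raise ValueError(f"unexpected {tokens[pos]!r}")
    return sorted(result)


print(expand("(go away | [say] goodbye | scram)"))
# prints ['go away', 'goodbye', 'say goodbye', 'scram']
```

Listing the expansions for every command side by side is a quick way to spot phrases that sound alike before you test them with the recognition engine.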
I haven't even touched on the process of editing your own Agent characters. Microsoft supplies a tool for assembling the relevant animations for a character frame by frame, with options for action-sequence branching and mouth overlays. But Microsoft expects you to supply your own frames in BMP or GIF format. Given the sheer number of frames required to make a convincing character, I would probably use a modeling tool to generate the Agent action sequences, instead of drawing key and in-between frames. But then again, I haven't tried it yet.
Senior contributing editor Martin Heller programs and writes about it from Andover, Mass. Contact Martin at winmag.com or care of the editor at the addresses on page 20.
SIDEBAR: Speak, Genie!
Here's an example of the VBScript needed for the Genie Agent character.
'Preload the states and animations we need
Genie.Get "State", "Showing, Speaking"
Genie.Get "Animation", "Greet, GreetReturn"
Genie.Get "Animation", "Wave, WaveReturn"
Genie.Get "Animation", "Congratulate, CongratulateReturn"
Genie.Get "State", "Hiding"
'Do the actual animation
Genie.Show
Genie.Play "Greet"
Genie.Speak "Hello!"
Genie.Play "Congratulate"
Genie.Speak "Congratulations! If you can hear " & _
    "this you have the Agent control and the text " & _
    "to speech engine installed properly."
Genie.Play "Wave"
Genie.Speak "Goodbye"
Genie.Hide
Copyright © 1998 CMP Media Inc.