When Will Alexa Know Everything?
Let’s start with the basics. Pretend I know almost nothing about technology. You probably won’t be too far from the truth. Explain to me, in the simplest possible terms, what happens when I ask my Amazon Echo, “Alexa, what’s the weather today?”
Al Lindsay: Sure. The first thing that happens is that the local software running on your device, which is able to recognize the wake word “Alexa,” detects that you said the word “Alexa.” It wakes up, alerts you that it’s now listening by lighting up the ring in blue, opens a connection to the cloud, and streams the rest of your request to Alexa in the cloud. That’s the first stage, which is understanding, “Hey, you’re talking to me. I need to do something with this and get it over to the cloud, so that Alexa can process it.”
The next part is understanding the words that you said. That’s speech recognition, or what we call ASR, which stands for Automatic Speech Recognition. Based on the language, ASR tries to work out the strings of words you might have said, and then we hand those over to a system we call natural language understanding, which tries to make sense of the meaning of those words. It might understand that you said, “What’s the weather?” The understanding layer is recognizing that those words translate to a request that needs to be routed to a piece of software that knows about weather, and location, and those types of things.
Then think about weather as an application; we call these skills. Skills are like applications, and they handle your request. The request goes over to the weather skill, which is able to figure out the answer to your query and speak it back to you using the text-to-speech engine.
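To make those hand-offs concrete, here is a minimal sketch of that cloud-side flow in Python. Every name in it (transcribe, parse_intent, WeatherSkill, synthesize_speech) is a made-up illustration of the stages Lindsay describes, not Alexa’s actual internals.

```python
# Hypothetical sketch of the cloud-side flow described above:
# audio -> ASR (words) -> NLU (intent) -> skill (answer) -> TTS (speech).
# None of these names are Alexa's real APIs; they only illustrate the hand-offs.

from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str                                    # e.g. "GetWeather"
    slots: dict = field(default_factory=dict)    # e.g. {"date": "today"}


def transcribe(audio_stream: bytes) -> str:
    """ASR: turn the streamed audio into the most likely string of words."""
    return "what's the weather today"    # placeholder result for the sketch


def parse_intent(utterance: str) -> Intent:
    """NLU: map those words onto a structured request a skill can handle."""
    if "weather" in utterance:
        return Intent("GetWeather", {"date": "today"})
    return Intent("Unknown")


class WeatherSkill:
    """A 'skill' is the application that actually handles the request."""
    def handle(self, intent: Intent) -> str:
        return f"The weather {intent.slots.get('date', 'today')} is sunny."


def synthesize_speech(text: str) -> bytes:
    """TTS: turn the answer text back into audio for the device to play."""
    return text.encode("utf-8")          # stand-in for a real audio waveform


def handle_request(audio_stream: bytes) -> bytes:
    words = transcribe(audio_stream)          # 1. ASR
    intent = parse_intent(words)              # 2. NLU
    answer = WeatherSkill().handle(intent)    # 3. route to the right skill
    return synthesize_speech(answer)          # 4. TTS back to the device
```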
What about the question of whether Alexa is “always listening”? You hear that sometimes. You have this device in your living room, and it’s listening to you all the time. If nothing else, it’s listening to check whether you’re saying the word Alexa, right?
What’s happening is that the software runs locally on the device itself, and it’s listening locally for the word Alexa. There’s no connectivity or streaming at that time. The sound is passing through the microphones, and the engine is simply looking for that one pattern: do I see the pattern Alexa? If not, the audio just passes through. It’s only when we detect the word Alexa, locally, that we wake up and say, “Hey, this was meant for me. Now I need to take further action and actually start to listen and stream to the cloud.”
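Roughly, and only as an illustration, that on-device loop might look like the sketch below. The detector, the ring light, and the cloud connection here are invented stand-ins under the assumptions Lindsay states, not Amazon’s actual code.

```python
# Hypothetical sketch of the on-device loop: audio passes through locally,
# and nothing leaves the device until the wake-word detector fires.

WAKE_WORD = "alexa"


def detect_wake_word(frame: str) -> bool:
    """Local pattern matcher: does this chunk of audio contain the wake word?
    (The 'audio' is plain text here so the sketch stays self-contained.)"""
    return WAKE_WORD in frame.lower()


class FakeCloud:
    """Stand-in for a connection that only exists after the wake word."""
    def open_connection(self):
        print("ring lights up blue, connection to the cloud opened")

    def stream(self, frame: str):
        print("streaming to the cloud:", frame)


def run_device_loop(frames, cloud):
    listening = False
    for frame in frames:                  # sound passes through the microphones
        if not listening:
            if detect_wake_word(frame):   # the only check that happens locally
                listening = True
                cloud.open_connection()   # wake up and start streaming
            # no match: the audio simply passes through and is discarded
        else:
            cloud.stream(frame)           # only now is anything sent off-device


# Nothing is streamed until the frame containing "Alexa" arrives.
run_device_loop(["background chatter", "Alexa", "what's the weather today"], FakeCloud())
```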