Is voice the next big thing for the Internet?
How Hypervoice will affect smartphones and search.
So, how do we make devices like smartphones, the Internet of Things and even websites easier and more intuitive to use? Here’s a clue. As a species, we’ve been talking for millennia. Man’s first words might have been: “Look at the size of that mammoth!”
But gestures like tap, pinch and zoom have been in the public psyche for less than a decade.
We’re hardwired for speech. After walking, it’s the first skill we master. So talking with devices has to be more intuitive than even the friendliest of touch gestures.
But that’s not all. As a way of communicating with devices, speech is also incredibly efficient in terms of the space it occupies. Keyboards are inevitably bulky because their size is, ideally, defined by the width of two hands. A smartphone screen is roughly defined by the distance between a person’s ear and mouth. But a microphone to pick up voice commands can be just a couple of millimeters across.
So if you’re an engineer designing, say, a smartphone or a next-generation light bulb, the human voice has a lot going for it.
In particular: there is no user training required; a microphone has a minuscule cost; and the space and power requirements are negligible. So Hypervoice ticks some very important boxes for product developers.
Hypervoice is an emerging technology that will integrate voice more closely with a data-centric world. It is a technology that can improve the 1-to-1 relationship people have with devices … and also bring voice into the domain of big data.
Imagine these Hypervoice scenarios:
Perfect recall. What if people allowed, say, an app like Facebook to monitor their conversations? In that scenario, social preferences and relationships could be inferred from conversations … rather than actively input by the user.
Smartphones become sensors. Audio will go ‘beyond the call’ and move from being session-oriented to an always-on ‘sensor’. Your smartphone could be sitting on the table and you ask it to turn on the TV, without even touching the phone. It’s constantly listening for your commands.
Microphones become input devices. The voice UI will be omnipresent, especially in wearable devices that do not have a large screen for a touch UI. Smart watches, light bulbs and cars are great examples of ‘no touch’ applications.
Voice search. Digital assistance services like Google Now, Siri and Cortana will drive natural interactions between you and your device. Your device will seem friendlier and more collaborative when you can ask it for help.
Smart assistant. To go a stage further, your device will become invaluable when it is proactive. So your phone might say: “That meeting clashes with your dentist appointment. Shall I re-schedule the meeting or the dentist?”
For emerging communities. Straight away we imagine the features above in commercial settings for developed markets. But voice control also has a role to play in under-developed communities. For example, farmers in poor, rural areas of the world could call a voice-controlled database – using only the most basic feature phone – to enquire about crop care or crop prices.
For elderly and disabled individuals. Due to its intuitive interface, voice control would also be ideal for elderly people who are not technically inclined, or for people with physical disabilities. An always-on assistant would be reassuring or even life-saving.
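For the technically curious, the ‘smart assistant’ scenario above comes down to a simple check: does a new appointment overlap one already in the calendar? Here is a minimal sketch of that idea in Python. The names Event, find_clash and assistant_prompt are hypothetical, invented for this illustration, and the dates are made up. It is not how any particular assistant is actually built.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    """A calendar entry (hypothetical structure, for illustration only)."""
    title: str
    start: datetime
    end: datetime


def find_clash(new_event: Event, calendar: List[Event]) -> Optional[Event]:
    """Return the first existing event that overlaps new_event, if any."""
    for existing in calendar:
        # Two time ranges overlap when each one starts before the other ends.
        if new_event.start < existing.end and existing.start < new_event.end:
            return existing
    return None


def assistant_prompt(new_event: Event, calendar: List[Event]) -> str:
    """Build the sentence a proactive assistant might speak."""
    clash = find_clash(new_event, calendar)
    if clash is None:
        return f"{new_event.title} is booked."
    return (
        f"That {new_event.title} clashes with your {clash.title}. "
        f"Shall I re-schedule the {new_event.title} or the {clash.title}?"
    )


# Made-up example data: a dentist appointment and a meeting that overlaps it.
calendar = [
    Event("dentist appointment",
          datetime(2014, 3, 3, 10, 0), datetime(2014, 3, 3, 11, 0)),
]
meeting = Event("meeting",
                datetime(2014, 3, 3, 10, 30), datetime(2014, 3, 3, 11, 30))

print(assistant_prompt(meeting, calendar))
```

The hard part in a real product is everything around this check: recognising the spoken request, and deciding when it is appropriate to interrupt you.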
Join the conversation.
Topics like voice recognition, voice control and Hypervoice are high priorities on the NXP Software Speech & Sensing roadmap. It’s a natural progression from the algorithms we’ve already put on millions of handsets worldwide to make voice clearer and richer.
Want to join the conversation? You can …
• Email Edwin Zuidema at firstname.lastname@example.org
• Leave your comment at the end of this blog post
• Check out a few articles below that inspired us.
Martin Geddes: How to define Hypervoice.
Why voice is the next big internet wave.
Microsoft and voice search
Videos from NXP Software at MWC 2014
Finally, a little comedy. In the 1986 movie Star Trek IV, the crew travel back in time about 300 years and are surprised to find computers aren’t voice controlled. It makes you think …