Image from Amazon.com

The Subtle Reason The Amazon Echo Show Will Be A Smashing Success

Dharmesh Shah
Published in ThinkGrowth.org
May 9, 2017


As soon as I heard the news that Amazon is releasing an Echo with a screen, I couldn’t get to Amazon.com fast enough to order the device. Those who know me aren’t too surprised. I’m a gadget freak. But this time there’s more to it.

For the past year, I’ve been thinking about (and working on) conversational UI. I’ve talked about this on stages at INBOUND, SXSW, and elsewhere.

What I’ve mostly been working on is GrowthBot, a chatbot (or digital assistant) for marketing and sales. It’s been really popular (right now, it’s ranked #2 in the Slack app directory for both marketing and sales). But GrowthBot is a “text in / text out” interface. You type a message to GrowthBot (like: “what keywords does airbnb.com rank for?”) and it shows you a response (in text). It works in Slack and Facebook Messenger.

As much as I love my chatbot, I don’t think a text-in / text-out interface is ideal for every use case, because it’s still a bit painful to type in your message, especially on a mobile device. That’s why I really like voice interfaces like the one the Amazon Echo provides. But voice-in / voice-out is not ideal either.

Here’s the simple reason why:

Most humans can speak faster than they can type, but they can consume visual information faster than spoken information.

We can read at about 250 words a minute, but we can only listen at about 150 words a minute. Let’s look at a simple example. Say you ask, “What are the top movies?” Alexa answers (at least today) with: “Here are a few movies playing near you today. Guardians of The Galaxy Volume Two. The Fate Of The Furious. Beauty and The Beast. The Boss Baby. The Circle. Gifted. Would you like to hear more movies?”
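To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. It assumes the approximate rates quoted above (about 250 words a minute for reading, about 150 for listening) and counts words by splitting on whitespace, which is a simplification. The function name `seconds_to_consume` is made up for this illustration.

```python
# Rough rates from the article; both are approximations, not measurements.
READ_WPM = 250    # words per minute when reading
LISTEN_WPM = 150  # words per minute when listening

# Alexa's spoken answer to "What are the top movies?" as quoted above.
ALEXA_RESPONSE = (
    "Here are a few movies playing near you today. "
    "Guardians of The Galaxy Volume Two. The Fate Of The Furious. "
    "Beauty and The Beast. The Boss Baby. The Circle. Gifted. "
    "Would you like to hear more movies?"
)


def seconds_to_consume(text, words_per_minute):
    """Estimate how long it takes to get through `text` at a given pace."""
    word_count = len(text.split())  # naive whitespace word count
    return word_count / words_per_minute * 60


listen_s = seconds_to_consume(ALEXA_RESPONSE, LISTEN_WPM)
read_s = seconds_to_consume(ALEXA_RESPONSE, READ_WPM)

print(f"Listening: about {listen_s:.1f} seconds")
print(f"Reading:   about {read_s:.1f} seconds")
```

Under these assumptions, the 37-word answer takes roughly 15 seconds to hear and roughly 9 seconds to read. The gap grows with the length of the answer, since the ratio between the two rates stays fixed at 250 to 150.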

Now, imagine life with audio input but visual output.

You still speak your request (like you did before), but instead of having to wait and listen, the answer pops up on your screen. Chances are, you can scan that screen much faster than you can listen to the same words being spoken. That will be the magic of the Amazon Echo Show, and of other similar devices in the future.

And here’s the clincher. All of this assumes we’re only going to show text on the screen (which is faster to read than to listen to). But imagine things for which visual output is much, much better. You know, “a picture is worth a thousand words.” In my movie example, imagine if, instead of text, you got a thumbnail for each of the top movies. Much better, right? We are, after all, visual creatures.

I’m not saying you should run out and buy an Amazon Echo Show. I am saying that this kind of audio+visual interface is going to be really popular.

What do you think?
