By Jeffrey R. Harrow
It took me a long time. Over a year. And that’s just not like me – to wait that long for a new gee-whiz must-have gadget. But now that I have my iPhone (having finally accepted that I must leave the Verizon coverage I’d been so happy with), I can say it is indeed a ‘gadget to end all gadgets’ – at least for the moment.
But this isn’t a gush about the slick user interface, the thin “glass slab” look and feel that finally pried me away from my cherished RAZR, or even the sort-of-OK integration with my Mac. It’s about the ingenuity that many vendors have brought to this new platform, and what that’s going to mean to our future projects. Specifically, this discussion is about What Google Hath Wrought, which took me by happy surprise. It’s very much about how the results of Google’s work will surely make it into the user interfaces of many of our customers’ automation systems in the not-very-distant future.
Crying that Google is innovative is hardly worth the bits; we see evidence of how they’re changing people’s expectations of computing every day. But what really impressed me here started with a seemingly obscure but amazingly useful free service that found its way out of Google Labs in Sept. 2007 – “Google-411” (800-GOOG-411).
As the name implies, this is a directory lookup service that you can call from any phone. An automated voice asks for a business name (or business category), and the city and state. (It’s limited to business listings in the United States and Canada, in English, at the moment.) The system then speaks its surprisingly accurate matches, letting you choose the one you want. And then it offers to go ahead and connect you to that phone number (a big safety-plus in the car) – for free. Or it will provide you with more information about the listing, such as its address and phone number, via voice or via a text message to your phone.
Pretty slick, and in my experience, amazingly useful, if for nothing else than avoiding the seemingly usurious fees that phone companies charge for directory assistance. And the crux of the matter – its speech recognition – is surprisingly good, and getting better over time.
But Where’s The Beef?
Google-411 is a very neat service, which I certainly appreciate. But what’s in it for Google? There’s no charge, and no advertising on the service. Now I know, because the free application that Google offers for the iPhone (and for some other phones, with varying degrees of sophistication) has shown how much the company has learned about speech recognition from all of the voices that have called Google-411 over the years.
The free Google Mobile App (which is not related to the Google-411 phone service; it’s an application that runs on the phone) at first seems to be just a simple portal into the Google search engine, so you don’t have to go through the phone’s browser. But things get much more interesting when you realize that as you raise the phone to your ear, the iPhone’s accelerometer notices and automatically enables voice input. At this point you can say pretty much anything until you lower the phone; your speech is digitized and sent over the Internet (not as a phone call) to Google. The system then rapidly recognizes and dissects your words and enters them as a traditional Google search.
For example, I tried “Control Global I Am Robot,” and the Google results page led off with a link to my previous article in this series. Similarly, saying “Boston, Massachusetts Lowes” brought up the expected listings.
More interestingly, because the application can make use of the phone’s location, saying “Chinese restaurants” immediately brought up a listing of the ones near me, while “veal francaise” offered recipes that left my mouth watering.
And then there are the unexpected benefits. Saying, “one point fifty British pounds” yielded the current conversion into US$2.17, while “one point five euros” indicated its value of US$1.90. Or for that mental crutch of figuring out 18% of a restaurant bill, saying, “What is eighteen percent of thirty-two point five six” offers $5.86 as the amount for a tip.
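For the curious, the tip query is simple arithmetic once the spoken words have been turned into numbers. Here is a minimal Python sketch of that last step. It is my own illustration, not how Google actually parses queries. The function names are hypothetical, and it assumes the recognizer has already produced clean text with numbers below 100 and decimals spoken one digit at a time.

```python
# Illustrative only: hypothetical helpers, not Google's implementation.
SMALL = ("zero one two three four five six seven eight nine ten eleven twelve "
         "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

# Map each number word to its value: "zero" -> 0 ... "ninety" -> 90.
VALUES = {word: i for i, word in enumerate(SMALL)}
VALUES.update({word: 20 + 10 * i for i, word in enumerate(TENS)})


def words_to_number(phrase: str) -> float:
    """Convert a phrase such as 'thirty-two point five six' to 32.56."""
    words = phrase.lower().replace("-", " ").split()
    if "point" in words:
        split_at = words.index("point")
        whole_words, digit_words = words[:split_at], words[split_at + 1:]
    else:
        whole_words, digit_words = words, []

    # Whole part: add tens and units ("thirty" + "two" -> 32).
    whole = sum(VALUES[w] for w in whole_words)

    # Fractional part: assumed to be spoken digit by digit ("five six" -> .56).
    for w in digit_words:
        if VALUES[w] > 9:
            raise ValueError(f"expected a single digit after 'point', got {w!r}")
    digits = "".join(str(VALUES[w]) for w in digit_words) or "0"
    return float(f"{whole}.{digits}")


def percent_of(query: str) -> float:
    """Answer 'what is <rate> percent of <amount>', rounded to cents."""
    text = query.lower().strip()
    if text.startswith("what is "):
        text = text[len("what is "):]
    rate_words, amount_words = text.split(" percent of ")
    rate = words_to_number(rate_words)
    amount = words_to_number(amount_words)
    return round(rate * amount / 100, 2)


print(percent_of("what is eighteen percent of thirty-two point five six"))
```

Run against the query above, this gives the same 5.86 that Google reported. The hard part, of course, is the speech recognition that comes before it.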
Trying out a contemporary topic, “Co pilot of the US Air flight in the Hudson river” yielded everything from his name and picture to his flying history. (I think he deserves more credit, since he and the captain are a team. BTW, his name is Jeffrey Skiles.)
Literally, you can ask it anything and Google does a surprisingly good job of understanding the speech.
Speech Recognition Is Now Changing Expectations
Speech recognition, or let’s call the Holy Grail “speech understanding,” is still immeasurably far away from its science fiction potential. But I suggest that contemporary examples demonstrate that speech recognition is already quite useful in its current state. Most importantly, the wide, free availability of extremely useful and effective speech recognition, as demonstrated by these two Google applications, is rapidly educating people as to what the technology can already do today. And as folks become accustomed to such sophisticated interaction in their day-to-day activities, they will begin to expect similar capabilities in the applications that run their businesses.
Speech recognition isn’t yet up to the task of allowing someone to safely control a process automation system solely by voice. But many automation systems are also asked to provide information to a wide range of personnel, from engineers and operators to management. Especially with an increasingly mobile workforce, speech recognition systems make it easy for people to glean information, and perhaps execute limited control, even over the phone.
That sounds like a competitive advantage – automation systems that are easy to interact with.
Can people talk to yours?
(Disclosure: I have no business relationship with Google other than as someone who likes to chat with their applications.)
Jeffrey R. Harrow is principal technologist at The Harrow Group.