Monday, December 22, 2008

Please listen carefully as we don't really care how awful our IVR is...

You've heard them enough times, and if you are like me, you cringe every time: "Your call is very important to us," they say, and then they keep you waiting and waiting; "For English, press one," they insist on telling you, even though you've called them a hundred times and every time you pressed – that's right -- one for English; then there are those little phrases you've heard so many times that your ears don't even bother to pick them up any more – ones like, "You can interrupt me at any time" or "Please select from the following menu options."

But there is one particular little gem that, in spite of the thick skin of my ears, still gets my goat. I'm talking about, "Please listen carefully as our options may have changed!"

I would say 9 out of 10 of the IVRs I call these days have this phrase right there upfront, proudly played as if to signal that you are dealing with a company so dynamic and so cutting edge that its menu options are constantly changing – so, you'd better pay attention lest you get hurt.

The sad thing is that almost 100% of the time the phrase is played, nothing has really changed in the IVR menu for months – and in some cases, the phrase has been there from the very beginning of the IVR deployment!

So wherefore the horrid little habit?

A good guess would be that it started legitimately enough when menus did change and power users skipped ahead without listening to the new options, resulting in confusion and complaint call backs about how the system was broken.

From that point on, my guess is that the inclusion of the phrase falls into one of two scenarios: (1) the voice user interface (VUI) "designers" were rank amateurs and therefore proceeded like all tentative amateurs do -- that is, by playing it safe and methodically and carefully imitating "what is out there," or (2) the VUI was designed by professionals who knew better than to perpetrate the atrocity but who were forced to include the phrase by adamant call center managers who perceive the IVR's mission to be first and foremost keeping callers from reaching humans and are therefore willing to throw any verbiage at callers if it forces them to listen carefully to the instruction prompts – i.e., are willing to outright lie to callers about how the menu is constantly changing.

Case (1) is the easier to remedy: if companies were to systematically invest in hiring professionally trained VUI designers and take the development of their IVR as seriously as they do their web site, the phrase would at long last set itself on the path to extinction.

Case (2) requires a bit of a struggle. If you are a VUI designer and find yourself battling an arrogant, all-knowing call center manager who insists on including the phrase, here is how I would suggest you proceed.

First, point out that power users do not listen to prompts – they know what to press and they start pressing as soon as they realize they are connected. They will certainly not notice the white noise language of, "Please listen carefully as our options have changed" -- especially if it is played every time they call. The only way power-users will learn that an option has changed is for them to get lost once or twice.

Second, point out that even non-power users filter out the phrase if it is played every time they call. After a while, they will catch on that you are crying wolf and will simply tune out your pleas.

Third, propose that if there were indeed a drastic menu change and you desperately needed your callers to notice it, then at the very least you should use something far more attention-grabbing than flat language to signal the change: a double-ding sound followed by an announcement that the menu options have changed, for instance, would be far more effective.

And to close the deal, explain to the call center manager that the best way to contain callers within the IVR is to ensure that they have a great experience with it every time they call it. What if the IVR were to remember who among the callers had already heard the menu change notification and then act on that knowledge? For instance, noticing that the caller is calling for the first time since the menu change, the IVR would play the menu change alert and disable barge-in, hence both ensuring that the caller notices that the menu has changed and forcing them to listen to the new options. And then, next time that person called, the IVR wouldn't play the menu change alert again.

Wouldn't that be more likely to minimize errors and misrouting than playing a phrase that is either not even noticed by the caller or, if noticed, can only needlessly annoy?
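Here is a minimal sketch of that per-caller logic, just to show how little it takes. Everything in it is my own assumption for illustration: the `MenuChangeTracker` name, the in-memory dictionary standing in for a real caller database, the idea of identifying callers by their phone number, and the choice to play the alert to anyone who has not heard it since the last change.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class MenuChangeTracker:
    """Remembers which callers have already heard the menu-change alert.

    Hypothetical helper: a real IVR would keep this in a caller database.
    """

    menu_changed_at: datetime
    # caller ID (assumed here to be the calling number) -> when the alert was last played
    alert_heard_at: Dict[str, datetime] = field(default_factory=dict)

    def plan_greeting(self, caller_id: str, now: datetime) -> dict:
        """Decide whether to play the alert and whether to allow barge-in."""
        heard = self.alert_heard_at.get(caller_id)
        # Play the alert only if this caller has not heard it since the last change.
        needs_alert = heard is None or heard < self.menu_changed_at
        if needs_alert:
            self.alert_heard_at[caller_id] = now
        return {
            "play_change_alert": needs_alert,
            # Barge-in is disabled only on the call where the alert is played.
            "allow_barge_in": not needs_alert,
        }


tracker = MenuChangeTracker(menu_changed_at=datetime(2008, 12, 1))
first_call = tracker.plan_greeting("5551234", datetime(2008, 12, 22))
second_call = tracker.plan_greeting("5551234", datetime(2008, 12, 23))
```

On the first call after the change the caller gets the alert with barge-in off; on the second call the alert is skipped and barge-in is back on. If the menu changes again, updating `menu_changed_at` makes the alert play once more for each caller.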

Monday, November 17, 2008

Balentine's Brave New World

Been plowing through Bruce Balentine's new book, "It's Better to be a Good Machine than a Bad Person," and I must say that so far I am enjoying it.

The basic idea of the book is that it's about time that we gave up trying to have machines try to behave like human beings (and do an awful job at it -- bad persons) and started having machines tackle those problems that they can solve well (good machines), and in the process interact with human beings in their capacity as machines rather than pretend to be human beings.

So, as you can imagine, Balentine doesn't like it when a machine tries to act or sound like a human: saying that it's sorry, expressing gratitude, giving compliments, etc. For Balentine, such anthropomorphism not only adds little value to the interaction but in fact confuses and ultimately leads to frustration and disappointment when the system does not live up to the intelligence the surface anthropomorphism implied.

I am completely sympathetic to the idea of building machines that are true to their identity. Yes: humans do not interact with machines the way they interact with other humans. My only concern is that spoken and heard language are so suffused with the human that expecting a human being to somehow find a way to use it and yet strip away the layers of emotional and cognitive meanings that are fully enmeshed in it is a difficult endeavor, to say the least.

Imagine being interrupted in mid-sentence by a machine: could you help feeling irritated? Or how would you feel if the system were to order you around with "Give me your contact ID" or "Say that again"? As things stand, I wouldn't like it and -- and this is the main point -- I wouldn't be able to help not liking it.

But Balentine feels that sooner or later, we will get to that brave new world where we wouldn't react emotionally when we are talking to a machine. We would know that this is the way to interact with a machine and we would turn our emotional sensors off.

I think what makes this tricky is the fact that the interaction is verbal in both directions. We have no problem shouting orders at machines in a way that we would not at a human being; and we have no problem accepting cryptic responses from machines when they are not spoken (ATM menus, boarding passes). But as soon as we are engaged in a two-way verbal dialog (even when not spoken), we are overwhelmed with the anthropomorphic illusion.

But let's grant that someday we will somehow get ourselves to that point. The question is: how are we to cross the chasm from where we are today to that time when humans will talk to machines in that special human-to-machine way? Will one killer product (iPhone?) or application help us make the quantum leap? Or will we gradually evolve into such a new standard?

Will ramble on some more in future posts as I read and think about this....

Thursday, July 17, 2008

Follow-up study to be presented at SpeechTek

Susan Hura, one of the head organizers of SpeechTek this year, just posted the following on the VUIDS Yahoogroups group:
For those of you coming to SpeechTEK next month, Tim Pearce from Dimension Data and Mike Bergelson from Cisco are going to present year 2 data from the Alignment Index at the conference. We're kicking off the Business Goals track with this session, Monday, August 18, 10:15-11 AM. We'll also be hearing about a similar study conducted in the EU by VoiceObjects.

Here is a link to the session.

Tuesday, July 15, 2008

The Parallel Worlds of Vendors and Users

Just came across a fascinating study by Dimension Data (in collaboration with Cisco) on the perception gap between "vendors" and "consumers" of speech-enabled self service solutions. By "vendors" the study refers to platform developers, system integrators, voice application developers, and speech technology vendors. 128 such vendors were surveyed for the study. By "consumers" they refer to callers who have interacted with speech-enabled self-service applications. They surveyed 1,203 such consumers.


The key findings revolve around 6 questions:

(1) How often would you prefer to use a speech recognition system rather than a touch-tone system? 9% of vendors answered "As little as possible," while 45% of users gave that answer. A huge disconnect. On the flip side, 47% of users gave a qualified "Yes" -- that is, they would prefer speech under some circumstances (depending on time of day, where the caller is, etc.), which tells us that users are not necessarily reflexively rejecting speech-enabled automation under all circumstances.

(2) What do you think is the main reason organizations provide automated services in their call centers? 69% of vendors said "to save money" compared to 54% of users. In other words, callers are no dupes: they fully understand what motivates the deployment of these solutions.

(3) What do you think is the most important benefit of using an automated system when you phone a call center? 51% of vendors mentioned "to avoid wait time" while 49% of users mentioned "24 x 7 service" against 18% who mentioned "Avoid wait time"! A remarkable mis-alignment and a clear opportunity for marketers and designers to exploit for increasing adoption.

(4) In general, when you've used a speech recognition system, which of the following best describes how well it helped you deal with your query? 77% of vendors said that it "Partially addressed the reason I called" while only 43% of users did. Another large gap. 2% of vendors responded with, "Did nothing I needed," while 13% of users gave that response. Again, a noticeable gap that points to excessive optimism from vendors. On the other hand, only 8% of vendors responded with "Fully addressed the reason I called," while 18% of users gave that answer. In other words, it seems that vendor answers are driven by mushy conservative wishful thinking rather than insight into actual user reception.

(5) Having used a speech recognition automated system, would you now...? 44% of vendors responded with, "Be neutral to use one again" vs. only 28% of users giving the same answer. What is noteworthy is that a greater proportion of users (36%) responded with "Be happy to use one again" vs. 32% of vendors giving that answer, and a greater proportion of users (also 36%) responded with "Be reluctant to use one again" vs. 24% from vendors. In other words, just as in question 4, users are more opinionated and have a less neutral disposition than vendors.

(6) The thing that annoys or irritates me most about using an automated speech application is when.... 41% of vendors answered with "System didn't understand me," vendors' number one answer, while users' number one answer was, "Transfer to agent with no context." This is a fascinating disconnect. Only 17% of users responded with, "System didn't understand me." Which simply means that it's not speech recognition that users find annoying or irritating, but the experience with the application: an additional 16% of users said, "Can't skip ahead" and 14% said, "No alternatives". In other words, 67% of dissatisfaction revolves around the experience with the application. Vendors by contrast focused on technology, in this case ASR and CTI ("Transfer to agent with no context" receiving 38%). "Can't skip" received 4% and "No alternative" a mere 1%.

The report gives a couple of general recommendations such as establishing "cross-functional engagement within organizations" and ensuring "contributions from non-technology stakeholders, e.g., marketing, customer services, and usability experts." But that is no revelation to anyone who seriously engages in voice user interface design.

What would have made the study complete would have been including a third category of stakeholders: the companies that deploy these applications -- i.e., the actual customers of the vendors. I suspect that since many of these customers are sold on the value of self-service applications by the very vendors surveyed in the study, a parallel mis-alignment between customer expectations and those of the ultimate users also holds.

The authors promise to run the survey year over year. Let's keep our eyes open. Hopefully, vendors and customers will read the report and will begin to actually align their goals and values along those of end users.

Saturday, June 21, 2008

Starting on the wrong note....

I'm preparing a couple of presentations for the upcoming Speechtek gathering in NYC, where I will talk about why people hate IVR (will try to dig a bit deeper than the obvious) and what designers can do to be as smart as possible in their interactions with callers.

One thing that occurred to me that does not get mentioned nearly enough to explain why people often detest IVRs is the following: unless the caller knows that they are calling into an automated system, an interaction with an IVR almost always starts on the wrong note, in that the caller is forced to engage in an interaction in a way that they didn't necessarily expect or prepare for. They want to speak with a human and instead get a machine. How would you feel if you went up to a teller counter only to discover that instead of a person you find an ATM....

Of course, in itself, that doesn't explain aversion to IVRs. But I think this asymmetric imposition from the machine starts the conversation off on a bad note from the get-go, with the caller feeling that they have no choice but to submit...

Will be doing more thinking on this....

Tuesday, May 6, 2008

Outrage Triggers

Here is a short list of IVR failures that trigger in callers a feeling of outrage -- or at least loathing and contempt!


1. You are forced to start all over after giving the IVR several pieces of information. That's right: nothing can make a person's day like having them emulate Sisyphus while trying to reach customer support. Nice!

2. You are made to listen to several minutes of declamations, instructions, warnings, and general statements before you are offered anything that you care about. That's because we all love to hear other people thump their chest about how they are the best and the brightest and the loveliest, and how maybe we should check their web site next time, and that our call is so important to them that they can't stand it, and so forth. We all love to hear that stuff, especially when we are calling because we are pissed off and need help with their crappy product.

3. You are made to wait a long time only to be routed to voice mail. This is my favorite by far. OK -- we thought about it and... we don't think you are worth speaking with. So sorry. After the tone, start speaking or whatever.... Beeeep.

4. You are made to wait a long time, finally get to an agent, but you never get your problem resolved. This is worse than #3, because you force yourself to waste 10 minutes of your existence on earth talking to someone who wouldn't know how to help you if your hair got caught on fire before you start feeling ashamed of yourself for indulging in such bottomless self-delusion.

5. You are asked by the agent to repeat information that you already provided to the IVR. This is of course the all time classic. (Once, an agent pretended that she needed me to repeat the information "just to make sure". I smiled and repeated it. At least she cared enough to make the effort...)

6. You are transferred from one IVR system to another IVR system. This always makes me smile: if people are able to launch businesses and make a healthy living with this kind of utter thoughtlessness, I too will become rich and famous one day....

7. The IVR system asks you to call at a later time and then hangs up on you. You gotta respect a machine that can detect a meat head, swiftly decides that it has no tolerance for such density, and then cuts its losses and moves on. When you get your act together, buddy, give me a call and we can talk....

Saturday, April 26, 2008

Monday, March 31, 2008

Deploying Lousy IVRs

Here are 10 causes that contribute to the deployment of lousy IVRs.

1. Designing from the armchair, without discovery, without getting the key people involved in developing the requirements.

2. Treating the IVR deployment like an IT project and handing ownership of it to the Technology group.

3. Grossly under-financing the project (while spending an order of magnitude more on the web site).

4. Not hiring VUI experts; not hiring experienced project managers who have deployed IVR solutions. (A corollary is using internal employees to record prompts.)

5. Micromanaging the VUI designer and second-guessing them.

6. Letting legal have their way instead of battling them at every turn.

7. Not involving agents in the discovery phase, the design phase, or in the acceptance testing phase.

8. Not training agents on the IVR once deployed.

9. Altogether skipping the acceptance testing phase.

10. Not monitoring after deployment and not refining and adapting the solution to new findings.

Monday, March 24, 2008

The High Wire Act of Demoing Speech

If you have ever given demos of speech-enabled IVR applications, you know how stressful the experience can be. You have put on your best Sunday suit, have behaved yourself impeccably, have given a great PowerPoint presentation, have definitely impressed your prospect with your client list and the reference quotes, and have demonstrated full and sensitive understanding of your prospect's problems. But all that, you fear as you get ready to give the demo, could be wiped out -- or so it feels when it happens -- with a cruel, "I'm sorry, I didn't understand that!"

Here's a list of tips to maximize chances of success and minimize the agony when failure takes place. Needless to say, your demo is as good as its VUI – and a solid design is your starting point. But here are some pointers to make sure you don’t tragically crash and burn for a silly reason.

1. Remove all prompts that explicitly talk about failure, such as: "I'm sorry, I didn't understand that" or, "Sorry, I didn't hear you." Use a double beep, instead, which will cue you to speak again (and could be interpreted by your audience as a failure on your end rather than the technology's), or just re-prompt.

2. Know the DTMF fallback for each response in case voice recognition gives you a problem. If traversing the call flow is what the demo is about, keep the flow moving with DTMF.

3. Don’t speak over prompts. Instead, wait for a pause and speak your answer. (Obviously, make sure that when you design your demo’s Voice User Interface, you insert silences long enough to let you speak your answer.)

4. Make sure you don't have the application talk for more than 10 seconds without giving the turn back to you.

5. Test the application with the same equipment that you will use in the demo. I've found that the best performer for demos is a desktop speakerphone and the worst is a cell phone's speakerphone. Whatever you choose, just make sure that what worked when testing is what you use when demo-ing.

6. Test the application in the same room and environment where you will do the demo. Ambient acoustics can make a big difference, even if your ears can’t tell.

7. If you are conferencing in the IVR, make sure you know how to end the voice application – i.e., what to say or press to have it end without you needing to hang up.

8. Make sure you know how to pull back the application from voice mail: what to press to return to the main flow after leaving a voice mail.

9. Make sure to tell everyone in the room and the conference line to remain quiet while you demo.

10. Turn off or keep away all cell phones from the demo phone as static will interfere with speech recognition.

11. If doing a conference call, make sure you know how to conference the application in.

12. Know how to mute and un-mute your conference line.

13. Never improvise or show off while demo-ing. Pick a path in the flow, make sure that it works, test it several times, and then during the demo traverse it exactly as you had tested it.

14. If for whatever reason the application fails, be honest about why it failed. If you were forced to use a cell phone's speakerphone and speech was degraded as a result, tell them that the voice recognition does not perform well in such an environment. If a script that talked to a backend failed, take the time to explain that to them. If you don’t know what happened, tell them that you don’t know. Chances are that your audience will sympathize with your plight.

15. If the application fails, have a plan B. A canned recording of the interaction would do. If possible, schedule a follow-up demo with someone in the group, and move on.

Remember, the purpose of the demo is to create a favorable impression. What your prospect cares most about is seeing the technology work as promised. If you accomplish that, you win.

Thursday, March 20, 2008

Putting up with Bad Interfaces

A thought I want to jot down and elaborate on in future posts.

Why do people have such low tolerance for bad usability in the IVR but are (or have been) able to endure without a peep of complaint the DOS and UNIX prompt? I don't ever remember anyone saying, "Stupid computer! Why can't you understand me when I type 'check disk' instead of stubbornly insisting on 'chkdsk'?" Instead, people usually curse their own "stupidity" and "carelessness," and exclaim how they are just bad with computers and how their 10-year-old child is far better than they are, and all that.

So, why is that?

One reason, I guess, is the perception that humans have that whatever a human can do, the brainy machine should be able to do faster and better. A human can understand speech and engage in conversations with almost no effort (or so we think), and if a computer can't do it, then clearly there is something wrong with that computer (or, in this case, the IVR technology).

The perception is exacerbated by the illusion well-designed VUIs give of sounding and behaving like a human, which results in a raising of intelligence expectations and a lowering of tolerance for mistakes.

Sometimes I wonder if a VUI that was designed like a DOS command would result in people complaining less than otherwise! Would a VUI that sounded like a robot (but was perfectly intelligible), that tolerated no variations, that sounded like the almighty Computer itself speaking, without a hint of negotiating, that was in complete and unshakable control, that didn't fret or apologize when it made a mistake, that accused the user of making errors when things failed, that responded with things like, "Your response is not recognized as an internal or external command" -- would such a VUI have the effect of having people blame themselves when things didn't work? An experiment worth conducting....

Tuesday, March 18, 2008

Closing the Dialog -- Part III

7. Never say, “Your call is important to us”

Another non-negotiable rule. The expression is overused and will only elicit snickers of derision from the user.

8. Don’t make the user repeat information they provided to the IVR

One of the biggest complaints that users have about IVR systems is the notorious practice of forcing users to repeat to agents information that they had just provided to the IVR. There are three ways to address this failure in usability: (1) Pass to the agent whatever information was collected – whether by a screen pop or an audio whisper to the agent prior to connecting; (2) if the system can’t pass information to the agent, then don’t ask in the IVR for information that you know the agent will need; or (3) at the very least, have the agent apologize for making the caller repeat themselves, and have the agent ask only for the very minimum to accomplish the task.

In the case where no information is being passed from the IVR to the agent, at the very least, make sure that the agent is alerted that the call they are receiving is a call transferred from the IVR. The agent can then adjust their behavior accordingly (e.g., sympathize with the user if they know that usually users transferred from the IVR are frustrated or angry).

9. Avoid transferring users from one IVR system to another IVR system

Unless the VUIs of the two IVR systems are designed as units of a common whole (with identical personas, with information collected from the first system passed to the second, etc.), don’t transfer users from one system to the other.

10. Don’t play phone rings unless you are transferring directly to a human

The sound of phone rings after an interaction with an IVR is a signal to the user that they are about to speak to a human being. Never play phone rings and then present the user with yet another IVR system.

Saturday, March 15, 2008

The Strategic First Step....

Eduardo Olvera, author of the VUIDesign Blog, sent me a very thoughtful response to my note about taking the first steps towards building a real IVR reform movement. Here is an excerpt from his note:
As you well pointed out, there was great resistance at the beginning, and I agree with some of the things that happened after that original resistance. But I think one of the biggest factors some people miss of that story - which is also directly related to your other points about advocates - is the huge role kids played on it.

That's right. If you remember, once the government enforced rules to require kids to use seat belts while riding on a car, the side-effect they didn't envision was what started to happen when kids started asking their parents why it was that they had to wear seat belts when their parents weren't, and guess what, I can't think of a better way to convert users than to have you realize as a parent that you teach by example, and therefore the parents started to use it too!

Going back to Gethuman, I think it would be great if we could find a similar legislation/kid combo which may on one hand start enforcing and monitoring change (e.g. CTI, hold-times, etc.), while on the other hand promote change from the inside... someone with enough power to make us question our current ways of doing things, and why not, ask us directly "and why aren't you wearing one?"

This is the kind of thinking we should be doing if we want to make headway in forcing companies to invest in quality!

I agree. The promotion of the rights and safety of the vulnerable in general and kids in particular has been key to the civilizational drive: labor laws also had their genesis in the protection of children, from which rights we take for granted today followed (sick days, the 8-hour day, the 5-day week); many food safety and air quality initiatives also owe their existence to concern for children.

For us, then, we need to think carefully about what would be the vulnerable constituency that can enable us to make the qualitative leap in terms of forcing companies to invest in user-centric developments of phone automation systems. I want to think about this some more, but this is a great first step in the right direction.

Sunday, March 2, 2008

New Blog by Speech Technology Magazine

A quick note: Speech Technology Magazine have launched their own blog. Looks like they plan to post at least once a day -- which is great. Let's hope they keep the energy level up.

Thursday, February 28, 2008

New Study By VocaLabs

An important study was just published by VocaLabs, a consulting firm that specializes in call center customer feedback.

The report, authored by Peter Leppik, finds that:

The top three things customers associate with good customer service can be summarized as "make it quick and easy to reach a person, and make that person polite and professional." 50% of survey respondents selected all three of the top three factors.

Among the top items on customers' wish list are the ability to reach agents easily, politeness and courtesy of agents, short wait times, and having one's problem resolved on the first call.

Read the rest at:

Wednesday, February 27, 2008

Thoughts on launching a movement

Elaborating on my post of yesterday regarding the troubled waters Gethuman is navigating, I think the key to building a reform movement that will take hold and have a concrete and lasting positive impact is to assemble the right alliances from stakeholders who have a vested interest in pressuring businesses to invest in customer-centric solutions.

If you look back at the progress that has been made in, say, product safety, in almost every case it has been a battle between, on the one hand, businesses that want to protect their short-term interests and their margins by resisting anything that leads to production cost increases, and, on the other, consumers and their advocates who moved to pressure them to build safer products.

The classic example is, of course, the car seat belt and Ralph Nader's crusade for automotive safety. Initially, there was great resistance to the seat belt, but eventually, companies realized that safety was a competitive advantage they could exploit (and did). And that turned out to be great not only for the companies who went the safety route, but for the automotive industry in general because it opened up a whole new market and new set of customers (safety-conscious moms) who had up to then been excluded.

The stakeholders in our case are consumers and their advocates, VUI designers (we want to have jobs and make money), companies that deploy IVR solutions (they want to build applications that will be adopted and have a truly positive impact for their clients), integrators, companies that host the deployments, and companies that buy them and deploy them (our customers).

The challenge in our situation, I believe, is that the ultimate user is not the technology buyer, so, obviously, the interest of the buyer is not aligned with that of the consumer.

A strategy for moving forward, in my view, would look something like this:

(1) Educate consumer groups on the shabby state of deployed automation and support. That shouldn't be too hard given the universal dislike of currently deployed IVR systems.

(2) Educate consumer groups on the possibilities of the technology: they need to understand that a great deal of consumer pain can be alleviated if businesses invested in the deployment of quality speech solutions.

(3) Have consumer advocates pick one or two key features that can be delivered and that are most wanted by the consumer and agitate for their adoption. For example: telling the caller how long they need to wait when they are placed on hold, or never having the caller repeat to the agent information they already gave the IVR.

(4) The features in (3) need to be (a) easily implementable (technology exists and it is not too expensive), and (b) easily monitor-able.

(5) Have the consumer groups establish watchdog units that will monitor and mobilize when the key feature in (3) is absent from an application.

(6) Enlist legislators who will serve as a Sword of Damocles. Nothing mobilizes an industry to do the right thing like the threat of legislated regulation.

(7) Build capacity: i.e., make sure that the supply of VUI designers meets the demand for them.

(8) Once the structures are in place for transferring best practices into actual deployments (after the initial strategic insinuation in mobilizing for (3)) and the investments needed to deploy quality solutions are systematically made, quality of deployments has no way to go but up.

Gethuman in this scheme could fulfill the role of consumer advocate. They would be the agitators making demands for better solutions from companies that deploy IVR contact center applications. For this to be taken seriously, though, Gethuman would need to tone down its "bypass the IVR" gimmick and make better automation its central demand rather than doing away with automation....

Establishing a consortium of some sort that speaks on behalf of businesses that deploy IVR solutions would be a concrete first step to take.

Tuesday, February 26, 2008

Gethuman in trouble....

A piece in Businessweek discusses how and why the gethuman project has not lived up to the revolutionary promises it made when it was launched in late 2005.

What happened? English learned that no matter how effective online consumer crowds may be, full-blown change still takes the passion and energy of committed individuals. "If you're going to try to do a standard," he says, "you need someone who is really going to drive it." English admits his busy schedule as chief technology officer of fast-growing travel search engine played a part in the slowdown, but he never intended the site to take up much of his time in the first place: "I wanted the citizens of the Web to run this."

Leading a movement is a thankless task that requires a great deal of passion and selfless sacrifice.

I suspect that Microsoft and Nuance pulled one of the oldest tricks that establishments pull when they need to neutralize an agitator: flatter them and pretend to join them.

Also: I think the main reason that English lost steam is that he had no strategy for reform: he agitated and that was that.

Agitating is useful and necessary. But reform requires a long-term program with the patient building of allies and the steady shaping of opinion.

But I don't think his efforts have been in vain at all. The movement truly needs to be picked up from within the industries that deploy Speech IVR solutions with bridges to consumer watch dogs to put steady pressure on companies to invest in the deployment of quality speech solutions.

Wednesday, February 20, 2008

Content vs. Form -- more on Balentine's book....

Continuing with my reading of Bruce Balentine's book. I am now in the thick of his argument against building "bad humans" and for building "good machines" -- this is articulated explicitly in essays #94-#97 (pp. 265-279).

I've expressed in a previous post my sympathy with the core idea of moving away from the paradigm of building machines that mimic humans and towards a more grounded paradigm where the focus is on helping people solve problems as efficiently as possible.

There are several tensions that have come up in this proposition, though, which I will tackle in the next few posts.

Here is the first: I am not comfortable with the neat dichotomy that Balentine seems to rely on between form and content. He seems to suggest that there is a clean way of slicing off "useless fluff" (my expression) and getting to the heart of the matter that is just as effective as language with "the fluff."

Example: should a system say "thanks" after you give it a piece of information it asked for? From what I can tell, that's a bit of the machine trying to sound human, as far as Balentine is concerned. But in my view, the "thanks" is not "just" an expression of gratitude (which, granted, is primarily a social transaction). It also serves the functional role of alerting you that the system has heard you and that it has accepted your input, and that the interaction was completed successfully.

How about this one: the system is retrieving information from a back-end and things are taking a bit longer than usual, and so it says, "Sorry for the delay. Looks like the system is slow today."

Is that too much anthropomorphism? Too much chit-chatting? I don't think so. It's very useful information: tells me that the system is still working on things and that it is indeed slower than usual. And you know what, I don't mind the fact that the system "realizes" or acknowledges that I don't like waiting. Not that I am fooled into thinking that the machine has feelings. And yet, I like the "sorry" message. It is useful.

More tensions tomorrow...

Tuesday, February 19, 2008

Web vs. Phone Self-Service....

A blog I have started reading recently is Service Untitled. This is a great, active blog to read, with lots of very useful insights and information from people who truly care and think a lot about customer service.

Their latest post focused on self-service FAQs on the web and the author offered the following recommendations about what a good FAQ should be and should not be:

  1. It should not be forced. Companies should never require their customers or users to use self-service. They can suggest it or make it more noticeable, but they should never force it.
  2. It should be intelligent. FAQs and self-service options that are static are worthless. The systems should update based on popularity, helpfulness, etc. There should also be humans watching the self-service systems and how customers are using them. Use Google Analytics if your system doesn’t already have an analytics tool.
  3. It should ask for suggestions. Like Google and LucasArts, good self-service centers should ask if articles were helpful, if they helped resolve issues, etc. To take it a step further, human representatives should ask if customers tried self-service. If they say no, ask why. The answers may be surprising.
  4. It should be up-to-date. There are very few things that are less helpful than an out of date help center. Make sure yours stays up-to-date and contains relevant information.
  5. It should be easy to navigate. It should also be easy to search. Make sure your help center is easy to navigate. It should be easy to go back, easy to explore relevant entries, and all of those good things.
Very interesting how the 5 requirements above closely match some best practices in phone self-service: (1) don't hide the zero out option, (2) use information that you can gather about the caller and the context to serve them intelligently, (3) get feedback from callers and monitor caller satisfaction, (4) offer accurate information and solve caller problems, (5) make sure the call flow is coherent and rational (for instance, don't drive people crazy with jumbled menus full of holes - "For Support, press 1, for sales press 7, for billing press 5," etc.).

See more about this in an article I recently published in TMCNet: "Treat Humans Humanely and they Might Just Like IVR".

Thursday, February 14, 2008

Piece in ASRNews....

A piece we wrote for the ASRNews newsletter came out earlier this week. Here it is reproduced. Please consider subscribing to the newsletter. It's chock full of must-know information.


Back to the Future: Bleeding-edge IVR

Imagine this: every Saturday morning, the first thing you do even before you fully open your eyes is to reach for your cell phone, fling it open, press the "9" key, press the "Call" button, place the cell phone against your ear and engage an Interactive Voice Response (IVR) system as follows:

System: Hi there! The last four digits?
You: 0817.
System: Ok. Hang on. Your balance is 5,235 dollars and 23 cents. Anything else?
You: No.
System: Great. Goodbye.

At which point you would flip your cell phone shut and then roll back to sleep. The whole interaction would have taken you between 20 and 30 seconds, no more.

Compare this to getting your information from the web. If you are like us and you use a desktop at home, it means you would have had to get out of your bed, walk to the room where the desktop is, turn the computer's monitor on, click on the tab that points to your bank's login page, type the login credentials, and then navigate to where your checking balance is displayed. After that, you log out from the account and bring the browser down (to minimize any security risks), switch the desktop's monitor off, shuffle back to your bed, and finally get back to sleep. At best, it would have taken you between 4 and 5 minutes.

What if you had a laptop? Well, maybe you would be able to shave a minute or so off, but only if you had the laptop nearby and it was connected to the Internet (which probably means that you have WIFI at home).

What if you had a PDA (Blackberry, Palm, iPhone, etc.)? You wouldn't have had to get up from your bed, right? Yes, but have you tried navigating the Internet with any of those devices? At best, it is less than a gratifying experience, but usually it is downright painful. The iPhone has made great strides over its other PDA competitors in the display of web pages, but it took a step backward in information entry: it is relatively easier to type with a Blackberry or a Palm than it is with an iPhone. "Relatively easier," because typing with the Blackberry or the Palm is no trivial skill to acquire.

So, then, it turns out that the most cutting edge technologies (desktops, laptops, PDAs) do not compare well at all with our humble phone when it comes to the simple task we described above.

What does it tell us? Simply that IVR technology is here to stay. It is here to stay because for certain tasks, it can do the job more cheaply, more quickly, and with less effort on the part of the end user than any of the most cutting edge communication technologies out there today.

But then you ask: so why do people hate IVRs? Why do they groan and shake their head in dismay when they realize that they are about to interact with a machine over the telephone?

The answer is simple: because most IVRs are atrociously designed. The interaction we described above is not your typical exchange between a user and an IVR system. Your typical IVR would have greeted you with some 30 seconds of chest-thumping messaging about the company, followed by some mindless instructions, such as, "For English, press 1," or "Please listen carefully as our menu options have changed;" would have listed a long menu of options, would have required you to select the "check balance" option, then the "checking account" option, then would have required you to enter your full checking account number, then, for security purposes, a PIN, and only then would have finally given you the balance. A grueling 3 or 4 minutes would have gone by – and you would have had to get up from your bed and retrieve your checkbook, unless you were so organized as to have the checkbook nearby, or had committed to memory your 14-digit checking account number – etc!

So, what did it take to have the IVR system we described initially behave as it did?

Here are the keys to its effectiveness: (1) it recognized who the caller was, (2) it knew that they were calling to retrieve their checking account balance, (3) it did not waste time talking, but said only what it needed to say, no more, and (4) it let the caller speak back their answers.

Can this interaction be implemented with today's technology? Absolutely. With the caller ID and the last four digits of the caller's checking account (easy to memorize, especially if you are calling once a week), the user can be identified and validated, and the checking account balance retrieved and spoken back to the user in a matter of seconds. With some intelligence in the back-end (a simple Naïve Bayesian algorithm would amply do), the system can quickly learn by itself that every Saturday morning, this particular customer will call to ask for their checking balance. With that knowledge, the system can adapt its interaction to shorten all of its verbal prompts to the bare minimum (e.g., "The last four digits" rather than "The last four digits of your checking account number"), ask only for the information needed to accomplish its task, and then execute that task. And with the current state of Speech recognition, letting the user speak back the last four digits of their account and say "No" are trivial tasks.
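To make the "intelligence in the back-end" idea a bit more concrete, here is a minimal sketch of a Naive Bayes predictor that guesses why a given caller is calling from the day and time of the call. The class name, the two features (weekday and part of day), and the intent labels are all our own assumptions for illustration, not a description of any deployed system.

```python
import math
from collections import Counter, defaultdict


class IntentPredictor:
    """Tiny categorical Naive Bayes over call features (hypothetical sketch)."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing  # Laplace smoothing for unseen values
        self.intent_counts = Counter()
        # feature_counts[intent][feature_index][value] -> number of calls
        self.feature_counts = defaultdict(lambda: defaultdict(Counter))
        self.feature_values = defaultdict(set)

    def record_call(self, features, intent):
        """Store one past call, e.g. (("Sat", "morning"), "check_balance")."""
        self.intent_counts[intent] += 1
        for i, value in enumerate(features):
            self.feature_counts[intent][i][value] += 1
            self.feature_values[i].add(value)

    def predict(self, features):
        """Return the most likely intent, or None if there is no history."""
        if not self.intent_counts:
            return None
        total = sum(self.intent_counts.values())
        best_intent, best_score = None, -math.inf
        for intent, count in self.intent_counts.items():
            score = math.log(count / total)  # log prior
            for i, value in enumerate(features):
                seen = self.feature_counts[intent][i][value]
                n_values = len(self.feature_values[i] | {value})
                score += math.log(
                    (seen + self.smoothing) / (count + self.smoothing * n_values)
                )
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent


# One predictor per caller: four Saturday-morning balance checks, two transfers.
history = IntentPredictor()
for _ in range(4):
    history.record_call(("Sat", "morning"), "check_balance")
history.record_call(("Tue", "afternoon"), "transfer_funds")
history.record_call(("Tue", "evening"), "transfer_funds")
print(history.predict(("Sat", "morning")))
```

With a prediction in hand, the application could skip the main menu and go straight to the short "The last four digits?" prompt, falling back to the regular menu whenever there is no history for the caller.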

There is no reason, then, why every IVR system deployed out there today cannot be as effective as the one described above. Give the people a system that helps them, that solves their problem without wasting their time, and they will use it and love it every time.

Dr. Ahmed Bouzid heads the Partnerships program at Dr. Weiye Ma is an independent speech consultant. They are authors of the VUI Post blog at

Saturday, February 2, 2008

The Black List....

Someone at the VUI Designers Yahoogroups discussion board asked about why deployed VUIs out there continue to include the tired phrase of, "Please listen carefully as our options have changed." As some in the discussion pointed out, the phrase is either passively included (like all bad practices, people are doing it because other people are doing it), or actively added because someone in the design committee is afraid that power users will not listen to the options offered and will press the wrong old key and then get lost....

I suggest that next time you find yourself in a position of having to explain why you do not agree to including the prompt, simply explain your position as follows: Power users do not listen to prompts, and will certainly not notice the white noise language of "Please listen carefully as our options have changed" -- especially if it is played every time they call. The only way they will learn that an option has changed is for them to get lost once or twice. If there was indeed a drastic menu change and you desperately needed your callers to notice it, use something far more attention grabbing than mere language to signal the change (say, a double-ding sound followed by an announcement that the menu options have changed) -- and use it only when something has really changed, and for a limited period of time. And if you were really smart and cared about your callers, you would remember who has already listened to the menu change notification and so does not need to be bothered with it every time they call.
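The "only when something has really changed, for a limited period, and only once per caller" rule is easy to express in code. Below is a rough sketch; the class name, the 30-day default window, and the use of the caller's phone number as the key are my own assumptions, and a real deployment would keep the list of notified callers in a database rather than in memory.

```python
from datetime import date, timedelta


class MenuChangeNotice:
    """Decides whether a caller should hear the 'menu has changed' notice."""

    def __init__(self, changed_on, notice_days=30):
        self.changed_on = changed_on
        # The notice retires itself after the window closes.
        self.expires_on = changed_on + timedelta(days=notice_days)
        self.already_notified = set()

    def should_play(self, caller_id, today):
        # Outside the window: the menu is no longer "new", so stay quiet.
        if not (self.changed_on <= today < self.expires_on):
            return False
        # This caller has heard it once already: don't nag them again.
        if caller_id in self.already_notified:
            return False
        self.already_notified.add(caller_id)
        return True


notice = MenuChangeNotice(changed_on=date(2008, 2, 1))
print(notice.should_play("5551234", date(2008, 2, 2)))  # first call after change
print(notice.should_play("5551234", date(2008, 2, 9)))  # same caller, a week later
```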

Here are a few more expressions that I propose should be banned from a VUI design:

  • "Your call is very important to us"
  • "You can interrupt me at any time"
  • "For English, press one"
  • "Press X, NOW!" ("Now" being the thing to ban, especially in the context of listing options to press)
  • "Please select from the following menu options"
Let me know if you can think of more....

Sunday, January 27, 2008

Insight from Walt Tetschner

Have been engaged in a constructive email back and forth with Walt Tetschner. It started out nicely enough with him sending me a compliment about a piece I had recently published in TMCnet titled Treat Humans Humanely and They Might Just Like IVR, a gesture I appreciated very much.

In the compliment, however, I detected what I thought was a bit of condescension from Walt in him saying that he was glad to see that I had "found religion" by championing good VUI design. Needless to say, I pointed out to Walt that I had been engaged in championing good VUI design in my writings and in practice most of my professional life, etc., that it was strange that he would be glad to see me champion good VUI when the gethuman project had only bare bones advice in the web site about good VUI, etc., and the exchange went downhill from there for a few rounds.

But the back and forth took a sudden constructive turn when we started talking about the reasons why VUIs out there are so awful. My contention was that there are not enough skilled people out there that know how to do good VUI and that the vast majority of projects are badly managed. Walt however pointed out that even when good designers are engaged, the end result is often still awful. And he pointed to the attitude of call center managers who view callers as the enemy and who communicate such an attitude, subtly or otherwise, to everyone around them, including the VUI designer, who may very well come to the project wanting to produce the best interface ever, but then gets hammered and demoralized into accepting various mutilations of their design to the point where he/she no longer cares as long as they get paid and they can move on....

The insight is interesting and I need to think about it carefully. I have interacted with enough call center managers to grant that the proposition has merit, but I need to think more about whether the main cause is really about call center manager attitude, and if so, what that means in terms of developing a reform strategy....

Friday, January 25, 2008

Tips for effective menu design: Part III

9. Avoid mixing voice and DTMF menu choices

If your application is voice enabled, avoid cramming your menu prompts with instructions on how to pick menu items by voice and by DTMF. Avoid, for instance, wordings such as, “You can say ‘check balance’ or press 1, ‘open account’ or press 2, or ‘transfer funds’ or press 3.” Instead, first offer the leaner voice-only menu, “You can say, ‘check balance’, ‘open account’, or ‘transfer funds’,” and only if the user seems to have trouble with it, revert to the mixed prompt, “You can say ‘check balance’ or press 1, ‘open account’ or press 2, or ‘transfer funds’ or press 3.”
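Here is one way the "lean prompt first, mixed prompt only on trouble" logic could be sketched. The function names and the idea of counting failed attempts (no-inputs or no-matches) are my own assumptions; the point is simply that the mixed wording is generated only after the caller has stumbled.

```python
def join_with_or(items):
    """Join prompt fragments as 'a, b, or c'."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return items[0] + " or " + items[1]
    return ", ".join(items[:-1]) + ", or " + items[-1]


def build_menu_prompt(options, failed_attempts):
    """options is a list of (spoken phrase, DTMF key) pairs."""
    if failed_attempts == 0:
        # First try: the lean, voice-only menu.
        fragments = [f"'{phrase}'" for phrase, _ in options]
    else:
        # The caller had trouble: fall back to the mixed voice + DTMF menu.
        fragments = [f"'{phrase}' or press {key}" for phrase, key in options]
    return "You can say " + join_with_or(fragments) + "."


menu = [("check balance", "1"), ("open account", "2"), ("transfer funds", "3")]
print(build_menu_prompt(menu, failed_attempts=0))
print(build_menu_prompt(menu, failed_attempts=1))
```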

10. Use the same part of speech/clausal form when listing menu options

Bad prompt:

System: You can say, “Balance,” “Open,” or “Transfer.”

Good prompt:

System: You can say, “Check balance,” “Open account,” or “Transfer funds.”

11. Keep your menus consistent with one another

For example: in the opening menu, you ask the user to indicate whether or not they are a registered customer and then you branch off accordingly. Make sure that after the user indicates that they are a registered customer, none of the sub-menus offers options that apply only to non-registered customers (e.g., “To speak with one of our agents about becoming a registered customer, press 3”).

12. Let users ask, “What are my choices?”

At any point in the call, the user should be able to ask, “What are my choices?” The system should respond by, first, positioning the user in the menu tree, and then listing the menu items that the user can select from.

User: What are my choices?

System: We were transferring funds. Which account would you like to transfer funds from? You can say, “Checking,” “Savings,” or “Money Market.”
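A rough sketch of how such a reply could be assembled from the dialog state is shown below. The `MenuState` record and its field names are hypothetical; all the function does is position the user ("We were ...") and then relist the options that are valid at that point.

```python
from dataclasses import dataclass


@dataclass
class MenuState:
    task: str        # what the caller is in the middle of doing
    question: str    # the question the system is waiting on
    options: list    # what the caller can say right now


def choices_reply(state):
    """Build the answer to 'What are my choices?' for the current state."""
    quoted = [f'"{option}"' for option in state.options]
    if len(quoted) == 1:
        listed = quoted[0]
    elif len(quoted) == 2:
        listed = quoted[0] + " or " + quoted[1]
    else:
        listed = ", ".join(quoted[:-1]) + ", or " + quoted[-1]
    # Position the user first, then list what they can say.
    return f"We were {state.task}. {state.question} You can say, {listed}."


state = MenuState(
    task="transferring funds",
    question="Which account would you like to transfer funds from?",
    options=["Checking", "Savings", "Money Market"],
)
print(choices_reply(state))
```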

Friday, January 18, 2008

Tips for effective menu design: Part II

4. Use the construct, “You can say….”

If your application is speech-enabled, use the construct, “You can say….” to list the menu options.


System: You can say, “Books,” “Magazines,” or “Newspapers.”

5. Avoid the construct, “For X, say X, for Y, say Y, For Z, say Z.”

Simply rewrite the menu prompt as, “You can say, X, Y, or Z”. In cases where you can’t find the X, Y, or Z wordings that will accurately convey the meaning of the options, then use the construct “To A, say X, To B, say Y, To C, say Z,” where “To A” would briefly explain what the option means.


System: To get your current balance, say, “Check balance,” to open a new account, say, “Open account,” to transfer funds from one account to another, say, “Transfer funds.”

6. Don’t use, “Please select from the following options”

A tired phrase that needs to be retired.

7. Never allow holes in your DTMF choices

We say that a menu has a hole if the options presented are not sequential. A menu that offers the user the option to press “1,” “2,” or “4,” has a hole. A menu that offers the options, “1,” “2,” and “3,” does not.
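A check like this is simple enough to automate as part of a design review. The sketch below assumes that menu keys are the digits 1 through 9 and that a menu is expected to start at 1 (both assumptions of mine); the function name is made up for illustration.

```python
def find_dtmf_holes(keys):
    """Return the digits missing between 1 and the highest key offered."""
    present = {int(key) for key in keys}
    if not present:
        return []
    return [digit for digit in range(1, max(present) + 1) if digit not in present]


print(find_dtmf_holes(["1", "2", "4"]))  # a menu with a hole
print(find_dtmf_holes(["1", "2", "3"]))  # a menu without one
```

An empty list means the menu is sequential; anything else is the list of keys the caller would have to skip over.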

8. Mark position in the menu tree

A simple, “Main menu,” played prior to listing the menu items will reduce user confusion as to “where” they are in the dialog. The menu position marking becomes even more important as the user is led deeper into the menu tree. When you are leading a user down a menu path, list a menu header whenever you traverse a path and then list the sub-menu options. In case of a no-input or a no-match, list the full path prior to replaying the menu prompt.


System: Main menu: you can say, “Check balance,” “Withdraw funds,” or “Transfer funds.”

User: Transfer funds.

System: Transferring funds. Which account do you want to transfer funds from? You can say, “Checking,” “Savings,” or “Money Market.”

User: Savings.

System: Transferring funds from Savings.

Friday, January 4, 2008

Tips for effective menu design: Part I

Primitive as they may be, menus remain the most effective way to elicit information from users. The system offers a list of options, the user picks what they want, and the system moves on to the next step. Nothing could be more straightforward. And yet, one can easily design a difficult-to-use menu unless some basic principles are observed.

In the next few posts, I list 16 guidelines that should help you design usable menus.


1. Present the most requested items first

Not all menu items are created equal. If you know which items are requested most frequently, place those items at the head of the menu list.
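If your application logs which option each caller picked, the ordering can be derived from the logs rather than guessed. A minimal sketch, assuming a flat list of past selections (the function name and the log format are mine):

```python
from collections import Counter


def order_by_popularity(options, selection_log):
    """Most requested options first; ties keep their original order."""
    counts = Counter(selection_log)
    # sorted() is stable, so options with equal counts stay in menu order.
    return sorted(options, key=lambda option: -counts[option])


menu = ["Open account", "Transfer funds", "Check balance"]
log = ["Check balance"] * 5 + ["Transfer funds"] * 2
print(order_by_popularity(menu, log))
```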

2. Keep the menu list to 4 items or less

Because of the invisible nature of VUIs, try to keep your menus to four items or less. In case you need to present the user with more than four items, split the list into two, with the first list presenting the user with the items they are most likely to request, and access to the second list offered as the last option.
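The splitting rule can be sketched as follows. The "More options" label is my own placeholder wording, and the sketch assumes the incoming list is already ordered from most to least requested. It also generalizes slightly beyond two lists, chaining further lists if the menu is very long.

```python
def paginate_menu(options, max_items=4, more_label="More options"):
    """Split a long menu into lists of at most max_items (max_items >= 2).

    Every list except the last ends with an entry leading to the next list.
    """
    if len(options) <= max_items:
        return [list(options)]
    pages = []
    remaining = list(options)
    while len(remaining) > max_items:
        # Reserve the last slot on this list for the "more" option.
        pages.append(remaining[: max_items - 1] + [more_label])
        remaining = remaining[max_items - 1:]
    pages.append(remaining)
    return pages


sections = ["Books", "Magazines", "Newspapers", "Music", "Movies", "Games"]
for page in paginate_menu(sections):
    print(page)
```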

3. Keep the menu depth to 3 or less

People hate deep menus. They are exasperated by them. And the deeper the menu, the stronger the feeling that they are being led into a blind alley, with little hope to get to where they want to go. If you can’t keep your menu depth to 3 or less, go back to the drawing board and see if you can’t consolidate some of those tree branches.
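If the menu tree is kept in some structured form, the depth rule can be checked mechanically before the design goes out the door. The sketch below assumes a menu is a nested dictionary mapping each option to either a sub-menu or None for a leaf; that representation and the function name are my own choices for illustration.

```python
def menu_depth(menu):
    """Number of menu levels a caller may have to go through."""
    if not menu:
        return 0
    # One level for this menu, plus the deepest sub-menu below it.
    return 1 + max(
        menu_depth(sub) if isinstance(sub, dict) else 0
        for sub in menu.values()
    )


main_menu = {
    "Check balance": {"Checking": None, "Savings": None},
    "Transfer funds": {"From checking": {"To savings": None}},
    "Open account": None,
}
depth = menu_depth(main_menu)
print(depth, "levels:", "ok" if depth <= 3 else "back to the drawing board")
```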