Sunday, July 1, 2007

The Invisibility of VUI

Perhaps the most frustrating thing about using a voice interface is not knowing precisely where you are in the interaction and what exactly the system expects you to do next. A well-designed web site shows visitors where they are in the menu tree, and even without a menu path indicator, a web page usually has enough visual cues to tell users where they are in the site (the URL being one simple indicator). Not so with a voice interface, where users can quickly feel lost for lack of mental markers that position them in the exchange with the system.

Mark the exchange: just as a well-designed web page indicates where in the web site a user is, a good voice interface tells users where they are in the menu tree. Usually, a word or two will suffice: “main menu” for the highest-level menu, “here are your flights” before announcing a list of flight numbers, etc.

Trace the path: in applications where the menu structure is deep and wide, users can very easily become confused about where they are in the interaction, even when you mark the individual menu levels. In such situations, you can pair each voice page that handles an interaction with a “position page” that traces the user’s position within the menu tree, starting from the main menu. “Restaurants, Chinese, Zip code”, for instance, would succinctly tell users that they chose “Restaurants”, then “Chinese”, and are now giving a zip code to locate Chinese restaurants in that area. You can achieve path tracing by using a message page with a prompt describing the path and the “Go back” option for “Actions”.
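The path-tracing idea can be sketched in a few lines of Python. This is only an illustration: the `MenuPath` class and its method names are hypothetical and not part of any particular voice platform. It keeps a stack of the menu labels the caller has chosen and turns it into the short spoken prompt a position page would play.

```python
class MenuPath:
    """Tracks the menus a caller has entered (hypothetical helper, not a platform API)."""

    def __init__(self):
        self._levels = []  # labels of the menus chosen so far, top level first

    def enter(self, label):
        """Record that the caller moved one level deeper."""
        self._levels.append(label)

    def go_back(self):
        """Return to the previous level; does nothing at the main menu."""
        if self._levels:
            self._levels.pop()

    def position_prompt(self):
        """Build the text a position page would speak."""
        if not self._levels:
            return "Main menu."
        return ", ".join(self._levels) + "."


path = MenuPath()
for choice in ("Restaurants", "Chinese", "Zip code"):
    path.enter(choice)
print(path.position_prompt())
```

Keeping the prompt to bare labels, rather than full sentences, keeps the position page short enough to be played often without slowing the caller down.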

Use earcons: an “earcon”, or “auditory icon”, is the voice equivalent of a graphical interface’s icon. An icon is a small graphic that means something specific in the context of the interaction: for instance, an arrow pointing to the right may mean go to the next page, and one pointing to the left may mean go back to the previous page. Earcons can be very useful in positioning the user within a menu structure or in announcing the type of action that is about to be undertaken. The sound of a keyboard clicking could be used to tell the user that the system is busy doing something (whereas dead silence may be taken by the user to mean that the system has crashed or the call has ended).
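One way to picture earcons in an application is as a small lookup table from interaction events to short sounds, played before any spoken prompt. The sketch below is an assumption-laden illustration: the event names, the audio file names, and the `render_turn` function are all hypothetical, and a real platform would supply its own way of queuing audio and speech.

```python
# Hypothetical mapping from interaction events to earcon audio files.
EARCONS = {
    "busy": "keyboard_clicks.wav",  # system is working; avoids dead silence
    "next": "chime_up.wav",         # moving forward to the next item
    "back": "chime_down.wav",       # returning to the previous level
}


def render_turn(event, prompt_text):
    """Return the ordered playback steps for one system turn.

    The earcon, if one is defined for the event, comes first so the caller
    hears what kind of action is happening before the spoken prompt starts.
    """
    steps = []
    sound = EARCONS.get(event)
    if sound is not None:
        steps.append(("audio", sound))
    if prompt_text:
        steps.append(("speech", prompt_text))
    return steps


print(render_turn("back", "Restaurants, Chinese."))
```

Using the same sound for the same event everywhere in the application is what lets the caller learn its meaning, much as a graphical icon is learned by repetition.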

Perhaps the one fundamental advantage that GUIs have over VUIs is the feeling of control that a GUI user has over both the medium and the interaction. A very bad GUI can certainly make one feel helpless and at the mercy of irrational forces, but it does take a very bad GUI to throw the user into a state of confusion. A VUI, on the other hand, because it is time-linear, uni-directional, and invisible, has to stumble only once in the interaction for the user to be thrown into a state of hopeless perplexity. Keeping in mind the key differences between designing a GUI and designing a VUI should help the alert VUI designer avoid the costly mistake of smuggling GUI assumptions into VUI design.
