Research: Artificial Intelligence - Communication between machines and humans.

I come from a background in natural language processing and machine understanding, but in 2001 we did some experiments with an ECA(1) that quantified the extent to which people are interested in information versus their concern for a system to be polite(2). With notable exceptions (e.g. Dix and Dautenhahn), HCI research uses a "computer as tool" metaphor: it is up to the user to understand what the tool does and wield it correctly(X). Language does not work that way. Indeed, that is not how people communicate with their dogs(3). I am now firmly in the computers-as-social-actors camp and think that we engineers need to figure out how to work with the human sciences. Sabine Payr and I wrote a successful FP7 proposal that enabled us to put "robot rabbits" in older people's homes and record 300 or so interactions(4). What did we learn? Well, in general we found we don't know what to do with the data(5). Sabine used Grounded Theory, which was impressive, and I have been trying to fund a project to adapt Peter Abell's model of business case studies to human-robot interaction data. In the meantime, I am creating an IVR system that implements Michael Tomasello's 2008 position that human communication is intentional and cooperative(6), and at home, continuing the dog theme, I am trying to make our Roomba come when it's called(7).
Phone: (international) +44 161 247 6479
Fax: (international) +44 161 247 6350
Mobile: 0791 005 9137
Email: pwallis at acm.org
The idea of being able to talk to a machine has permeated our notion of the future and dates back at least to Turing and his test for intelligence. Machines that can hold a natural conversation, however, continue to elude us - it is one of the AI-complete problems. One might expect "natural language processing" (NLP) to be the sub-discipline that researches this area, but that field has lost interest and the problem has largely been taken up by the speech recognition community and, independently, by those interested in synthetic characters.
There is growing agreement that dialogue management is critical to speech-enabled applications. This paper describes a novel approach to knowledge acquisition in the natural language processing domain, and shows how techniques from cognitive task analysis can capture politeness protocols from a "dialogue expert." Acknowledging the importance of intention in mixed-initiative systems, our aim was to use an off-the-shelf Belief, Desire, and Intention (BDI) framework from Agent Oriented Software to provide the planning component, and to introduce plan library cards as a means of capturing expertise in this context.
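To make the plan library card idea concrete, here is a minimal sketch in Python rather than the Agent Oriented Software toolkit: one card becomes a BDI-style plan with a goal it serves, a context condition saying when it applies, and a body of sub-goals. The class, field and goal names are illustrative assumptions, not the framework's API.

```python
# A minimal sketch, assuming nothing about the Agent Oriented Software toolkit:
# a "plan library card" rendered as a BDI-style plan with a goal, a context
# condition (when the plan is applicable) and a body of sub-goals.

from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Plan:
    goal: str                            # the goal this card serves
    context: Callable[[Dict], bool]      # applicability test over current beliefs
    body: List[str]                      # ordered sub-goals / dialogue moves

# Two illustrative cards of the kind a "dialogue expert" might fill in.
library = [
    Plan(goal="open_call",
         context=lambda beliefs: not beliefs.get("greeted", False),
         body=["greet caller", "identify the system", "ask how to help"]),
    Plan(goal="decline_request",
         context=lambda beliefs: beliefs.get("request_unsupported", False),
         body=["apologise", "explain the limitation", "offer an alternative"]),
]

def select_plan(goal: str, beliefs: Dict) -> Optional[Plan]:
    """BDI-style plan selection: the first applicable plan for the current goal."""
    for plan in library:
        if plan.goal == goal and plan.context(beliefs):
            return plan
    return None

if __name__ == "__main__":
    plan = select_plan("open_call", {"greeted": False})
    print(plan.body if plan else "no applicable plan")
```

Capturing a politeness protocol then amounts to the expert filling in cards like these, with the BDI engine choosing among them at run time.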
Computers that can hold a conversation, such as chatbots, embodied conversational agents (ECAs) and automated call-handling IVR systems, are, on the agent model, autonomous systems situated in a social world. As social animals we humans rely on social norms that we are barely conscious of. In this paper it is argued that 1) these normative systems have a layered structure, and 2) current conversational agents only work at the top layer. People abuse such systems, not because they fail, but because their response to failure is inappropriate.
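The layering claim can be made concrete with a toy repair policy. The sketch below is an illustration of the idea, not the paper's implementation: the response to a recognition failure moves down the layers as failures accumulate, instead of the agent re-issuing the same prompt.

```python
# Illustrative sketch only: a failure is handled at progressively "deeper" social
# layers, rather than the agent blindly re-prompting as if nothing had gone wrong.

def respond_to_failure(failure_count: int) -> str:
    """Pick a response whose layer depends on how often we have already failed."""
    if failure_count == 0:
        # Top (task) layer: a simple re-prompt is still socially acceptable.
        return "Sorry, could you say that again?"
    if failure_count == 1:
        # Middle layer: acknowledge the breakdown rather than pretend it didn't happen.
        return "I'm having trouble understanding you, and that's my fault, not yours."
    # Lower (social) layer: repair the relationship by handing over gracefully.
    return "Let me put you through to a person who can help."

if __name__ == "__main__":
    for n in range(3):
        print(respond_to_failure(n))
```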
Speech recognition has come a long way but has limitations. We failed (dismally) to get speech recognition to work for casual conversation with a far-field microphone in the SERA project, but speech recognition over the telephone is stable technology with a considerable market. Call the bank to cancel a card and you will probably end up talking to a computer rather than a person. Why aren't such systems used more? Primarily because they are generally annoying. Here at CPM we are setting up an experimental IVR system along with the tools to measure task completion and user satisfaction. Having done that, we are inviting a few social scientists along to tell us how to improve it.
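As an indication of the kind of measurement the tooling is meant to support, the sketch below computes a task-completion rate and a mean satisfaction score from per-call records; the field names and the 1-5 rating scale are illustrative assumptions, not an actual logging schema.

```python
# Sketch: per-call records with a completion flag and a post-call rating
# (both hypothetical fields), summarised into two headline measures.

from statistics import mean

calls = [
    {"id": 1, "completed": True,  "satisfaction": 4},   # 1-5 post-call rating
    {"id": 2, "completed": False, "satisfaction": 2},
    {"id": 3, "completed": True,  "satisfaction": 5},
]

task_completion = sum(c["completed"] for c in calls) / len(calls)
user_satisfaction = mean(c["satisfaction"] for c in calls)

print(f"task completion: {task_completion:.0%}")
print(f"mean satisfaction: {user_satisfaction:.1f}/5")
```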
My interest in this case is HRI with non-humanoid physical machines. The term "body language" suggests that gesture can be treated as language, but it is really the other way round: spoken language works in the same way as body language. Rather than language being a system of signs that refer to things in the world, much of language as used depends on the hearer accounting for the speaker's actions. Why would a dog growl at someone approaching its food? Obvious to us, but not to a robot. The question becomes: what physical actions can a Roomba perform that will be systematically interpreted by the humans around it?
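As a sketch of what a small, readable repertoire might look like, assume a single motion primitive: the drive() stub below stands in for whatever command the robot's interface actually exposes, and the behaviours themselves are hypotheses about how people will read the movement.

```python
# Sketch only: drive() is a placeholder for the robot's real motion command,
# and each behaviour is a guess at what humans will systematically read into it.

import time

def drive(forward_mm_s: int, turn_deg_s: int) -> None:
    """Placeholder for the robot's low-level motion primitive."""
    print(f"drive(forward={forward_mm_s} mm/s, turn={turn_deg_s} deg/s)")

def come_when_called() -> None:
    """Turn towards the caller, approach slowly, stop short: readable as attending."""
    drive(0, 45)        # orient towards the voice
    time.sleep(1)
    drive(100, 0)       # approach gently
    time.sleep(2)
    drive(0, 0)         # stop and 'wait'

def back_off() -> None:
    """Reverse and turn away: readable as deference, 'I'll get out of your way'."""
    drive(-100, 0)
    time.sleep(1)
    drive(50, 60)       # arc away

if __name__ == "__main__":
    come_when_called()
    back_off()
```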
The SERA project set out to collect real human-robot interactions and study them. The data collected is rich, but we did not reach a consensus on what to do with it and, from a personal perspective, the methodologies we did use were unsatisfying. The paper provides a theoretical account of ethnomethods and outlines the approach that Abell and I were trying to get funded a few years back.