Engineering Natural Language Interfaces: can CA help?
The people
The University of Sheffield, Department of Computer Science: Mark Hepple (PI) and Peter Wallis
Newcastle University, Linguistics and Language Sciences: Alan Firth and Christopher Jenks
The grant
EPSRC: Digital Economy: Feasibility Studies in novel ICT developments which allow early User adoption
Start date: 01 April, 2008, Duration: 12 months
Summary
Interactive Voice Response (IVR) is used extensively by business for
routine customer support, but the usefulness of these systems could be
improved if they were, well, less annoying. The hypothesis is that
natural language interfaces fail primarily because they say the wrong
thing, and that there are subtle but significant differences between
ways of saying things that such interfaces must take into account.
Current best practice for developing IVR systems is to use one's
intuitions about language and to simply write down what should be said
and when. Based on successes with statistical parsers, attempts have
been made to use machine learning to decide what to say next from a
corpus. This proposal is for a study to look at the feasibility of
using conversation analysis, or CA, to provide a structured approach
to the creation of IVR systems. Conversation analysis has been
evolving since the early 1960s and has become an accepted part of
applied linguistics [2]. A popular introduction to CA is Hutchby and
Wooffitt [4], but it has also had a recent revamp and revival (see ten
Have [12] and Seedhouse [11]). For the ideas of CA to contribute to
current work in computational linguistics, annotated corpora must be
created that apply those insights to enough real data to be useful to
researchers developing dialog systems, and possibly to provide data
for machine-learning-based systems. This
project will assess the feasibility of creating such resources,
particularly in terms of whether CA-based annotation schemes can be
developed for which reasonable levels of inter-annotator agreement can
be achieved, and for which adequate annotation throughput is possible.
Natural language interfaces have been "near future" technology since
the first days of computing, and CA may be the methodology to take us
that critical step closer.
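
As an illustration of what "reasonable levels of inter-annotator
agreement" could mean in practice, the sketch below computes Cohen's
kappa for two annotators who have labelled the same turns. This is an
assumption made for the example: the summary does not say which
agreement measure the project would use, and the label names
(greeting, question, answer, closing) are hypothetical rather than
taken from any CA annotation scheme.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative only: the agreement measure is an assumption,
    not something specified by the project.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same
    # label if each labels at random with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six dialogue turns.
annotator_a = ["greeting", "question", "answer",
               "question", "closing", "answer"]
annotator_b = ["greeting", "question", "answer",
               "answer", "closing", "answer"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so
a scheme whose labels are very unevenly distributed cannot score well
simply because both annotators favour the most common label.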