Institut des Technologies Multilingues & Multimédias de l’Information
Institute for Multilingual & Multimedia Information
Institut für Multilinguale & Multimediale Informationsverarbeitung

Invited Talks

Friday 15 November 2013


Invited speakers

Sarah HAWKINS (University of Cambridge - GB)
Illusions, contexts and domains of analysis in assessing errors in speech processing
The types of errors made in speech recognition models stem from a wide range of factors. Restricted domains of application and probability-based models have been fruitful and have influenced psychological models of human speech processing for the better, yet significant obstacles remain to bringing the performance of recognition models closer to that of human listeners. The ubiquity and power of top-down influences on speech processing by humans presumably challenge machine recognisers working from an acoustic signal alone. How do humans (or machines) achieve an efficient balance between long-domain pattern perception and monitoring for local phonetic detail? What status do standard linguistic units have in a system that responds effectively to rapidly changing evidence as the signal unfolds in time? I will review data that suggest that efficient speech processing uses multiple domains of analysis, distinguished by communicative function. The function affects the speech style and the places in discourse that provide the greatest predictive power, and functions may change within an utterance. These processes are circular and mutually reinforcing.
Julia HIRSCHBERG (Columbia University - USA)
Identifying and Responding to Errors in Spoken Dialogue Systems
Today’s spoken dialogue systems rely on simple strategies to obtain user input when they believe they have not recognized a user’s utterance correctly, such as: “I’m sorry, I didn’t understand you. Can you please repeat?” We are developing new methods for eliciting targeted information from users about predicted errors in spoken dialogue systems, based upon a number of studies of how humans react to errors in dialogue. These methods require more precise localization of automatic speech recognition errors, which we will also describe.
Josef KITTLER (University of Surrey - GB)
To fuse or not to fuse
Multiple classifier fusion is a popular approach to enhancing the performance of pattern recognition systems. Multiple classifier systems can be built explicitly by combining the outputs of classifier designs that exploit different sources of information, architectures, data distribution models, training set perturbation schemes, feature spaces, training process parameters, etc. The design of a multiple classifier system inevitably leads to an overproduction of component classifiers, which raises two questions: which of them to use, and how to fuse them. By analysing classification error, the talk aims to identify the issues raised by classifier fusion and to suggest fusion schemes worth exploring in future research.
Mark LIBERMAN (University of Pennsylvania - USA)
Machine learning from machine learning errors: Work in progress on parsing and phonetic alignment
Mathematical modeling of human errors has long played an important role in theories of human perception, and human interpretation of machine errors has always played a central role in guiding algorithmic improvements. This talk will feature three less obvious applications of human-machine interaction in predicting, identifying and responding to errors: eliminating untrustworthy machine output from research datasets; improving productivity and quality in semi-automatic annotation by better management of the human/machine division of labor; and adjusting machine output towards human-annotation norms. None of these applications are new, but recent changes create new opportunities for developing and deploying them.
Lucia SPECIA (University of Sheffield - GB)
Is this translation fit for purpose? Predicting quality versus predicting errors