RESEARCH PRODUCT

Towards the evaluation of automatic simultaneous speech translation from a communicative perspective

Claudio Fantinuoli, Bianca Prandi

subject

FOS: Computer and information sciences; Computer Science - Computation and Language; Machine translation; End user; Computer science; business.industry; media_common.quotation_subject; Sample (statistics); Context (language use); Intelligibility (communication); computer.software_genre; Speech translation; Quality (business); Artificial intelligence; business; Computation and Language (cs.CL); computer; Interpreter; Natural language processing; media_common

description

In recent years, automatic speech-to-speech and speech-to-text translation has gained momentum thanks to advances in artificial intelligence, especially in the domains of speech recognition and machine translation. The quality of such applications is commonly tested with automatic metrics, such as BLEU, primarily to assess improvements between releases or in the context of evaluation campaigns. However, little is known about how the output of such systems is perceived by end users or how it compares to human performance in similar communicative tasks. In this paper, we present the results of an experiment aimed at evaluating the quality of a real-time speech translation engine by comparing it to the performance of professional simultaneous interpreters. To do so, we adopt a framework developed for the assessment of human interpreters and use it to perform a manual evaluation of both human and machine performance. In our sample, the human interpreters performed better in terms of intelligibility, while the machine performed slightly better in terms of informativeness. The limitations of the study and possible enhancements of the chosen framework are discussed. Despite its intrinsic limitations, the use of this framework represents a first step towards a user-centric and communication-oriented methodology for evaluating real-time automatic speech translation.

https://doi.org/10.18653/v1/2021.iwslt-1.29