Search results for "language processing"
showing 10 items of 421 documents
Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora
2017
This paper presents the BUCC 2017 shared task on parallel sentence extraction from comparable corpora. It recalls the design of the datasets, presents their final construction and statistics and the methods used to evaluate system results. 13 runs were submitted to the shared task by 4 teams, covering three of the four proposed language pairs: French-English (7 runs), German-English (3 runs), and Chinese-English (3 runs). The best F-scores as measured against the gold standard were 0.84 (German-English), 0.80 (French-English), and 0.43 (Chinese-English). Because of the design of the dataset, in which not all gold parallel sentence pairs are known, these are only minimum values. We examined …
Design and evaluation of prosody-based non-speech audio feedback for physical training application
2011
Methodological support for the design of non-speech user interface sounds for human–computer interaction is still fairly scarce. To meet this challenge, this paper presents a sound design case which, as a practical design solution for a wrist-computer physical training application, outlines a prosody-based method for designing non-speech user interface sounds. The principles used in the design are based on nonverbal communicative functions of prosody in speech acts, exemplifying an interpersonal approach to sonic interaction design. The stages of the design process are justified with a theoretical analysis and three empirical sub-studies, which comprise production and recognition t…
Visual knowledge processing in computer-assisted radiology: A consultation system
1992
This paper presents Visual Heuristics, a consultation system for diagnosis based on thorax radiograph recording. Visual Heuristics uses both prototypical representations of physiological and pathological states and reasoning aimed at inferring conclusions from pathological or physiological conditions, establishing correspondences between pathological or physiological states and semantic descriptions of images. Images are assembled with groups of descriptors that guide the recognition process, making it possible to compare real images against 'expected' images. The system may be employed to generate a dynamic atlas that does not contain proper images, but generates them.
Grammars++ for modelling information in text
1999
Grammars provide a convenient means to describe the set of valid instances in a text database. Flexibility in choosing a grammar can be exploited to provide information modelling capability by designing productions in the grammar to represent entities and relationships of interest to database applications. Additional constraints can be specified by attaching predicates to selected nonterminals in the grammar. When used for database definition, grammars can provide the functionality that users have come to expect of database schemas. Extended grammars can also be used to specify database manipulation, including query, update, view definition, and index specification.
Editorial: Mining Scientific Papers: NLP-enhanced Bibliometrics
2019
Intent Detection System Based on Word Embeddings
2018
Intent detection is one of the main tasks of a dialogue system. In this paper we present our intent detection system, which is based on FastText word embeddings and a neural network classifier. We find a significant improvement in the FastText sentence vectorization. The results show that our intent detection system provides state-of-the-art results on three English datasets, outperforming many popular services.
Finnic data sets in the ELDIAdata databank
2019
Semantic models of musical mood: Comparison between crowd-sourced and curated editorial tags
2013
Social media services such as Last.fm provide crowd-sourced mood tags which are a rich but often noisy source of information. In contrast, editorial annotations from production music libraries are meant to be incisive in nature. We compare the efficiency of these two data sources in capturing semantic information on mood expressed by music. First, a semantic computing technique devised for mood-related tags in large datasets is applied to Last.fm and I Like Music (ILM) corpora separately (250,000 tracks each). The resulting semantic estimates are then correlated with listener ratings of arousal, valence and tension. High correlations (Spearman's rho) are found between the track positions in…
Word sense disambiguation combining conceptual distance, frequency and gloss
2004
Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…
The CogALex-IV Shared Task on the Lexical Access Problem
2014
The shared task of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex-IV) was devoted to a subtask of the lexical access problem, namely multi-stimulus association. In this task, participants were supposed to determine automatically an expected response based on a number of received stimulus words. We describe here the task definition, the theoretical background, the training and test data sets, and the evaluation procedure used for ranking the participating systems. We also summarize the approaches used and present the results of the evaluation. In conclusion, the outcome of the competition is a number of systems which provide very good solutions to the problem.