Search results for " language processing"
showing 10 items of 415 documents
Recall of common and uncommon words from pure and mixed lists
1980
Recall of high- and low-frequency words in the conventional free recall paradigm was compared with recall of the same words when subjects were required to count backward before and after the presentation of each word. The addition of this distractor task was associated with a reduction in the high-frequency advantage otherwise found with pure lists containing only high- or low-frequency words. This finding is attributed to the disruption of organizational processes. In contrast, the low-frequency advantage found with conventional presentation of mixed lists, containing high- and low-frequency words, was not reduced by distraction. These findings indicate that the frequency effects obtained …
Linguistic interpretation of speech errors
2016
The paper attempts to illustrate the linguistic interpretation of speech, a problem that remains insufficiently resolved, especially for Romanian. The cause is the multitude of criteria that can or should be considered important in speech processing. The aim of this study is to develop a computational tool to identify possible errors related to the morphosyntactic structure of speech. Our goal is to assist users, who can automatically receive different suggestions that help them improve the quality of their text. Thus, we chose an interdisciplinary approach through speech analysis that brings together the key fields of linguistics, computer science and so on…
Validation of Semantic Analyses of Unstructured Medical Data for Research Purposes
2019
BACKGROUND: In secondary data there are often unstructured free texts. The aim of this study was to validate a text mining system to extract unstructured medical data for research purposes. METHODS: From a radiological department, 1,000 out of 7,102 CT findings were randomly selected. These were manually divided into defined groups by 2 physicians. For automated tagging and reporting, the text analysis software Averbis Extraction Platform (AEP) was used. Special features of the system are a morphological analysis for the decomposition of compound words as well as the recognition of noun phrases, abbreviations and negated statements. Based on the extracted standardized keywords, findings rep…
Measuring Semantic Coherence of a Conversation
2018
Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic relations between concepts introduced during a conversation. We propose and evaluate graph-based and machine learning-based approaches for measuring semantic coherence using knowledge graphs, their vector space embeddings and word embedding models, as sources of background knowledge. We demonstrat…
A practical solution to the problem of automatic word sense induction
2004
Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word's senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation…
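The idea in the abstract above, that the senses of an ambiguous word can be observed directly in its local contexts, can be illustrated with a minimal sketch. It is not the paper's algorithm: the greedy single-pass clustering, the `STOPWORDS` set, the window size, and the similarity threshold are all assumptions chosen to keep the example short.

```python
from collections import Counter
import math

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "on", "of", "she", "he", "to"}

def context_vector(tokens, index, window=3):
    """Bag of content words within `window` tokens of position `index`."""
    lo = max(0, index - window)
    neighbours = tokens[lo:index] + tokens[index + 1:index + 1 + window]
    return Counter(t for t in neighbours if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def induce_senses(sentences, target, threshold=0.2, window=3):
    """Greedily group occurrences of `target` by local-context similarity.

    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [int]}
    for i, sentence in enumerate(sentences):
        tokens = sentence.lower().split()
        for j, token in enumerate(tokens):
            if token != target:
                continue
            vec = context_vector(tokens, j, window)
            best, best_sim = None, threshold
            for cluster in clusters:
                sim = cosine(vec, cluster["centroid"])
                if sim >= best_sim:
                    best, best_sim = cluster, sim
            if best is None:
                # No existing cluster is similar enough: start a new sense.
                clusters.append({"centroid": Counter(vec), "members": [i]})
            else:
                best["centroid"].update(vec)
                best["members"].append(i)
    return [c["members"] for c in clusters]

sentences = [
    "she sat on the river bank fishing",
    "the bank approved the loan money",
    "fishing boats lined the river bank",
    "the bank raised loan rates",
]
senses = induce_senses(sentences, "bank")
```

On this toy input the occurrences of "bank" fall into two groups, one for the river contexts and one for the loan contexts. Because each occurrence is represented by its own local context, no global mixture vector has to be decomposed.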
Does orthographic processing emerge rapidly after learning a new script?
2021
Epub 2020 Aug 11. Orthographic processing is characterized by location-invariant and location-specific processing (Grainger, 2018): (1) strings of letters are more vulnerable to transposition effects than the strings of symbols in same-different tasks (location-invariant processing); and (2) strings of letters, but not strings of symbols, show an initial position advantage in target-in-string identification tasks (location-specific processing). To examine the emergence of these two markers of orthographic processing, we conducted a same-different task and a target-in-string identification task with two unfamiliar scripts (pre-training experiments). Across six training sessions, participants …
Manulex-infra: Distributional characteristics of grapheme-phoneme mappings, and infralexical and lexical units in child-directed written material
2007
It is well known that the statistical characteristics of a language, such as word frequency or the consistency of the relationships between orthography and phonology, influence literacy acquisition. Accordingly, linguistic databases play a central role by compiling quantitative and objective estimates about the principal variables that affect reading and writing acquisition. We describe a new set of Web-accessible databases of French orthography whose main characteristic is that they are based on frequency analyses of words occurring in reading books used in the elementary school grades. Quantitative estimates were made for several infralexical variables (syllable, grapheme-to-phoneme mappi…
A word prediction methodology for automatic sentence completion
2015
Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…
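To make the asymmetric Word-Word frequency matrix in the abstract above concrete, here is a rough sketch of an order-sensitive co-occurrence count and a simple count-based score for candidate fill-in words. It deliberately omits the Latent Semantic Analysis step (the dimensionality reduction), and the scoring function is an assumption for illustration, not the function proposed in the paper.

```python
from collections import defaultdict

def build_cooccurrence(sentences, window=2):
    """Count how often `right` follows `left` within `window` tokens.

    The matrix is asymmetric: counts[a][b] and counts[b][a] differ in general.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            for right in tokens[i + 1:i + 1 + window]:
                counts[left][right] += 1
    return counts

def score_candidates(counts, left_context, right_context, candidates):
    """Score each candidate by its co-occurrence with the surrounding words.

    Hypothetical scoring: sum of counts for (left word -> candidate) plus
    counts for (candidate -> right word).
    """
    scores = {}
    for cand in candidates:
        score = sum(counts.get(w, {}).get(cand, 0) for w in left_context)
        score += sum(counts.get(cand, {}).get(w, 0) for w in right_context)
        scores[cand] = score
    return scores

corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
]
counts = build_cooccurrence(corpus)

# Fill the gap in "the cat ___ milk".
scores = score_candidates(counts, ["the", "cat"], ["milk"], ["drinks", "chases"])
best = max(scores, key=scores.get)
```

In this toy corpus "drinks" scores higher than "chases" for the gap. An LSA-based variant would factorize the count matrix first and score candidates in the reduced space.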
A Methodology for Bilingual Lexicon Extraction from Comparable Corpora
2015
Dictionary extraction using parallel corpora is well established. However, for many language pairs parallel corpora are a scarce resource, which is why in the current work we discuss methods for dictionary extraction from comparable corpora. The aim is to push the boundaries of current approaches, which typically utilize correlations between co-occurrence patterns across languages, in several ways: 1) Eliminating the need for initial lexicons by using a bootstrapping approach which only requires a few seed translations. 2) Implementing a new approach which first establishes alignments between comparable documents across languages, and then computes cross-lingual alignments between wor…
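The baseline that the abstract above describes as typical, correlating co-occurrence patterns across languages with the help of seed translations, can be sketched as follows. This shows only that common baseline under simplifying assumptions (toy corpora, a hypothetical `seeds` dictionary, cosine similarity); it does not implement the bootstrapping or document-alignment methods the paper proposes.

```python
from collections import Counter
import math

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_translations(word, src_vectors, tgt_vectors, seeds):
    """Rank target-language words as translation candidates for `word`.

    The source context vector is projected into the target vocabulary via
    the seed lexicon, then compared with every target context vector.
    """
    projected = Counter()
    for ctx, count in src_vectors[word].items():
        if ctx in seeds:
            projected[seeds[ctx]] += count
    known = set(seeds.values())  # seed translations are already assigned
    ranked = sorted(
        ((cosine(projected, vec), tgt)
         for tgt, vec in tgt_vectors.items() if tgt not in known),
        reverse=True,
    )
    return [(tgt, score) for score, tgt in ranked]

english = ["the dog eats meat", "the cat drinks milk"]
german = ["der hund frisst fleisch", "die katze trinkt milch"]
# Hypothetical seed lexicon (source word -> target word).
seeds = {"eats": "frisst", "meat": "fleisch", "drinks": "trinkt", "milk": "milch"}

ranking = rank_translations(
    "dog", cooccurrence_vectors(english), cooccurrence_vectors(german), seeds
)
```

Here "hund" ranks first for "dog" because both words co-occur with the seed pair eats/frisst and meat/fleisch. A bootstrapping variant would add high-confidence pairs to `seeds` and repeat.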
Adaptive Vocabulary Learning Environment for Late Talkers
2016
The main aim of this research is to provide children who have an early language delay with an adaptive way to train their vocabulary, taking into account the individuality of the learner. The suggested system is a mobile game-based learning environment which provides simple tasks where the learner chooses a picture that corresponds to a played back sound from multiple pictures presented on the screen. Our basic assumption is that the more similar the concepts (in our case, words) are, the harder the recognition task is. The system chooses the pictures to be presented on the screen by calculating the distances between the concepts in different dimensions. The distances are considered to consist o…