Search results for "language processing"
Showing 10 of 421 documents
Chapter 11. Computational representation of FrameNet for multilingual natural language generation
2021
Mining Interpretable Rules for Sentiment and Semantic Relation Analysis Using Tsetlin Machines
2020
Tsetlin Machines (TMs) are an interpretable pattern recognition approach that captures patterns with high discriminative power from data. Patterns are represented as conjunctive clauses in propositional logic, produced using bandit-learning in the form of Tsetlin Automata. In this work, we propose a TM-based approach to two common Natural Language Processing (NLP) tasks, viz. Sentiment Analysis and Semantic Relation Categorization. By performing frequent itemset mining on the patterns produced, we show that they follow existing expert-verified rule-sets or lexicons. Further, our comparison with other widely used machine learning techniques indicates that the TM approach helps maintain inter…
Application of the Error Correcting Grammatical Inference Method (ECGI) to Multi-Speaker Isolated Word Recognition
1988
It is well known that speech signals constitute highly structured objects which are composed of different kinds of subobjects such as words, phonemes, etc. This fact has motivated several researchers to propose different models which more or less explicitly assume the structural nature of speech. Notable examples of these models are Markov models /Bak 75/, /Jel 76/; the famous Harpy /Low 76/; Scriber and Lafs /Kla 80/; and many other works in which the convenience of some structural model of the speech objects considered is explicitly claimed /Gup 82/, /Lev 83/, /Cra 84/, /Sca 85/, /Kam 85/, /Sau 85/, /Rab 85/, /Kop 85/, /Sch 85/, /Der 86/, /Tan 86/.
A practical solution to the problem of automatic word sense induction
2004
Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word's senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation…
Automatic identification of word translations from unrelated English and German corpora
1999
Algorithms for the alignment of words in translated texts are well established. However, only recently have new approaches been proposed to identify word translations from non-parallel or even unrelated texts. This task is more difficult, because most statistical clues useful in the processing of parallel texts cannot be applied to non-parallel texts. Whereas for parallel texts in some studies up to 99% of the word alignments have been shown to be correct, the accuracy for non-parallel texts has been around 30% up to now. The current study, which is based on the assumption that there is a correlation between the patterns of word co-occurrences in corpora of different languages, makes a sign…
Read&Answer, A Tool to Capture on-Line Processing of Electronic Texts
2009
This paper presents Read&Answer, a tool that records reading times, one of the main on-line methods employed in text processing research. Read&Answer allows the recording, analysis and interpretation of the learner's processing in order to test specific hypotheses and explain final comprehension results. First, we describe the tool, and then we briefly explain some research studies using it. We show how Read&Answer can be used in combination with another on-line method extensively employed in text processing research, i.e., verbal protocols, and we also compare Read&Answer with eye movement tracking, a widely accepted on-line reading times technique.
Implicit learning
2008
All of us have learned much about language, music, the physical or social environment, and other complex domains without any intentional attempt to acquire information. This chapter first describes how studies investigating this form of learning in laboratory situations have shifted from a rule-based interpretation to interpretations assuming a progressive tuning to the statistical regularities of the environment. The next section examines the potential of statistical learning, and whether statistical learning stems from statistical computations or chunk formation. Then the senses in which this form of learning may be qualified as implicit are analysed. Finally, i…
Concept Maps for Comprehension and Navigation of Hypertexts
2013
Comprehension and learning with hypertexts are challenging due to the nonlinearity of such digital documents. Processing hypertexts may involve navigation and comprehension problems, leading learners to cognitive overhead. Concept maps have been added to hypertexts to reduce the cognitive requirements of navigation and comprehension. This chapter explores the literature to examine the effects of concept maps on navigation, comprehension, and learning from hypertexts. The literature review aims to elucidate how concept maps may contribute to processing hypertexts and under which conditions. In spite of the variability of concept maps used in hypertexts, some findings converge. Concept maps r…
Concordance Analysis
2011
Background: In this article, we describe qualitative and quantitative methods for assessing the degree of agreement (concordance) between two measuring or rating techniques. An assessment of concordance is particularly important when a new measuring technique is introduced.
Arabic Named Entity Recognition: A Feature-Driven Study
2009
The named entity recognition task aims at identifying and classifying named entities within an open-domain text. This task has been garnering significant attention recently as it has been shown to help improve the performance of many natural language processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely, support vector machines, maximum entropy and conditional random fields for the task of named entity recognition. Our language of interest is Arabic. We explore lexical, contextual and morphological features and nine data-sets of different genres and annotations. We measure the impact …