Search results for "language processing"
showing 10 items of 421 documents
An Innovative Similarity Measure for Sentence Plagiarism Detection
2016
We propose and experimentally assess Semantic Word Error Rate (SWER), an innovative similarity measure for sentence plagiarism detection. SWER introduces a complex approach based on latent semantic analysis, which is capable of outperforming competitor methods in plagiarism detection accuracy. We provide the principles and functionalities of SWER, and we complement our analytical contribution with a significant preliminary experimental analysis. The derived results are promising and confirm the merit of our proposal.
Probing neural mechanisms of music perception, cognition, and performance using multivariate decoding.
2012
Recent neuroscience research has shown increasing use of multivariate decoding methods and machine learning. These methods, by uncovering the source and nature of informative variance in large data sets, invert the classical direction of inference that attempts to explain brain activity from mental state variables or stimulus features. However, these techniques are not yet commonly used among music researchers. In this position article, we introduce some key features of machine learning methods and review their use in the field of cognitive and behavioral neuroscience of music. We argue for the great potential of these methods in decoding multiple data types, specifically audio waveforms, e…
Conceptual Ontological Object Knowledge Base and Language
2008
This paper deals with AI with respect to knowledge acquisition and ontology base structure. The core of the system was designed as an object model to optimize it for further processing. Direct concept linking was used to ensure fast semantic network processing. Predefined attributes used in the core minimize the number of basic connections within the ontology and help in inference. The system is assumed to generate questions and to specify the knowledge. An AI system defined in this way opens up the possibility of better understanding such basic mechanisms of the human mind as learning or analyzing.
Validation of Semantic Analyses of Unstructured Medical Data for Research Purposes
2019
BACKGROUND: In secondary data there are often unstructured free texts. The aim of this study was to validate a text mining system to extract unstructured medical data for research purposes. METHODS: From a radiological department, 1,000 out of 7,102 CT findings were randomly selected. These were manually divided into defined groups by 2 physicians. For automated tagging and reporting, the text analysis software Averbis Extraction Platform (AEP) was used. Special features of the system are a morphological analysis for the decomposition of compound words as well as the recognition of noun phrases, abbreviations and negated statements. Based on the extracted standardized keywords, findings rep…
Chapter 3. Prosodic versatility, hierarchical rank and pragmatic function in conversational markers
2019
Sub-symbolic Mapping of Cyc Microtheories in Data-Driven “Conceptual” Spaces
2007
The presented work aims to combine statistical and cognitive-oriented approaches with symbolic ones so that a conceptual similarity relationship layer can be added to a Cyc KB microtheory. Given a specific microtheory, an LSA-inspired conceptual space is inferred from a corpus of texts created using both ad hoc extracted pages from the Wikipedia repository and the built-in comments about the concepts of the specific Cyc microtheory. Each concept is projected in the conceptual space and the desired layer of subsymbolic relationships between concepts is created. This procedure can help a user in finding the concepts that are "sub-symbolically conceptually related" to a new concept that he want…
An Intralingual Parallel Corpus of Translations into German Easy Language (Geasy Corpus): What Sentence Alignments Can Tell Us About Translation Stra…
2021
Parallel corpora are traditionally interlingual and contain source and target texts in different languages. However, intralingual translations into Easy Language (EL) are becoming more and more common in various countries. The first intralingual corpora have been built and investigated in terms of linguistic and structural features, but a translation-driven corpus linguistic approach is still missing, one that would empirically describe the strategies of Easy Language translation and the characteristics of translated texts, and make these parallel corpora usable for professionalising and automatising translation processes. In this paper, we introduce an intralingual parallel corpus of translations into G…
Invalid Syntax: NooJ Assisted Automatic Detection of Errors in Auxiliaries and Past Participles in Italian
2017
The work targets two areas of Italian morphosyntax: auxiliary selection (AS) and past participle agreement (PPA). In selecting such inflectional morphemes, learners of Italian commit frequent errors, even after a long period of constant study. We aim to enclose AS and PPA within the boundaries of NLP in order that a tool can be developed with a twofold purpose: first, it helps experts to build specific computer drills regarding AS and PPA; second, it assists self-taught learners in verifying whether their periphrastic sentences in Italian are well-turned. This area of Computer-Assisted Language Learning is currently poorly investigated. Further research might substantiate the importance of …
Métricas epistemológicas para modelos basados en fractales lingüísticos de PLN [Epistemological Metrics for Models Based on Linguistic Fractals in NLP]
2016
This work is part of a wider research project named BIOTECH that intends to assure the quality of linguistic modeling activity for automatic systems, making it possible to automate the management of words and natural language. Words are considered part of the complex articulation of language expressions. BIOTECH aims to take it as a tool to evaluate and track linguistic and verbal communication distortion in patients with Autistic Spectrum Disorder. The main contribution of this paper is to discuss the validity of fractals when used to model linguistic reasoning, and the relevance of considering not only statistics but also epistemology-related metrics. Furthermore, a set of metrics is introduced a…
The Effectiveness of LDOCE Definitions for Concrete and Abstract Nouns in Headword- and Picture-Identification Tasks
2021
LDOCE uses a defining vocabulary to make its definitions intelligible to the user. Critics claim this policy may result in imprecise definitions, something especially noticeable in certain concrete and abstract words that are difficult to define by a definition only. This paper examines to what extent LDOCE definitions of such words help learners identify the objects and words being defined. In our experiment on 381 learners of English as a foreign language, three groups of participants viewed different definition types: simplified definitions of LDOCE, unsimplified definitions of MWC, and definitions written in the learners’ mother tongue (UDPL/TR). The results show that the LDO…