Search results for "natural language processing"
showing 10 items of 413 documents
Letter-case information and the identification of brand names.
2014
A central tenet of most current models of visual-word recognition is that lexical units are activated on the basis of case-invariant abstract letter representations. Here, we examined this assumption by using a unique type of words: brand names. The rationale of the experiments is that brand names are archetypically printed either in lowercase (e.g., adidas) or uppercase (e.g., IKEA). This allows us to present the brand names in their standard or non-standard case configuration (e.g., adidas, IKEA vs. ADIDAS, ikea, respectively). We conducted two experiments with a brand-decision task (‘is it a brand name?’): a single-presentation experiment and a masked priming experiment. Results in the s…
Morphological parsing with lexical transducers : a case study of OMorFi
2016
This thesis explores the task of morphological parsing, which is going from a written word to a representation of the units of meaning making up the word. The research objective is to investigate morphological parsing of Finnish with lexical transducers through a case study of OMorFi (Open Morphology for Finnish). The thesis also presents some linguistic and mathematical background as well as some techniques for constructing FSTs (Finite-State Transducers). The main results are an exposition and some analysis of OMorFi’s paradigms, stubs & stems language model, some comparison with related work and ideas for potential future work.
Applications of Pattern-driven Methods in Corpus Linguistics
2018
The use of corpora has conventionally been envisioned as being either corpus-based or corpus-driven. While the formal definition of the latter term has been widely accepted since it was established by Tognini-Bonelli (2001), it is often applied to studies that do not, in fact, fulfil the fundamental requirement of a theory-neutral starting point. This volume proposes the term pattern-driven as a more precise alternative. The chapters illustrate a variety of methods that fall under this broad methodology, such as the extraction of lexical bundles, POS-grams and semantic frames, and demonstrate how these approaches can uncover new understandings of both synchronic and diachronic linguistic p…
Leksinių samplaikų sąrašo tikslinimas: bandymas taikyti Formulex metodą
2017
A number of corpus studies focusing on the description of the use and functions of lexical bundles have been conducted recently in order to explore the phraseology of learner language. As with any studies of lexical bundles, the problem of overlapping or structurally incomplete items poses a particular challenge. In practice, it is often difficult to align such units with specific discourse functions. The fact that lexical bundles do not constitute neat form-and-meaning mappings results from, among other reasons, their being grounded in language use rather than language system. In this pilot study we attempt to test a new method called Formulex (Forsyth, 2015a; 2015b) to verify whether an applica…
LEXOP: a lexical database providing orthography-phonology statistics for French monosyllabic words.
1999
During the last 20 years, psycholinguistic research has identified many variables that influence reading and spelling processes. We describe a new computerized lexical database, LEXOP, which provides quantitative descriptors about the relations between orthography and phonology for French monosyllabic words. Three main classes of variables are considered: consistency of print-to-sound and sound-to-print associations, frequency of orthography-phonology correspondences, and word neighborhood characteristics.
Design, development and validation of a system for automatic help to medical text understanding
2020
Objective: The paper presents a web-based application, SIMPLE, that facilitates medical text comprehension by identifying the health-related terms of a medical text and providing the corresponding consumer terms and explanations. Background: The comprehension of a medical text is often a difficult task for laypeople because it requires semantic abilities that can differ from one person to another, depending on his/her health-literacy level. Some systems have been developed for facilitating the comprehension of medical texts through text simplification, either syntactical or lexical. The ones dealing with lexical simplification usually replace the original text and do not provide additi…
MĪLENBAHA-ENDZELĪNA LATVIEŠU VALODAS VĀRDNĪCĀ IEKĻAUTĀS BĀRTAS IZLOKSNES LEKSIKAS VISPĀRĪGS RAKSTUROJUMS
2021
This article gives an insight into the vocabulary of one of the sub-dialects of the Central dialect of the Latvian language – namely, the sub-dialect spoken in Bārta, a place in South-Western Kurzeme. The focus of the article is on those lexical units of the Bārta sub-dialect that are included in one of the most important works of Latvian linguistics – the Latvian Language Dictionary (1923–1932) and its Appendix (1934–1946), compiled and published by Kārlis Mīlenbahs, Jānis Endzelīns and Edīte Hauzenberga. The material analyzed here is taken from the electronic version of the Latvian Language Dictionary (www.tezaurs.lv/mev). The vocabulary of the Bārta sub-dialect is represented there by a…
Graphic representation of data resulting from measurement comparison trials in cataract and refractive surgery.
2003
BACKGROUND AND OBJECTIVES: The evaluation of new diagnostic measurement devices allows intraindividual comparison with an established standard method. However, reports in journal articles often omit the adequate incorporation of the intraindividual design into the graphic representation. PATIENTS AND METHODS: This article illustrates the drawbacks and the possible erroneous conclusions caused by this misleading practice in terms of recent method comparison data resulting from axial length measurement in 220 consecutive patients by both applanation ultrasound and partial coherence interferometry. RESULTS: Graphic representation of such method comparison data should be based on boxplots…
Dictionnaire Bilingue, Syntaxe et Sémantique
1995
We have first shown that conventional and theoretical lexicography has always had a semantic basis which does not permit homogeneous analysis of the items which constitute the entries. We present in this paper a model of a bilingual dictionary whose fundamental principles are syntactical. This dictionary is an application of the methods of the Lexique-grammaire des langues romanes adapted to a contrastive viewpoint. This contrastive aim implies that we have to make specific analytic distinctions, which we comment on here. On the one hand, from a theoretical standpoint, this dictionary may help to establish a suitable basis for systematic contrastive (French/Spanish) linguistics, and on the othe…
Mapping wordnets from the perspective of inter-lingual equivalence
2017
This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and the plWordnet (pl. Slowosiec) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesaurus. The development of wordnets for different languages has given rise to many wordnet linking projects (e.g. EuroWordNet, Vossen, 2002). Regardless of the linking method used, these projects require defining rules for establishing equivalence links between wordnet building bloc…