Search results for " corpus"
showing 10 items of 202 documents
Linguistic interpretation of speech errors
2016
The paper attempts to illustrate the linguistic interpretation of speech, a problem that remains insufficiently resolved, especially for Romanian. The cause is the multitude of criteria that can or should be considered important in speech processing. The aim of this study is to develop a computational tool to identify possible errors related to the morphosyntactic structure of speech. Our goal is to assist users, who can automatically receive different suggestions that help them improve the quality of their text. Thus, we chose an interdisciplinary approach through speech analysis that brings together the key fields of linguistics, computer science and so on…
Establishing Video Game Genres Using Data-Driven Modeling and Product Databases
2015
Establishing genres is the first step toward analyzing games and how the genre landscape evolves over the years. We use data-driven modeling that distils genres from textual descriptions of a large collection of games. We analyze the evolution of game genres from 1979 to 2010. Our results indicate that until 1990 many genres competed for dominance, but thereafter sport-racing, strategy, and action became the most prevalent genres. Moreover, we find that games vary to a great extent as to whether they belong mostly to one genre or to a combination of several genres. We also compare the results of our data-driven model with two product databases, Metacritic and Mobyga…
The International Comparable Corpus: Challenges in building multilingual spoken and written comparable corpora
2021
This paper reports on the efforts of twelve national teams in building the International Comparable Corpus (ICC; https://korpus.cz/icc) that will contain highly comparable datasets of spoken, written and electronic registers. The languages currently covered are Czech, Finnish, French, German, Irish, Italian, Norwegian, Polish, Slovak, Swedish and, more recently, Chinese, as well as English, which is considered to be the pivot language. The goal of the project is to provide much-needed data for contrastive corpus-based linguistics. The ICC corpus is committed to the idea of re-using existing multilingual resources as much as possible and the design is modelled, with various adjustments, on t…
Osseous oral hyaline ring granuloma mimicking a mandible tumor in a child with congenital agenesis of the corpus callosum
2017
Background: Hyaline ring granuloma (HRG) of the oral cavity is an uncommon disorder considered to be a foreign-body reaction resulting from implantation of food vegetable particles. Microscopically, it is characterized by the presence of structures of hyaline rings in an inflamed fibrous tissue background, which contains multinucleated giant cells. Material and methods: We present the case of a 4-year-old boy diagnosed with a mandible osseous HRG, which showed clinical and tomographic aspects suggestive of an aggressive bone tumor. Results: The patient underwent surgical exploration and histopathologic analysis showed fragments composed predominantly of widespread dense connective tissue with …
Construction grammars in the service of lexical morphology? Contributions to the description of term-nouns in German
2019
Context: The "taste" of the German language for lexical creativity, particularly in the nominal domain, is a truism. This is especially true in specialized discourse, where the need to name new concepts is met, besides the various forms of borrowing or formal neologisms, by the two classic processes of derivation and compounding. It is in this context that the present conference paper proposal aims to examine the contributions of construction grammars applied to these phenomena (following Booij (2017), we will speak of construction morphology) for an analysis of the syntax-sema…
From spontaneous terminology to planned terminology and vice versa: talking about espumante wines in Brazil
2018
[Context] The sparkling wine market in Brazil has grown strongly over the past ten years, showing an increase of 248% according to the OIV. The Instituto Brasileiro do Vinho (Ibravin) indicates that this product category accounted for about 80% of the domestic wine market in 2015 and recorded a 79.58% increase in imports of this type of product in 2017. This overall increase in the production, marketing and consumption of sparkling wines in Brazil naturally leads to a greater need to promote, and therefore to communicate about, these products, drawing on, among other things, descriptors of a terminologic…
‘I don’t know the answer to that question’: a corpus-assisted discourse analysis of White House Press Briefings
2013
White House Press Briefings, daily meetings with the press held by the White House Press Secretary, are the main information conduit for the White House (Kumar 2007). They are considered a “political chess game” where the Press Secretary and the press face a “wrestling match” (Partington 2006: 16). Our analysis is carried out on a corpus comprising all the Press Briefings across three presidencies from Clinton to Obama. The additional mark-up includes information about individual speakers and their role, allowing us to compare different discourse strategies adopted by the participants in the briefings at different points in time. This leads us to determine the extent of the differences in t…
Medical Academic Speech. A Corpus-based Investigation of Same-Speaker Most Frequent Content Key Word Repetition in Non-Native English Discourse
2016
Studies on repetition in ELF interactions have been carried out in several domains, but medical academic discourse still remains under-researched. This paper explores same-speaker repetition in a 31,153-word corpus of lectures included in the 100,135-word medical section of the 1 million-word ELFA (English as a Lingua Franca in Academic Settings) corpus. More specifically, the corpus was searched for the most frequent same-speaker content key word repetition and corresponding functions, with both immediate and delayed repetition being scrutinized. The results confirmed the initial hypothesis according to which same-speaker repetition was expected to be pervasive in the data, not only as a…
Using corpus tools to analyse learner language in a UK EAP context
2014
This study analyses the language of successful spoken requests used by Chinese intermediate level English for Academic Purposes (EAP) students in Discourse Completion Tasks (DCTs) at a UK higher education institution. Using corpus tools, the authors examined the frequent words, chunks and moves in request data and compared these to general reference corpora. Findings suggest that successful spoken requests often made use of high frequency modals and chunks. The data also demonstrated that the use of appropriate request moves was often associated with success, even if the language used contained linguistic errors. The findings have important implications for how spoken requests are taught in…
Gender-neutral Language in EU Secondary Legislation: The Case of the English Language
2023
English does not have grammatical gender, thus having an “intrinsic predisposition towards gender-neutral forms” (Poddighe 2020, 3). Most personal nouns do not indicate a specific gender, as in the case of person or engineer. However, there are also personal nouns with lexical gender, such as king or queen (Hellinger 2001). As a result, in English there is a risk of creating sentences that are not gender-neutral. Within the EU, promoting the use of more inclusive language is an important objective. For this reason, in recent years, various documents containing guidelines on gender-neutral language have been drawn up to encourage members of the EU institutions to adopt a…