Search results for "corpus linguistic"
showing 10 items of 86 documents
Naming MINERALITY of French white wines: a contrastive study on the emergence of a "new" wine descriptor
2016
[Context] MINERAL / MINERALITY has emerged in prescriptive and descriptive wine discourse over the past 20 years or so, especially to characterize particular French white wines such as Chablis in Northern Burgundy. Despite its widespread use by both experts and consumers, it still lacks a terminological definition accepted by all wine professionals. For this reason, an interdisciplinary team at the University of Burgundy and the Wine School of Changins (Switzerland) has conducted a research program over the last two years to cross semantic and sensory data and so establish a prototypical definition. [Aims] Building on the results of th…
Fake News Spreaders Detection: Sometimes Attention Is Not All You Need
2022
Guided by a corpus linguistics approach, in this article we present a comparative evaluation of State-of-the-Art (SotA) models, with a special focus on Transformers, to address the task of Fake News Spreaders (i.e., users that share Fake News) detection. First, we explore the reference multilingual dataset for the considered task, exploiting corpus linguistics techniques, such as chi-square test, keywords and Word Sketch. Second, we perform experiments on several models for Natural Language Processing. Third, we perform a comparative evaluation using the most recent Transformer-based models (RoBERTa, DistilBERT, BERT, XLNet, ELECTRA, Longformer) and other deep and non-deep SotA models (CNN,…
Tourism Destination Image, Tourism Discourse and UNESCO sites: a contrastive analysis
2018
The purpose of this paper is to analyse tourism discourse, i.e. English as specialised and promotional discourse in the tourism field (Dann 1996, Gotti 2006, Maci 2013), as it is applied on websites promoting UNESCO sites in Sicily and Malta. The resulting Tourism Destination Image (Crompton 1979; Echtner & Ritchie 1991) conveyed by the websites will also be investigated. A mixed methodological approach, both qualitative and quantitative, has been adopted. More specifically, the Corpus Linguistics approach has been favoured (Teubert 2005; Nigro 2006). The websites considered concern five UNESCO sites in Sicily and five UNESCO sites in Malta in a comparative study; both corpora of w…
Revisiting corpus creation and analysis tools for translation tasks
2016
Many translation scholars have proposed the use of corpora to allow professional translators to produce high-quality texts which read like originals. Yet the diffusion of this methodology has been modest, one reason being that software for corpus analysis has been developed with the linguist in mind. As a result, it is generally complex and cumbersome, offering many advanced features but lacking the level of usability and the specific features that meet translators’ needs. To overcome this shortcoming, we have developed TranslatorBank, a free corpus creation and analysis tool designed for translation tasks. TranslatorBank supports the creation of specialized monolingual …
Vagueness expressions in Italian, Spanish and English task-oriented dialogues
2017
In this article, we present a corpus-based analysis of the use of Vagueness Expressions (VEs) in Italian, Spanish and English task-oriented dialogues. Following the distinction among informational, relational and discourse vagueness (Voghera 2012), we compare the width of the functional space of the most frequent VEs. In particular, we investigate whether and to what extent the VEs cover all the types of vagueness in the three languages. Quantitative and qualitative analysis provides evidence of a high convergence in the vagueness functions expressed by the VEs of the three languages.
The role of news values in the discursive construction of the Brexit referendum in the UK press
2022
The main aim of this study is to explore the press discourse of the Brexit referendum campaign in the United Kingdom from the perspective of corpus-assisted discourse studies (CADS). Specifically, it analyses how different topics and debates related to Brexit were discursively constructed in the coverage of the British quality press during the referendum campaign. It also investigates ideological differences in the discursive construction of that coverage according to political affiliation (left-right) and ideological stance towards Brexit (Leave-Remain camps). To this end, a corpus was compiled from four British newspapers…
Lietvārdu kolokāciju variācijas EUR-Lex korpusa tekstos [Variation of noun collocations in EUR-Lex corpus texts]
2021
English remains an important working language of the EU institutions, and it even shows some unique features of “Euro-English”. This study examines some of the word clusters characteristic of it by identifying the most frequent noun collocations in the EUR-Lex English corpus. The methods used were a comparative analysis of literature sources and several quantitative and qualitative corpus analysis techniques. The results showed that some of these noun collocations form EU terminology, while others can be divided into groups of near-synonyms according to semantic meaning. Previously known features of legal English, such as word pairs, the placement of modifiers after the noun and archaic pronouns, were observed…
Impact of textual data augmentation on linguistic pattern extraction to improve the idiomaticity of extractive summaries
2021
The present work aims to develop a text summarisation system for financial texts with a focus on the fluidity of the target language. Linguistic analysis shows that the process of writing summaries should take into account not only terminological and collocational extraction, but also a range of linguistic material referred to here as the "support lexicon", which plays an important role in the cognitive organisation of the field. On this basis, this paper highlights the relevance of pre-training the CamemBERT model on a French financial dataset to extend its domain-specific vocabulary and fine-tuning it on extractive summarisation. We then evaluate the impact of textua…
Zum Zusammenspiel von Textsorten- und Kulturspezifika in der Übersetzung von Weinbesprechungen [On the interplay of text-type and culture-specific features in the translation of wine reviews]
2017
Research question: For wine lovers, but also for winemakers and wine merchants, wine reviews are an important source of information for buying or promoting wine (Lehrer 1975, Suárez-Toste 2007, Wislocka Breit 2014, Gautier/Lavric 2015). Against the backdrop of a globalising wine trade, these texts are translated more and more often, which raises the following question: how are the text-type and culture-specific features of this text type handled in the translation process? At the level of text type, wine reviews show both cross-linguistic and language-specific traits: they thus fall within particular discourse traditions (see below), which …
Double discursive genre in the Sagrada Familia
2018
This paper focuses on the description of professional genre discourse in a tourism context where two different service modalities allow the researcher to deal with two different discourses about the same tourist site: the Sagrada Familia in Barcelona. The aim is to define the genre of the specific discourse related to the different professional practices (the guided visit with the physical presence of an expert, and the guided visit using a socio-technical device on which the information is recorded), in order to identify its characteristics and measure the impact on cultural perceptions. The applied methodology relies on Mann & Thompson's Rhetorical Structure Theory which allows the rese…