Search results for "corpus"
showing 10 items of 202 documents
Behind the institutional identity: shifting from we-clusters to I-clusters in diplomatic discourse
2011
As research on subjectivity has already shown (Bühler 1934; Mushin 2001), speakers do not just neutrally and mechanically describe states of affairs in the world by resorting to objective and prefabricated linguistic formulations; their personal identity sometimes surfaces through a range of viewpoints. This paper contributes both to the literature on diplomatic discourse seen as the expression of a country's foreign policy (Marshall 1990) and to the literature on the representation of political identities in specialized discourse (Fairclough 2003). The Diplomatic Corpus (DiCo), investigated in this study, comprises all the speeches delivered by the three British foreign ministers (Cook, Straw and…
Twitter as a "corpus" in the language sciences: methodological questions and avenues for research
2017
Doctoral; The advent of corpora and corpus-based work in the language sciences has led the discipline to describe ever more diverse resources, whether reference corpora or ad hoc ones. Forms of computer-mediated communication are no exception to this trend, all the more so because they are natively digital data. Among the different types identified to date, this paper focuses specifically on Twitter and its potential for linguistic research. Drawing on a corpus compiled at the Maison des Sciences de l'Homme de Dijon, as well as on other initiatives documented on the Ortolang platform, it …
Part of Speech Tagging Using Hidden Markov Models
2020
In this paper, we present a wide range of models based on less adaptive and adaptive approaches for a PoS tagging system. The parameters for the adaptive approach are based on the n-gram order of the Hidden Markov Model, evaluated for bigrams and trigrams, and on three different decoding methods: forward, backward, and bidirectional. We used the Brown Corpus for both the training and the testing phases. The bidirectional trigram model almost reaches state-of-the-art accuracy but is disadvantaged by its decoding time, while the backward trigram model reaches almost the same results with a much better decoding time. By these results, we can conclude that the decodi…
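For readers unfamiliar with HMM tagging, the following is a minimal sketch of left-to-right Viterbi decoding over a bigram HMM. It is a textbook illustration only, not the decoders evaluated in the paper: the function name `viterbi`, the probability floor, and the toy probability tables are all assumptions made for this example, and no Brown Corpus data is used.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at position i.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the previous tag that maximises path score plus transition score.
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy, hand-written probabilities (hypothetical; not estimated from any corpus).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.1, "VERB": 0.8},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"dog": 0.05, "barks": 0.6},
}

print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
# → ['DET', 'NOUN', 'VERB']
```

A trigram model would condition each transition on the two preceding tags instead of one, which enlarges the state space and is one plausible reason decoding time differs between configurations.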
Algorithmic Aspects of Speech Recognition: A Synopsis
2000
Speech recognition is an area with a sizable literature, but there is little discussion of the topic within the computer science algorithms community. Since many of the problems arising in speech recognition are well suited for algorithmic studies, we present them in terms familiar to algorithm designers. Such cross fertilization can breed fresh insights from new perspectives. This material is abstracted from A. L. Buchsbaum and R. Giancarlo, Algorithmic Aspects of Speech Recognition: An Introduction, ACM Journal of Experimental Algorithmics, Vol. 2, 1997, http://www.jea.acm.org.
Word sense disambiguation combining conceptual distance, frequency and gloss
2004
Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the wide-coverage noun taxonomy of WordNet and on the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…
Towards a Non-Intrusive Context-Aware Speech Quality Model
2020
Understanding how humans judge perceived speech quality while interacting through Voice over Internet Protocol (VoIP) applications in real time is essential to building a robust and accurate speech quality prediction model. Speech quality is degraded in the presence of background noise, which reduces the Quality of Experience (QoE). Speech Enhancement (SE) algorithms can improve speech quality in noisy environments. The publicly available NOIZEUS speech corpus contains speech in environmental background noise (babble, car, street, and train) at two signal-to-noise ratios (SNRs), 5 dB and 10 dB. Objective Speech Quality Metrics (OSQM) are used to monitor and measure speech quality for VoIP applications. Th…
Historical dissertation on the festival and procession of Corpus, which the ... City of Valencia celebrates each year, with an explanation of the symbols that …
Typographic vignette on the title page. Frieze, decorated initial, bands. Signatures: [ ]2, A-F4. Footnotes separated from the text by a rule. Catchwords.
Communicative guide for key conversational partners in the context of 'aphasic conversation'
2005
In this paper we present a communicative guide designed for the key conversational partners (Whitworth, Perkins and Lesser 1997) who take part in conversations that include a speaker with aphasia. The need for this kind of guide arose during the compilation of the PerLA corpus ("PERcepción, Lenguaje y Afasia"), begun in 2000 in the General Linguistics area of the UVEG. We review some earlier work that addresses the topic of conversational partners, such as the Pragmatic Protocol designed by Carol Prutting and Diane Kirchner, Audrey Holland's conversational coaching, and Aura Kagan's Supported Conversation therapy. The data confirm that in these situations both type…
Comparing parallel and comparable corpora to approach macrostructures: economic outlook discourse
2017
National audience; Taking the case of economic outlook discourse (ECB, "national" central banks, economic forecasting institutes), the presentation discusses the potential combined contribution of comparable and parallel corpora to producing resources that professional financial translators can draw on.
Violence in the Brothers Grimm's fairy tales: a corpus-based approach
2010
The purpose of this article is to carry out a corpus-based study of the presence of violence in a selection of eight tales by the Brothers Grimm by looking at the terms that can be said to relate to the semantic field of violence. More specifically, this study analyses a selection of eight tales in which the frequency of the words cut, dead and blood is studied in detail. These words were chosen for their possible connection to violence after a quantitative analysis of word frequency across the whole main corpus. My initial hypothesis is that the corpus-based study of those eight tales would support my intuition regarding the high percentage of violence in the Br…
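The kind of term-frequency count this abstract describes can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `term_frequencies`, the simple regular-expression tokenizer, and the sample sentence are hypothetical and are not taken from the article or from the tales. The sketch counts exact word forms only, so inflected forms such as "cuts" are not included.

```python
import re
from collections import Counter

def term_frequencies(text, terms):
    """Count case-insensitive whole-word occurrences of each term in `text`."""
    # Lowercase the text and split it into alphabetic tokens (apostrophes kept).
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {term: counts[term] for term in terms}

# Hypothetical sample text, not a quotation from any of the tales studied.
sample = "The wolf was dead. She cut the bread, and blood ran where she cut her finger."

print(term_frequencies(sample, ["cut", "dead", "blood"]))
# → {'cut': 2, 'dead': 1, 'blood': 1}
```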