Łukasz Grabowski

Register Variation Across English Pharmaceutical Texts: A Corpus-driven Study of Keywords, Lexical Bundles and Phrase Frames in Patient Information Leaflets and Summaries of Product Characteristics

Abstract This study constitutes an initial step towards filling a gap in corpus linguistics studies of linguistic and phraseological variation across English pharmaceutical texts, in particular in terms of recurrent linguistic patterns. Conducted from a register perspective (Biber & Conrad, 2009) and employing both quantitative and qualitative research procedures, the study aims to provide a corpus-driven description of vocabulary and phraseology, namely keywords, lexical bundles, and phrase frames, used in patient information leaflets and summaries of product characteristics (represented by 463 and 146 texts, respectively) written originally in English and collected in two domain-spec…

research product

Chapter 3. Fine-tuning lexical bundles


Towards Teaching English for Pharmaceutical Purposes: An Attempt at a Description of Key Vocabulary and Phraseology in Clinical Trial Protocols and European Public Assessment Reports

With the exception of medical schools or medical universities, English for Pharmaceutical Purposes is rarely taught as a specialist language course or ESP module at the university level (e.g. designed specifically for training translators of specialist texts). This may be caused by, among other factors, the lack of a comprehensive description of vocabulary and phraseology used across different pharmaceutical text types and genres, e.g. patient-pharmacist interactions, patient information leaflets, clinical trial protocols, etc. This preliminary study is designed as an initial step towards developing a description of vocabulary and phraseology, namely keywords, n-grams consisting of 4 words and phrase …


Phrase frames in English pharmaceutical discourse: a corpus-driven study of intradisciplinary register variation

Focusing on the exploration of intra-disciplinary register variation in the pharmaceutical domain, this corpus-driven study attempts to describe the use, composition and discourse functions of phrase frames, that is, contiguous sequences of words identical except for one (Fletcher, 2002-2007), found in samples of four English pharmaceutical text types, namely patient information leaflets, summaries of product characteristics, clinical trial protocols and chapters/sections from academic textbooks on pharmacology. The study deals with a specific sub-type of phrase frames, that is, 4-word units with a variable slot in the medial position, e.g. be * with caution, to take * medicine. The result…
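To make the unit of analysis concrete, the following is a minimal sketch of how 4-word phrase frames with a medial variable slot might be collected from a token list. The function name `medial_phrase_frames`, the whitespace tokenisation and the toy sentence are illustrative assumptions and are not taken from the study, which would also apply frequency and dispersion thresholds not modelled here.

```python
from collections import defaultdict


def medial_phrase_frames(tokens, n=4):
    """Group n-grams into phrase frames with one variable slot in a medial position.

    Hypothetical helper: returns {frame: {filler: count}}, where each frame is an
    n-tuple with "*" in place of one non-initial, non-final word.
    """
    frames = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        # Only medial positions are opened up; the first and last words stay fixed.
        for slot in range(1, n - 1):
            frame = tuple(gram[:slot] + ["*"] + gram[slot + 1:])
            frames[frame][gram[slot]] += 1
    return frames


# Toy input (invented for illustration, not corpus data).
text = "to take this medicine you need to take your medicine daily"
frames = medial_phrase_frames(text.split())
fillers = dict(frames[("to", "take", "*", "medicine")])
print(fillers)
```

In this toy input the frame to take * medicine is realised by two variants, one with "this" and one with "your" in the slot.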


Mapping wordnets from the perspective of inter-lingual equivalence

This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and the plWordnet (Pol. Słowosieć) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesaurus. The development of wordnets for different languages has given rise to many wordnet linking projects (e.g. EuroWordNet, Vossen, 2002). Regardless of the linking method used, these projects require defining rules for establishing equivalence links between wordnet building bloc…


Towards Equivalence Links between Senses in PlWordNet and Princeton WordNet

Abstract The paper focuses on the issue of creating equivalence links in the domain of bilingual computational lexicography. The existing interlingual links between plWordNet and Princeton WordNet synsets (sets of synonymous lexical units – lemma and sense pairs) are re-analysed from the perspective of equivalence types as defined in traditional lexicography and translation. Special attention is paid to cognitive and translational equivalents. A proposal of mapping lexical units is presented. Three types of links are defined: super-strong equivalence, strong equivalence and weak implied equivalence. The strong equivalences have a common set of formal, semantic and usage features, with some o…


Is there a formula for formulaic language?

Abstract This paper focuses on detecting and measuring traces of "formulaic language". For this purpose, we test a number of computational formulae that quantify the degree to which a text type incorporates inflexible sequences of words. We assess these candidate indices using a number of reference corpora representing a wide variety of text types, both routine and creative. We adopt the concept of "phrase-frame" proposed by Fletcher (2002–2007) as a means of exploring phraseological pattern variability. To date, there have been few studies explicitly addressing this issue, with the exception of Roemer (2010). We examine ten productivity indices, including Roemer's VPR, the Herfindahl-Hirsch…
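As a rough illustration of what such indices measure, the sketch below computes two of them for the slot of a single phrase frame. It assumes that VPR can be read as the number of distinct slot fillers divided by the frame's token frequency, which is a simplifying assumption about its definition, and it uses the standard Herfindahl-Hirschman formula (the sum of squared filler shares). The function names and the filler counts are invented for illustration and are not results from the paper.

```python
def variant_ratio(fillers):
    """Distinct fillers divided by total frame tokens (assumed reading of VPR)."""
    return len(fillers) / sum(fillers.values())


def herfindahl_hirschman(fillers):
    """Sum of squared filler shares: 1.0 means a single filler dominates completely."""
    total = sum(fillers.values())
    return sum((count / total) ** 2 for count in fillers.values())


# Hypothetical slot fillers for one phrase frame (invented counts).
slot_fillers = {"very": 6, "quite": 3, "not": 1}

vpr = variant_ratio(slot_fillers)
hhi = herfindahl_hirschman(slot_fillers)
print(round(vpr, 2), round(hhi, 2))  # prints 0.3 0.46
```

A low variant ratio together with a high concentration value would suggest a slot that is filled by few, strongly preferred words, that is, a more fixed and formulaic pattern.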


Phrase Frames as an Exploratory Tool for Studying English-to-Polish Translation Patterns: A Descriptive Corpus-Based Study

Designed as a proof of concept, this descriptive corpus-based study focuses on the concept of phrase frame, defined as a contiguous sequence of n words identical except for one (Fletcher 2002). Although phrase frames have already been used as a means of exploring pattern variability across and within different text types or registers written in English, they have rarely, if ever, been employed as a unit of analysis in descriptive research on translation. In this study, we use the English‒Polish parallel corpus Paralela (Pęzik 2016) to identify and describe Polish translation patterns that emerge from two functionally-defined English phrase frames (it is * clear that, it is * difficult to). …


Crossing the Frontiers of Linguistic Typology: Lexical Differences and Translation Patterns in English and Russian Lolita by Vladimir Nabokov

This article presents the results of a corpus-driven comparison between the English original (1955) and the Russian auto-translation (1967) of the novel Lolita by Vladimir Nabokov. The aim of the study, which was carried out with the computer program WordSmith Tools 4.0, was to determine whether the differences attested between the English and Russian parallel texts arise from translation strategies [Nabokov was an ardent advocate of literal translation as the only strategy for truly transposing the original text (Beaujour 1995: 716; Grayson 1977: 13–15)], or whether they are due to typological differences between the English and Russian languages. This corpus-driven study consists of …


Quantifying English and Polish Lolitas: A Corpus-Driven Stylistic Comparison

The study presented in this article, which is a fragment of a larger study of translational and non-translational texts (Grabowski 2012), falls within the scope of descriptive translation studies (DTS) and corpus linguistics, with particular emphasis on the study of translation universals, using the example of the English original (written in 1955) and two independent Polish translations of the novel Lolita by V. Nabokov (by Stiller in 1991 and Kłobukowski in 1997). According to Baker (1995: 243), universal features of translation, or translation universals, constitute specific textual characteristics (e.g. lexical, grammatical or stylistic) typical of translated texts, irrespective of languages i…


Keywords and lexical bundles within English pharmaceutical discourse: A corpus-driven description

Abstract Little attention has been paid so far to keywords and lexical bundles used in the English language typical of the pharmaceutical field. Conducted from a register perspective (Biber & Conrad, 2009), this exploratory and descriptive research is intended to fill the gap in corpus linguistics studies on phraseology and register variation within written English pharmaceutical discourse. More specifically, this empirical study presents a corpus-driven description of the use and functions of keywords (top-50 by keyness) complemented by a similar description of lexical bundles (top-50 by frequency) used across samples of patient information leaflets, summaries of product characteristics…
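The idea of ranking lexical bundles by frequency can be sketched as follows. This is an illustrative approximation only: the 4-word length, the requirement that a bundle occur in at least two texts, the function name `top_bundles` and the three toy sentences are all assumptions made for the example and do not reproduce the study's extraction settings.

```python
from collections import Counter


def top_bundles(texts, n=4, top=50, min_texts=2):
    """Rank recurrent n-word sequences by frequency, keeping those found in several texts."""
    freq = Counter()    # total occurrences of each n-gram
    spread = Counter()  # number of texts in which each n-gram occurs
    for text in texts:
        tokens = text.lower().split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))
    ranked = [(g, c) for g, c in freq.most_common() if spread[g] >= min_texts]
    return ranked[:top]


# Toy leaflet-style sentences (invented for illustration).
texts = [
    "ask your doctor or pharmacist before taking this medicine",
    "if you are unsure ask your doctor or pharmacist for advice",
    "keep this medicine out of the sight and reach of children",
]
bundles = top_bundles(texts, top=3)
for words, count in bundles:
    print(" ".join(words), count)
```

With these toy sentences only two sequences recur across texts, ask your doctor or and your doctor or pharmacist, which also illustrates the overlap between neighbouring bundles that frequency-based lists tend to contain.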


Formulaic language

The notion of formulaicity has received increasing attention in disciplines and areas as diverse as linguistics, literary studies, art theory and art history. In recent years, linguistic studies of formulaicity have been flourishing and the very notion of formulaicity has been approached from various methodological and theoretical perspectives and with various purposes in mind. The linguistic approach to formulaicity is still in a state of rapid development and the objective of the current volume is to present the current explorations in the field. Papers collected in the volume make numerous suggestions for further development of the field and they are arranged into three complementary par…


Provoke or encourage improvements? On semantic prosody in English-to-Polish translation

Originally defined as an aura of meaning associated with words used together in a particular context, semantic prosody is a complex linguistic concept, and there is no agreement among researchers as to its precise definition and level of operationalization (word, phrase, text or discourse). Although there have been some studies on semantic prosody in translation, their findings are rather inconclusive and limited to individual words and phrases. Also, there has been no research on semantic prosody conducted so far in Polish-English translation. Intending to fill in this gap, this paper, grounded in corpus linguistics, showcases the role of semantic prosody in a selected English-to-Polish tr…


Sense Equivalence in plWordNet to Princeton WordNet Mapping

Abstract Though interest in the use of wordnets for lexicography is (gradually) growing, no research has been conducted so far on equivalence between lexical units (or senses) in inter-linked wordnets. In this paper, we present and validate a procedure of sense-linking between plWordNet and Princeton WordNet. The proposed procedure employs a continuum of three equivalence types: strong, regular and weak, distinguished by a custom-designed set of formal, semantic and translational features. To validate the procedure, three independent samples of 120 sense pairs were manually analysed with respect to the features. The results show that synsets from the two wordnets linked by interlingual syno…


Refining a List of Lexical Bundles: An Attempt to Apply the Formulex Method

A number of corpus studies focusing on the description of the use and functions of lexical bundles have been conducted recently in order to explore the phraseology of learner language. As with any studies of lexical bundles, the problem of overlapping or structurally incomplete items poses a particular challenge. In practice, it is often difficult to align such units with specific discourse functions. The fact that lexical bundles do not constitute neat form-and-meaning mappings results from, among other reasons, their being grounded in language use rather than language system. In this pilot study we attempt to test a new method called Formulex (Forsyth, 2015a; 2015b) to verify whether an applica…



In this paper, we attempt to improve the textual fit of English-to-Polish translation of a peculiar type of multi-word units known in corpus linguistic literature as lexical bundles (Biber et al. 1999). Inspired by a study conducted by Grabar and Lefer (2015), we used the English-Polish parallel corpus Paralela (Pęzik 2016) and the National Corpus of Polish (NKJP) to extract and explore the use, in terms of frequency distributions, of the Polish equivalents of selected English lexical bundles expressing attitudinal and epistemic stance. More precisely, we used the NKJP corpus to check whether the Polish equivalents are typical of contemporary Polish as found in native texts. The r…


Formulaicity in constrained communication : an intermodal approach

In this research study, which lies at the intersection of corpus linguistics, formulaic language and studies of constrained communication, here focused on translation, interpreting and L2, we aim to test whether the constrained texts found in the Polish-English component of the intermodal EPTIC corpus differ from native texts in their use of contiguous word combinations, often called bigrams, and whether similar patterns can be found in spoken and written registers. To this end, we developed a Poisson regression model with fixed and random effects. The results show that translated language tends towards a higher number…


Towards a Cross-linguistic Study of Phraseology across Specialized Genres

This poster paper presents early-stage work by a group of researchers collaborating within the project EMPHRASE. Corpus-based cross-linguistic studies of specialised phraseology across different linguistic registers, genres and domains of language use have not yet received sufficient attention (Buendía 2013, Aguado 2007, Ramisch 2015, Grabowski 2018), notably in terms of turning the results of largely descriptive studies into actionable knowledge. The project revolves around three main axes: 1) compiling and structuring an inventory of word combinations from different genres, disciplines and languages; 2) exploring and analysing cross-genre characteristics as well as typical…
