Search results for "Bigram"
showing 10 items of 14 documents
Totally new and pretty awesome: Amplifier–adjective bigrams in GloWbE
2017
Previous work on adjectival intensification (e.g. very good, so glad, really great) has mostly focussed on the adverbs in question, showing that different (native) varieties of English display distinctive preferences concerning intensifier choice. However, little is known so far about the role that intensifier-adjective units (bigrams) play. The present paper offers a first contribution to fill this research gap by focussing on a data-driven approach to (mostly) high-frequency bigrams and their collocational behaviour in the Corpus of Global Web-based English (GloWbE). Asymmetric and symmetric measures are employed to establish attraction and repulsion between adverb and adjecti…
Position coding effects in a 2D scenario: the case of musical notation.
2013
How does the cognitive system encode the location of objects in a visual scene? In the past decade, this question has attracted much attention in the field of visual-word recognition (e.g., "jugde" is perceptually very close to "judge"). Letter transposition effects have been explained in terms of perceptual uncertainty or shared "open bigrams". In the present study, we focus on note position coding in music reading (i.e., a 2D scenario). The usual way to display music is the staff (i.e., a set of 5 horizontal lines and their resultant 4 spaces). When reading musical notation, it is critical to identify not only each note (temporal duration), but also its pitch (y-axis) and its temporal seq…
Manulex-infra: Distributional characteristics of grapheme—phoneme mappings, and infralexical and lexical units in child-directed written material
2007
It is well known that the statistical characteristics of a language, such as word frequency or the consistency of the relationships between orthography and phonology, influence literacy acquisition. Accordingly, linguistic databases play a central role by compiling quantitative and objective estimates about the principal variables that affect reading and writing acquisition. We describe a new set of Web-accessible databases of French orthography whose main characteristic is that they are based on frequency analyses of words occurring in reading books used in the elementary school grades. Quantitative estimates were made for several infralexical variables (syllable, grapheme-to-phoneme mappi…
Do Grading Gray Stimuli Help to Encode Letter Position?
2021
Numerous experiments in the past decades recurrently showed that a transposed-letter pseudoword (e.g., JUGDE) is much more wordlike than a replacement-letter control (e.g., JUPTE). Critically, there is an ongoing debate as to whether this effect arises at a perceptual level (e.g., perceptual uncertainty at assigning letter position of an array of visual objects) or at an abstract language-specific level (e.g., via a level of “open bigrams” between the letter and word levels). Here, we designed an experiment to test the limits of perceptual accounts of letter position coding. The stimuli in a lexical decision task were presented either with a homogeneous letter intensity or with a graded gra…
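As a rough illustration of why an open-bigram account predicts that JUGDE is more wordlike than JUPTE, the sketch below counts shared open bigrams. It assumes the simplest possible definition (every ordered letter pair, with no limit on the gap between letters) and uses Jaccard overlap as the similarity score; both choices are illustrative assumptions, not the model tested in the study.

```python
from itertools import combinations


def open_bigrams(word):
    """Return the set of ordered letter pairs in `word`, adjacent or not.

    Assumption: no limit on the distance between the two letters.
    """
    letters = word.lower()
    return {a + b for a, b in combinations(letters, 2)}


def overlap(a, b):
    """Jaccard overlap between the open-bigram sets of two strings."""
    x, y = open_bigrams(a), open_bigrams(b)
    return len(x & y) / len(x | y)


base = "judge"
for candidate in ("jugde", "jupte"):
    shared = len(open_bigrams(base) & open_bigrams(candidate))
    print(candidate, "shares", shared, "open bigrams; overlap",
          round(overlap(base, candidate), 2))
```

Under these assumptions the transposed-letter string keeps 9 of the 10 open bigrams of "judge" (only "dg" becomes "gd"), while the replacement-letter string keeps 3.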
Part of Speech Tagging Using Hidden Markov Models
2020
In this paper, we present a wide range of models based on less adaptive and adaptive approaches for a PoS tagging system. These parameters for the adaptive approach are based on the n-gram of the Hidden Markov Model, evaluated for bigram and trigram, and based on three different types of decoding method, in this case forward, backward, and bidirectional. We used the Brown Corpus for the training and the testing phase. The bidirectional trigram model almost reaches state of the art accuracy but is disadvantaged by the decoding speed time while the backward trigram reaches almost the same results with a way better decoding speed time. By these results, we can conclude that the decodi…
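For orientation, a bigram HMM tagger scores each tag given only the previous tag. The sketch below shows one standard decoder for such a model (left-to-right Viterbi in log space); it is not necessarily the paper's implementation and does not reproduce its backward or bidirectional variants. The tag set, the probability tables, and the `floor` smoothing constant are invented toy values, not estimates from the Brown Corpus.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Most probable tag sequence under a bigram (first-order) HMM."""
    def lp(table, key):
        # Unseen events get a small floor probability (toy smoothing).
        return math.log(table.get(key, floor))

    # best[i][t] = (log prob of the best path ending in tag t at word i,
    #               tag chosen at word i - 1 on that path)
    best = [{t: (lp(start_p, t) + lp(emit_p, (t, words[0])), None)
             for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + lp(trans_p, (p, t)), p) for p in tags
            )
            column[t] = (score + lp(emit_p, (t, word)), prev)
        best.append(column)

    # Backtrace from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy model.
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
trans_p = {("DET", "NOUN"): 0.9, ("NOUN", "VERB"): 0.8, ("VERB", "DET"): 0.6}
emit_p = {("DET", "the"): 0.9, ("NOUN", "dog"): 0.5,
          ("VERB", "barks"): 0.5, ("NOUN", "barks"): 0.05}

print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
```

A trigram model would condition each transition on the two previous tags, which enlarges the state space and is one reason decoding time differs between the configurations the abstract compares.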
BuscaPalabras: a program for deriving orthographic and phonological neighborhood statistics and other psycholinguistic indices in Spanish.
2005
This article describes a Windows program that enables users to obtain a broad range of statistics concerning the properties of word and nonword stimuli in Spanish, including word frequency, syllable frequency, bigram and biphone frequency, orthographic similarity, orthographic and phonological structure, concreteness, familiarity, imageability, valence, arousal, and age-of-acquisition measures. It is designed for use by researchers in psycholinguistics, particularly those concerned with recognition of isolated words. The program computes measures of orthographic similarity online, with respect to either a default vocabulary of 31,491 Spanish words or a vocabulary specified by the user. In a…
Illusory conjunctions in French: The nature of sublexical units in visual word recognition
2005
The respective influence of orthographic redundancy (Seidenberg, 1987) and syllable boundaries (Rapp, 1992) on reading units in French was tested in three experiments, using the illusory conjunction paradigm (Prinzmetal, Treiman, & Rho, 1986). Bigram boundaries were defined according to bigram frequencies. The data showed that the syllable effect was attenuated or cancelled when syllable boundaries did not coincide with bigram boundaries. Reading units were defined by syllable and orthographic information. The implications of such findings for the dual route theory and the PDP model are discussed.
E-Hitz: A word frequency list and a program for deriving psycholinguistic statistics in an agglutinative language (Basque)
2007
We describe a Windows program that enables users to obtain a broad range of statistics concerning the properties of word and nonword stimuli in an agglutinative language (Basque), including measures of word frequency (at the whole-word and lemma levels), bigram and biphone frequency, orthographic similarity, orthographic and phonological structure, and syllable-based measures. It is designed for use by researchers in psycholinguistics, particularly those concerned with recognition of isolated words and morphology. In addition to providing standard orthographic and phonological neighborhood measures, the program can be used to obtain information about other forms of orthographic similarity, …
Geometric Algebra Rotors for Sub-symbolic Coding of Natural Language Sentences
2007
A sub-symbolic encoding methodology for natural language sentences is presented. The procedure is based on the creation of an LSA-inspired semantic space and associates rotation operators derived from Geometric Algebra to word bigrams of the sentence. The operators are subsequently applied to an orthonormal standard basis of the created semantic space according to the order in which words appear in the sentence. The final rotated basis is then coded as a vector and its orthogonal part constitutes the sub-symbolic coding of the sentence. Preliminary experimental results for a classification task, compared with the traditional LSA methodology, show the effectiveness of the approach.
ΔP as a measure of collocation strength
2018
This paper explores the proposed benefits of ΔP (delta P) as a measure of collocation strength. Its focus is on contrasting ΔP with other, more commonly used, association measures, particularly transitional probabilities, but also mutual information and Lexical Gravity G. To this end, first the strong correlation between ΔP and transitional probability is illustrated with the help of two exemplary corpora. This is followed by an analysis of hesitation placement in spontaneous spoken English, based on the assumption that hesitations will not be placed within strong collocations. Results show that, despite their strong similarity, in some contexts ΔP is more predictive of hesitation p…
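To make the relation between the two measures concrete, the sketch below computes the forward transitional probability P(w2 | w1) and the forward ΔP, defined as P(w2 | w1) − P(w2 | not w1), from bigram counts. The token list is an invented toy example, not one of the paper's corpora, and the function name `association` is hypothetical.

```python
from collections import Counter


def association(tokens, w1, w2):
    """Return (transitional probability, delta P) for the bigram w1 w2.

    TP      = P(w2 | w1)
    delta P = P(w2 | w1) - P(w2 | not w1)
    """
    pairs = list(zip(tokens, tokens[1:]))
    bigram_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    second_counts = Counter(second for _, second in pairs)
    n = len(pairs)

    a = bigram_counts[(w1, w2)]    # w1 followed by w2
    b = first_counts[w1] - a       # w1 followed by another word
    c = second_counts[w2] - a      # w2 preceded by another word
    d = n - a - b - c              # bigrams involving neither

    tp = a / (a + b) if a + b else 0.0
    p_without_cue = c / (c + d) if c + d else 0.0
    return tp, tp - p_without_cue


# Invented toy text for illustration only.
tokens = "so glad you came so glad we met very glad indeed".split()
tp, delta_p = association(tokens, "so", "glad")
print(f"TP = {tp:.3f}, delta P = {delta_p:.3f}")
```

Because ΔP is the transitional probability minus how often the second word follows any other cue, the two values stay close whenever the second word is rare outside the bigram, which is consistent with the strong correlation the abstract reports.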