Search results for "Parsing"
showing 6 items of 46 documents
Dictionary-symbolwise flexible parsing
2012
Linear-time optimal parsing algorithms are rare in the dictionary-based branch of the data compression theory. A recent result is the Flexible Parsing algorithm of Matias and Sahinalp (1999) that works when the dictionary is prefix closed and the encoding of dictionary pointers has a constant cost. We present the Dictionary-Symbolwise Flexible Parsing algorithm that is optimal for prefix-closed dictionaries and any symbolwise compressor under some natural hypothesis. In the case of LZ78-like algorithms with variable costs and any, linear as usual, symbolwise compressor we show how to implement our parsing algorithm in linear time. In the case of LZ77-like dictionaries and any symbol…
Attention Direction in Static and Animated Diagrams
2010
Two key requirements for comprehending a diagram are to parse it into appropriate components and to establish relevant relationships between those components. These requirements can be particularly demanding when the diagram is complex and the viewers are novices in the depicted domain. Lack of domain-specific knowledge for top-down guidance of visual attention prejudices novices' extraction of task-relevant information. Static diagrams designed for novices often include visual cues intended to improve such information extraction. However, because current approaches to cueing tend to be largely intuitive, their effectiveness can be questionable. Further, animated diagrams with their percept…
A General Fuzzy-Parsing Scheme for Speech Recognition
1985
In this paper a Speech Recognition Methodology is proposed which is based on the general assumption of ‘fuzziness’ of both speech-data and knowledge-sources. Besides this general principle, there are other fundamental assumptions which are also the bases of the proposed methodology: ‘Modularity’ in the knowledge organization, ‘Homogeneity’ in the representation of data and knowledge, ‘Passiveness’ of the ‘understanding flow’ (no backtracking or feedback), and ‘Parallelism’ in the recognition activity.
A role for backward transitional probabilities in word segmentation?
2008
A number of studies have shown that people exploit transitional probabilities between successive syllables to segment a stream of artificial continuous speech into words. It is often assumed that what is actually exploited are the forward transitional probabilities (given XY, the probability that X will be followed by Y), even though the backward transitional probabilities (the probability that Y has been preceded by X) were equally informative about word structure in the languages involved in those studies. In two experiments, we showed that participants were able to learn the words from an artificial speech stream when the only available cues were the backward transitional probabilities.…
Using Automatic Morphological Tools to Process Data from a Learner Corpus of Hungarian
2014
The aim of this article is to show how automatic morphological tools originally used to analyze native speaker data can be applied to process data from a learner corpus of Hungarian. We collected written data from 35 students majoring in Hungarian studies at the University of Zagreb, Croatia. The data were analyzed by magyarlanc, a sentence splitter, morphological analyzer, POS-tagger and dependency parser, which found 667 unknown word forms. We investigated the recommendations made by the Hungarian spellchecker hunspell for these unknown words and the correct forms were manually chosen. It was found that if the first suggestion made by hunspell was automatically accepted, an accuracy score…
Inter-annotator agreement in spoken language annotation: Applying uα-family coefficients to discourse segmentation
2021
As databases make Corpus Linguistics a common tool for most linguists, corpus annotation becomes an increasingly important process. Corpus users need not only raw data, but also annotated data, submitted to tagging or parsing processes through annotation protocols. One problem with corpus annotation lies in its reliability, that is, in the probability that its results can be replicated by independent researchers. Inter-annotator agreement (IAA) is the process which evaluates the probability that, applying the same protocol, different annotators reach similar results. To measure agreement, different statistical metrics are used. This study applies IAA for the first time to the Valencia E…