Search results for "SIMILARITY"
showing 10 items of 474 documents
An extension of the Burrows-Wheeler Transform
2007
We describe and highlight a generalization of the Burrows–Wheeler Transform (bwt) to a multiset of words. The extended transformation, denoted by ebwt, is reversible. Moreover, it allows one to define a bijection between the words over a finite alphabet A and the finite multisets of conjugacy classes of primitive words in A∗. Besides its mathematical interest, the extended transform can be useful for applications in the context of string processing. In the last part of this paper we illustrate one such application, providing a similarity measure between sequences based on ebwt.
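As a rough illustration of the transform described in this abstract, the sketch below sorts all conjugates (rotations) of the input words by comparing their infinite periodic repetitions and then reads off the last letter of each conjugate. The function name `ebwt`, the truncation bound, and the omission of the index set needed for inversion are assumptions made for this example; the inputs are assumed to be non-empty primitive words, and the paper's own construction may differ in detail.

```python
def ebwt(words):
    """Illustrative extended BWT of a list of non-empty primitive words."""
    if not words:
        return ""
    # Comparing prefixes of length 2 * (longest word) of the infinite
    # repetitions u^omega and v^omega is enough to order any two conjugates
    # (by the Fine and Wilf periodicity theorem).
    bound = 2 * max(len(w) for w in words)

    def omega_prefix(conjugate):
        repeats = bound // len(conjugate) + 1
        return (conjugate * repeats)[:bound]

    # Every rotation of every input word.
    conjugates = [w[i:] + w[:i] for w in words for i in range(len(w))]
    conjugates.sort(key=omega_prefix)
    # The output is the last letter of each sorted conjugate.
    return "".join(c[-1] for c in conjugates)


print(ebwt(["banana"]))   # nnbaaa
print(ebwt(["ab", "a"]))  # aba
```

For a single primitive word the result coincides with the rotation-based BWT without an end marker, which is what the first call shows.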
Non-self-adjoint resolutions of the identity and associated operators
2013
Closed operators in Hilbert space defined by a non-self-adjoint resolution of the identity $\{X(\lambda)\}_{\lambda \in \mathbb{R}}$, whose adjoints constitute also a resolution of the identity, are studied. In particular, it is shown that a closed operator $B$ has a spectral representation analogous to the familiar one for self-adjoint operators if and only if $B = TAT^{-1}$ where $A$ is self-adjoint and $T$ is a bounded inverse.
Dissimilarity Application in Digitized Mammographic Images Classification.
2006
The purpose of this work is the development of an automatic classification system which could be useful for radiologists in the investigation of breast cancer. The software has been designed in the framework of the MAGIC-5 collaboration. In the traditional way of learning from examples of objects, the classifiers are built in a feature space. However, alternative ways can be found by constructing decision rules on dissimilarity (distance) representations. In such a recognition process a new object is described by its distances to (a subset of) the training samples. The use of the dissimilarities is especially of interest when features are difficult to obtain or when they have a little discrim…
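The dissimilarity representation mentioned in this abstract can be sketched in a few lines: each object is replaced by the vector of its distances to a chosen set of training samples (prototypes), and a decision rule is then applied to those vectors. Everything below is an assumption for illustration only: the Euclidean distance, the nearest-neighbour rule, the toy 2-D points, and the function names are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_vector(obj, prototypes, dist=euclidean):
    """Describe an object by its distances to the prototype objects."""
    return [dist(obj, p) for p in prototypes]


def classify(obj, train, labels, prototypes, dist=euclidean):
    """Nearest-neighbour rule applied in the dissimilarity space (assumed rule)."""
    target = dissimilarity_vector(obj, prototypes, dist)
    reps = [dissimilarity_vector(t, prototypes, dist) for t in train]
    best = min(range(len(train)), key=lambda i: euclidean(target, reps[i]))
    return labels[best]


# Hypothetical toy data: two clusters, one prototype taken from each.
train = [(0, 0), (0, 1), (5, 5), (6, 5)]
labels = ["a", "a", "b", "b"]
prototypes = [train[0], train[2]]

print(classify((0.5, 0.2), train, labels, prototypes))  # a
print(classify((5.5, 5.2), train, labels, prototypes))  # b
```

Because only `dist` touches the raw objects, the same code would work with any pairwise dissimilarity, which is the appeal of this representation when explicit features are hard to define.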
Building Semantic Trees from XML Documents
2016
The distributed nature of the Web, as a decentralized system exchanging information between heterogeneous sources, has underlined the need to manage interoperability, i.e., the ability to automatically interpret information in Web documents exchanged between different sources, necessary for efficient information management and search applications. In this context, XML was introduced as a data representation standard that simplifies the tasks of interoperation and integration among heterogeneous data sources, allowing data to be represented in (semi-) structured documents consisting of hierarchically nested elements and atomic attributes. However, while XML was shown most …
A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
2012
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficient…
XML document-grammar comparison: related problems and applications
2011
DOI: 10.2478/s13537-011-0005-1. XML document comparison is becoming an ever more popular research issue due to the increasingly abundant use of XML. Likewise, a growing interest fosters the development of XML grammar matching and comparison, due to the proliferation of heterogeneous XML data sources, particularly on the Web. Nonetheless, the process of comparing XML documents with XML grammars, i.e., XML document and grammar similarity evaluation, has not yet received the attention it deserves. In this paper, we provide an overview of existing research related to XML document/grammar comparison, presenting the background and discussing the various techniques related to th…
Trading off accuracy for efficiency by randomized greedy warping
2016
Dynamic Time Warping (DTW) is a widely used distance measure for time series data mining. Its quadratic complexity requires the application of various techniques (e.g. warping constraints, lower-bounds) for deployment in real-time scenarios. In this paper we propose a randomized greedy warping algorithm for finding similarity between time series instances. We show that the proposed algorithm outperforms the simple greedy approach and also provides very good time series similarity approximation consistently, as compared to DTW. We show that the Randomized Time Warping (RTW) can be used in place of DTW as a fast similarity approximation technique by trading some classification accuracy for ve…
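For reference, the quadratic-time baseline that this abstract compares against can be written as a short dynamic program. The sketch below is the textbook DTW recurrence with an absolute-difference local cost and no warping constraints; it is not the randomized greedy algorithm (RTW) proposed in the paper, and the function name is chosen only for this example.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

The nested loop over all cell pairs is the source of the quadratic complexity that warping constraints, lower bounds, and approximations such as the one in this paper aim to reduce.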
An Online Time Warping based Map Matching for Vulnerable Road Users’ Safety
2018
The high penetration rate of smartphones and their increased capabilities to sense, compute, store and communicate have made the devices vital components of intelligent transportation systems. However, their GPS positioning accuracy remains insufficient for many location-based applications, especially traffic safety ones. In this paper, we developed a new algorithm which is able to improve smartphone GPS accuracy for vulnerable road users' traffic safety. It is a two-stage algorithm: in the first stage, GPS readings obtained from smartphones are passed through a Kalman filter to smooth deviating readings. Then an adaptive online time warping based map matching is applied to…
Similarity of GPS Trajectories Using Dynamic Time Warping: An Application to Cruise Tourism
2019
The aim of this research is to propose an analysis of the trajectories of cruise passengers at their destination using the Dynamic Time Warping algorithm. Data collected by means of GPS devices relating to the behavior of cruise passengers in the port of Palermo have been analyzed in order to show similarities and differences among their spatial trajectories at destination. A cluster analysis has been performed in order to identify segments of cruise passengers, based on the similarity of their trajectories. The results have been compared in terms of several metrics derived from GPS tracking data in order to validate the proposed approach. Our findings are of interest from a methodological pers…
Extreme interdependence and extreme contagion between emerging markets
2007
This paper uses seemingly unrelated probit techniques to separate the transmission of a crisis due to broadly defined macroeconomic interdependence from contagion due to herding, avoiding some of the caveats of the more traditional cross-correlation approach. We find that pure contagion occurred in a limited number of country pairs generally belonging to the same region. A reduction in speculative pressure can also be identified between countries in different regional blocks. This seems to suggest that after an initial crisis episode, investors tend to discriminate on the basis of location and common macroeconomic weakness or perceived similarity.