Search results for "Network science"
Showing 10 of 103 documents
Measuring the agreement between brain connectivity networks
2016
Investigating the level of similarity between two brain networks, resulting from measures of effective connectivity in the brain, can be of interest in many respects. In this study, we propose and test the idea of borrowing measures of association used in machine learning to provide a measure of similarity between the structure of (un-weighted) brain connectivity networks. The measures explored here are accuracy, Cohen's Kappa (K) and Area Under Curve (AUC). We implemented two simulation studies, reproducing two application contexts of particular practical interest, namely: i) in methodological studies, performed on surrogate data, aiming at comparing th…
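The measures named in the abstract can be applied directly to the flattened adjacency matrices of the two networks. A minimal sketch (not the paper's implementation) of accuracy and Cohen's Kappa between two unweighted connectivity networks, assuming both are given as binary NumPy adjacency matrices of the same size:

```python
import numpy as np

def network_agreement(a, b):
    """Accuracy and Cohen's Kappa between two binary adjacency matrices.

    a, b: square 0/1 NumPy arrays of identical shape; each entry marks
    the presence or absence of a connection between two nodes.
    """
    x = a.ravel().astype(bool)
    y = b.ravel().astype(bool)

    # Accuracy: fraction of node pairs on which the two networks agree.
    p_obs = np.mean(x == y)

    # Chance agreement expected from the marginal edge densities.
    p_chance = x.mean() * y.mean() + (1 - x.mean()) * (1 - y.mean())

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    kappa = (p_obs - p_chance) / (1 - p_chance) if p_chance < 1 else 1.0
    return p_obs, kappa

# Example on two small random networks with matched edge density.
rng = np.random.default_rng(0)
A = (rng.random((10, 10)) < 0.3).astype(int)
B = (rng.random((10, 10)) < 0.3).astype(int)
print(network_agreement(A, B))
```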
Set similarity joins on MapReduce
2018
Set similarity joins, which compute pairs of similar sets, constitute an important operator primitive in a variety of applications, including applications that must process large amounts of data. To handle these data volumes, several distributed set similarity join algorithms have been proposed. Unfortunately, little is known about the relative performance, strengths and weaknesses of these techniques. Previous comparisons are limited to a small subset of relevant algorithms, and the large differences in the various test setups make it hard to draw overall conclusions. In this paper we survey ten recent, distributed set similarity join algorithms, all based on the MapReduce paradigm. We emp…
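For reference, the operator these distributed algorithms implement can be sketched on a single machine as a naive Jaccard-threshold join (assumed setup, with none of the MapReduce partitioning or candidate-filtering techniques the surveyed algorithms rely on):

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets: |a ∩ b| / |a ∪ b|."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def set_similarity_join(collection, threshold):
    """Return all pairs of record ids whose sets meet the similarity threshold.

    collection: dict mapping record id -> set of tokens.
    A naive O(n^2) nested loop; distributed algorithms exist precisely
    to avoid enumerating every candidate pair like this.
    """
    pairs = []
    for (i, si), (j, sj) in combinations(collection.items(), 2):
        if jaccard(si, sj) >= threshold:
            pairs.append((i, j))
    return pairs

records = {1: {"a", "b", "c"}, 2: {"a", "b", "d"}, 3: {"x", "y"}}
print(set_similarity_join(records, 0.5))  # [(1, 2)]
```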
A naive relevance feedback model for content-based image retrieval using multiple similarity measures
2010
This paper presents a novel probabilistic framework to process multiple sample queries in content-based image retrieval (CBIR). This framework is independent of the underlying distance or (dis)similarity measures that support the retrieval system, and only assumes mutual independence among their outcomes. The proposed framework gives rise to a relevance feedback mechanism in which positive and negative data are combined in order to optimally retrieve images according to the available information. A particular setting, in which users interactively supply feedback and iteratively retrieve images, is defined both to model the system and to obtain objective performance measures. Several repo…
Interactive Image Retrieval Using Smoothed Nearest Neighbor Estimates
2010
Relevance feedback has been adopted by most recent Content Based Image Retrieval systems to reduce the semantic gap that exists between the subjective similarity among images and the similarity measures computed in a given feature space. Distance-based relevance feedback using nearest neighbors has recently been presented as a good tradeoff between simplicity and performance. In this paper, we analyse some shortcomings of this technique and propose alternatives that help improve the efficiency of the method in terms of the retrieval precision achieved. The resulting method has been evaluated on several repositories that use different feature sets. The results have been compared to those obt…
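A common concrete form of the nearest-neighbor baseline, sketched here under assumptions rather than as the paper's exact estimator, scores a candidate image by comparing its distance to the nearest relevant and nearest non-relevant feedback examples:

```python
import numpy as np

def nn_relevance(x, relevant, non_relevant):
    """Nearest-neighbour relevance score for a feature vector x.

    relevant / non_relevant: arrays of feature vectors the user marked
    as positive / negative feedback. Scores close to 1 mean x lies much
    nearer to relevant examples than to non-relevant ones.
    """
    d_rel = min(np.linalg.norm(x - r) for r in relevant)
    d_non = min(np.linalg.norm(x - n) for n in non_relevant)
    return d_non / (d_rel + d_non)

# Toy example: x sits near the relevant cluster, so the score is high.
x = np.array([0.1, 0.1])
relevant = np.array([[0.0, 0.0], [0.2, 0.1]])
non_relevant = np.array([[1.0, 1.0]])
print(nn_relevance(x, relevant, non_relevant))
```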
Cluster Aggregation for Analyzing Event-Related Potentials
2017
Topographic analyses are reference-independent for Event-Related Potentials (ERPs), and thus render statistically unambiguous results. This motivates us to develop an effective clustering approach for finding temporal samples with similar topographies when analysing temporal-spatial ERP data. A previous study, CARTOOL, used a single clustering method to cluster ERP data. Indeed, given a clustering method, the quality of clustering varies with the data and the number of clusters, motivating us to implement and compare multiple clustering algorithms using multiple similarity measures. By finding the minimum distance among the various clustering methods and selecting the most s…
Bioinformatics and Computational Biology
2009
Bioinformatics is a new, rapidly expanding field that uses computational approaches to answer biological questions (Baxevanis, 2005). These questions are answered by means of analyzing and mining biological data. The field of bioinformatics or computational biology is a multidisciplinary research and development environment, in which a variety of techniques from computer science, applied mathematics, linguistics, physics, and statistics are used. The terms bioinformatics and computational biology are often used interchangeably (Baldi, 1998; Pevzner, 2000). This new area of research is driven by the wealth of data from high-throughput genome projects, such as the human genome sequencing pro…
Selecting and Retaining Friends on the Basis of Cigarette Smoking Similarity
2013
This study examines whether friend selection, deselection, and socialization differ as a function of the level of cigarette smoking in the friendship group. A total of 1419 students (median age = 16) from upper secondary and vocational schools in Finland were included as targets in the peer network. Targets in the peer network were asked to nominate friends and describe their own cigarette smoking at two time points one year apart. Network analyses revealed similarity arising from selection and deselection on the basis of smoking. Selection effects (i.e., selecting new friends based on similarity) were stronger for adolescents in low-smoking groups. Deselection effects (i.e., dropping frien…
DBSCAN Algorithm for Document Clustering
2019
Document clustering is the problem of automatically grouping similar documents into categories based on some similarity metric. Almost all available data, usually on the web, are unclassified, so we need powerful clustering algorithms that work with these types of data. All common search engines return a list of pages relevant to the user query. This list must be generated quickly and as accurately as possible. For this type of problem, because the web pages are unclassified, we need powerful clustering algorithms. In this paper we present a clustering algorithm called DBSCAN – Density-Based Spatial Clustering of Applications with Noise – and its limitations on documents (or web pages)…
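A minimal sketch of density-based document clustering with scikit-learn, assuming TF-IDF vectors and cosine distance (the parameters and toy corpus are illustrative, not those used in the paper):

```python
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "search engines rank web pages",
    "web pages are ranked by search engines",
    "density based clustering finds arbitrary shaped clusters",
    "dbscan is a density based clustering algorithm",
    "completely unrelated sentence about cooking pasta",
]

# Represent documents as TF-IDF vectors.
X = TfidfVectorizer().fit_transform(docs)

# DBSCAN with cosine distance; eps and min_samples are illustrative choices.
labels = DBSCAN(eps=0.8, metric="cosine", min_samples=2).fit_predict(X)
print(labels)  # -1 marks noise, i.e. documents not assigned to any cluster
```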
Trading off accuracy for efficiency by randomized greedy warping
2016
Dynamic Time Warping (DTW) is a widely used distance measure for time series data mining. Its quadratic complexity requires the application of various techniques (e.g. warping constraints, lower bounds) for deployment in real-time scenarios. In this paper we propose a randomized greedy warping algorithm for finding similarity between time series instances. We show that the proposed algorithm outperforms the simple greedy approach and also consistently provides a very good approximation of time series similarity compared to DTW. We show that Randomized Time Warping (RTW) can be used in place of DTW as a fast similarity approximation technique by trading some classification accuracy for ve…
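The baseline being traded against is the classic quadratic dynamic-programming formulation of DTW. A minimal sketch of that baseline (not the randomized algorithm from the paper):

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) Dynamic Time Warping distance.

    x, y: 1-D sequences of numbers. Returns the cost of the cheapest
    warping path aligning the two series.
    """
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three neighbouring alignments.
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0
```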
Similarity of GPS Trajectories Using Dynamic Time Warping: An Application to Cruise Tourism
2019
The aim of this research is to propose an analysis of the trajectories of cruise passengers at their destination using the Dynamic Time Warping algorithm. Data collected by means of GPS devices relating to the behavior of cruise passengers in the port of Palermo have been analyzed in order to show similarities and differences among their spatial trajectories at the destination. A cluster analysis has been performed in order to identify segments of cruise passengers, based on the similarity of their trajectories. The results have been compared in terms of several metrics derived from GPS tracking data in order to validate the proposed approach. Our findings are of interest from a methodological pers…
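In the same spirit, a pairwise DTW distance matrix over 2-D trajectories can feed a standard clustering step. A minimal sketch under assumed setup (toy trajectories, Euclidean point distance, average-linkage clustering; not the study's actual pipeline):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def dtw_2d(p, q):
    """DTW between two trajectories given as arrays of (lat, lon) points."""
    n, m = len(p), len(q)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(p[i - 1] - q[j - 1])  # Euclidean step cost
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Three toy trajectories; the first two follow a similar path.
trajs = [
    np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.2]]),
    np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.3]]),
    np.array([[1.0, 1.0], [1.2, 1.1], [1.4, 1.2]]),
]

# Condensed pairwise DTW distance matrix -> hierarchical clustering.
dists = [dtw_2d(trajs[i], trajs[j])
         for i in range(len(trajs)) for j in range(i + 1, len(trajs))]
labels = fcluster(linkage(dists, method="average"), t=2, criterion="maxclust")
print(labels)  # the first two trajectories should share a cluster
```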