Search results for "SIMILARITY"

showing 10 items of 474 documents

SCCF Parameter and Similarity Measure Optimization and Evaluation

2019

Neighborhood-based Collaborative Filtering (CF) is one of the most successful and widely used recommendation approaches; however, it suffers from major flaws, especially in sparse environments. Traditional similarity measures used by neighborhood-based CF to find similar users or items are not suitable for sparse datasets. Sparse Subspace Clustering and common liking rate in CF (SCCF), a recently published method, proposed a tunable similarity measure oriented towards sparse datasets; however, its performance can be improved further and requires additional analysis and investigation. In this paper, we propose and evaluate the performance of a new tuning mechanism, using the Mean Absolute Error (MA…

Computer science, 020206 networking & telecommunications, 02 engineering and technology, Recommender system, Similarity measure, computer.software_genre, Measure (mathematics), Similarity (network science), Subspace clustering, 0202 electrical engineering electronic engineering information engineering, Collaborative filtering, 020201 artificial intelligence & image processing, Data mining, computer, Selection (genetic algorithm), Overall efficiency

researchProduct

Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment.

2007

Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rath…
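The abstract above is truncated, so the paper's exact procedure is not reproduced here. As a rough illustration of what "approximated via data compression" can look like, the following minimal sketch computes the normalized compression distance (NCD, one of the tags listed for this record). The choice of `zlib` as the compressor and the helper names `compressed_size` and `ncd` are assumptions made for this example, not taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    seq_a = b"ACGT" * 200
    seq_b = b"TTGACCA" * 115
    # Smaller values suggest more shared structure under this compressor.
    print(ncd(seq_a, seq_a), ncd(seq_a, seq_b))
```

A real compressor only approximates the underlying complexity, so values depend on the compressor and on input length; comparing a sequence with itself typically gives a value near, but not exactly, zero.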

Computer science, Algorismes, Prediction by partial matching, Compression dissimilarity, computer.software_genre, Biochemistry, Protein Structure Secondary, Phylogenetic studies, Structural Biology, Sequence Analysis Protein, Databases Protein, lcsh:QH301-705.5, Biological data, NCD, Applied Mathematics, Genomics, Classification, CD, Computer Science Applications, Benchmarking, :Informàtica::Informàtica teòrica [Àrees temàtiques de la UPC], Universal compression dissimilarity, Area Under Curve, Metric (mathematics), lcsh:R858-859.7, Data mining, Algorithms, Data compression, Research Article, :Informàtica::Aplicacions de la informàtica::Bioinformàtica [Àrees temàtiques de la UPC], Normalization (statistics), lcsh:Computer applications to medicine. Medical informatics, Bioinformatics Sequence Alignment Algorithms, Set (abstract data type), Similarity (network science), Normalized compression dissimilarity, Data compression (Computer science), Animals, Humans, Amino Acid Sequence, Molecular Biology, Biology, Dades -- Compressió (Informàtica), USM, Universal similarity metric, Proteins, UCD, Protein Structure Tertiary, Data set, Genòmica, Statistical classification, lcsh:Biology (General), ROC Curve, computer, Sequence Alignment, Software, BMC bioinformatics

Least-squares community extraction in feature-rich networks using similarity data

2021

We explore a doubly-greedy approach to the issue of community detection in feature-rich networks. According to this approach, both the network and feature data are straightforwardly recovered from the underlying unknown non-overlapping communities, supplied with a center in the feature space and intensity weight(s) over the network each. Our least-squares additive criterion allows us to search for communities one-by-one and to find each community by adding entities one by one. A focus of this paper is that the feature-space data part is converted into a similarity matrix format. The similarity/link values can be used in either of two modes: (a) as measured in the same scale so that one may …

Computer science, Economics, Kernel Functions, Social Sciences, 02 engineering and technology, Least squares, Infographics, Translocation Genetic, Geographical Locations, Medical Conditions, 0202 electrical engineering electronic engineering information engineering, Medicine and Health Sciences, Psychology, Cluster Analysis, Operator Theory, Data Management, Multidisciplinary, Applied Mathematics, Simulation and Modeling, Q, R, Experimental Psychology, Europe, Feature (computer vision), Research Design, Physical Sciences, Medicine, 020201 artificial intelligence & image processing, Graphs, Algorithms, Network Analysis, Network analysis, Research Article, Computer and Information Sciences, Science, Feature vector, Scale (descriptive set theory), Research and Analysis Methods, Column (database), Similarity (network science), 020204 information systems, Parasitic Diseases, Least-Squares Analysis, Feature data, business.industry, Data Visualization, Biology and Life Sciences, Pattern recognition, Tropical Diseases, Economic Analysis, Malaria, People and Places, Artificial intelligence, business, Mathematics, PLoS ONE

MetNet: A two-level approach to reconstructing and comparing metabolic networks

2021

Metabolic pathway comparison and interaction between different species can reveal important information for drug engineering and medical science. In the literature, proposals for reconstructing and comparing metabolic networks present two main problems: network reconstruction usually requires human intervention to integrate information from different sources and, in metabolic comparison, the size of the networks leads to a challenging computational problem. We propose to automatically reconstruct a metabolic network on the basis of KEGG database information. Our proposal relies on a two-level representation of the huge metabolic network: the first level is graph-based and depicts pathways a…

Computer science, Enzyme Metabolism, Metabolic network, computer.software_genre, Biochemistry, Infographics, 0302 clinical medicine, Cluster Analysis, Enzyme Chemistry, Data Management, Mammals, 0303 health sciences, Multidisciplinary, Basis (linear algebra), Settore INF/01 - Informatica, Q, R, Chemical Reactions, Eukaryota, Graph, Chemistry, Vertebrates, Physical Sciences, Medicine, Carbohydrate Metabolism, Data mining, Metabolic Pathways, Computational problem, Graphs, Network Analysis, Metabolic Networks and Pathways, Research Article, Computer and Information Sciences, ComputingMethodologies_SIMULATIONANDMODELING, Science, 03 medical and health sciences, Metabolic Networks, Similarity (psychology), Xenobiotic Metabolism, Animals, Humans, Metabolomics, KEGG, Representation (mathematics), Symbiosis, 030304 developmental biology, Data Visualization, Organisms, Biology and Life Sciences, Metabolism, Metabolic pathway, ComputingMethodologies_PATTERNRECOGNITION, Metabolism, Amniotes, Enzymology, computer, Zoology, 030217 neurology & neurosurgery, Software, PLoS ONE

Feature Dimensionality Reduction for Mammographic Report Classification

2016

The amount and the variety of available medical data coming from multiple and heterogeneous sources can inhibit analysis, manual interpretation, and use of simple data management applications. In this paper a deep overview of the principal algorithms for dimensionality reduction is carried out; moreover, the most effective techniques are applied to a dataset composed of 4461 mammographic reports. The most useful medical terms are converted and represented using a TF-IDF matrix, in order to enable data mining and retrieval tasks. A series of queries has been performed on the raw matrix and on the same matrix after the dimensionality reduction obtained using the most useful techni…

Computer science, Latent semantic analysis, business.industry, Dimensionality reduction, Data management, Cosine similarity, Pattern recognition, Latent Semantic Analysis (LSA), 02 engineering and technology, Singular Value Decomposition (SVD), Medical Application, 03 medical and health sciences, Matrix (mathematics), 0302 clinical medicine, Feature Dimensionality Reduction, Feature (computer vision), Singular value decomposition, Principal component analysis, 0202 electrical engineering electronic engineering information engineering, 020201 artificial intelligence & image processing, 030212 general & internal medicine, Artificial intelligence, business, Principal Component Analysis (PCA)

Measuring the agreement between brain connectivity networks.

2016

Investigating the level of similarity between two brain networks, resulting from measures of effective connectivity in the brain, can be of interest in many respects. In this study, we propose and test the idea of borrowing measures of association used in machine learning to provide a measure of similarity between the structure of (un-weighted) brain connectivity networks. The measures explored here are accuracy, Cohen's Kappa (K) and Area Under Curve (AUC). We implemented two simulation studies, reproducing two contexts of application that can be particularly interesting for practical applications, namely: i) in methodological studies, performed on surrogate data, aiming at comparing th…
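To make the idea of scoring agreement between two un-weighted networks concrete, here is a minimal sketch that computes accuracy and Cohen's Kappa over the off-diagonal entries of two binary adjacency matrices. It is an illustration under stated assumptions, not the study's implementation: the networks are assumed to be directed (every ordered pair of distinct nodes counts as one item), self-loops are ignored, and the names `edge_labels` and `accuracy_and_kappa` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def edge_labels(adj: Matrix) -> List[int]:
    """Flatten a binary adjacency matrix into 0/1 labels, skipping the diagonal."""
    n = len(adj)
    return [adj[i][j] for i in range(n) for j in range(n) if i != j]


def accuracy_and_kappa(a: Matrix, b: Matrix) -> Tuple[float, float]:
    """Observed agreement (accuracy) and Cohen's Kappa between two networks."""
    x, y = edge_labels(a), edge_labels(b)
    if len(x) != len(y) or not x:
        raise ValueError("networks must have the same size and at least two nodes")
    n = len(x)
    p_observed = sum(1 for u, v in zip(x, y) if u == v) / n
    px, py = sum(x) / n, sum(y) / n
    # Agreement expected by chance from the two edge densities.
    p_expected = px * py + (1 - px) * (1 - py)
    if p_expected == 1.0:
        # Both networks are identically empty or full; Kappa is undefined,
        # and returning 1.0 is a convention chosen for this sketch.
        return p_observed, 1.0
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    reference = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
    estimate = [[0, 1, 1], [0, 0, 1], [1, 0, 0]]
    print(accuracy_and_kappa(reference, estimate))
```

Unlike plain accuracy, Kappa discounts the agreement that two sparse networks would reach by chance simply because most possible links are absent in both.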

Computer science, Models Neurological, Structure (category theory), Biomedical Engineering, Signal Processing; Biomedical Engineering; 1707; Health Informatics, Health Informatics, 02 engineering and technology, computer.software_genre, Measure (mathematics), Surrogate data, Data modeling, 03 medical and health sciences, Analysis of Variance Area Under Curve Brain Brain Mapping Computer Simulation Electroencephalography Humans Nerve Net Signal Processing Computer-Assisted Models Neurological, 0302 clinical medicine, Similarity (network science), 0202 electrical engineering electronic engineering information engineering, Humans, Computer Simulation, Sensitivity (control systems), 1707, Analysis of Variance, Brain Mapping, Brain, Electroencephalography, Signal Processing Computer-Assisted, Area Under Curve, Signal Processing, 020201 artificial intelligence & image processing, Data mining, Nerve Net, computer, 030217 neurology & neurosurgery, Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference

Set similarity joins on mapreduce

2018

Set similarity joins, which compute pairs of similar sets, constitute an important operator primitive in a variety of applications, including applications that must process large amounts of data. To handle these data volumes, several distributed set similarity join algorithms have been proposed. Unfortunately, little is known about the relative performance, strengths and weaknesses of these techniques. Previous comparisons are limited to a small subset of relevant algorithms, and the large differences in the various test setups make it hard to draw overall conclusions. In this paper we survey ten recent, distributed set similarity join algorithms, all based on the MapReduce paradigm. We emp…
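For readers unfamiliar with the operator, the sketch below shows what a set similarity join computes, using a naive single-machine all-pairs comparison. It is not one of the distributed MapReduce algorithms the paper surveys; Jaccard similarity is assumed as the similarity function, and the names `jaccard` and `similarity_join` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, List, Set, Tuple


def jaccard(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets are treated as identical."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def similarity_join(records: Dict[str, Set[Hashable]], threshold: float) -> List[Tuple[str, str]]:
    """Return every pair of record ids whose sets reach the similarity threshold."""
    return [
        (id_a, id_b)
        for (id_a, set_a), (id_b, set_b) in combinations(records.items(), 2)
        if jaccard(set_a, set_b) >= threshold
    ]


if __name__ == "__main__":
    data = {"r1": {1, 2, 3}, "r2": {2, 3, 4}, "r3": {7, 8}}
    print(similarity_join(data, 0.5))
```

This baseline compares all pairs, so its cost grows quadratically with the number of records, which is the scaling problem that motivates distributed approaches for large data volumes.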

Computer science, Process (engineering), General Engineering, Joins, Scale (descriptive set theory), 02 engineering and technology, computer.software_genre, Set (abstract data type), Range (mathematics), Operator (computer programming), Similarity (network science), 020204 information systems, 0202 electrical engineering electronic engineering information engineering, Join (sigma algebra), 020201 artificial intelligence & image processing, Data mining, computer, Proceedings of the VLDB Endowment

A naive relevance feedback model for content-based image retrieval using multiple similarity measures

2010

This paper presents a novel probabilistic framework to process multiple sample queries in content based image retrieval (CBIR). This framework is independent from the underlying distance or (dis)similarity measures which support the retrieval system, and only assumes mutual independence among their outcomes. The proposed framework gives rise to a relevance feedback mechanism in which positive and negative data are combined in order to optimally retrieve images according to the available information. A particular setting in which users interactively supply feedback and iteratively retrieve images is set both to model the system and to perform some objective performance measures. Several repo…

Computer science, Relevance feedback, Content-based image retrieval, computer.software_genre, Similitude, Set (abstract data type), Similarity (network science), Artificial Intelligence, Signal Processing, Computer Vision and Pattern Recognition, Data mining, Image retrieval, computer, Software, Independence (probability theory), Pattern Recognition

Classification Similarity Learning Using Feature-Based and Distance-Based Representations: A Comparative Study

2015

Automatically measuring the similarity between a pair of objects is a common and important task in the machine learning and pattern recognition fields. Having been an object of study for decades, it has lately received increasing interest from the scientific community. Usually, the proposed solutions have used either a feature-based or a distance-based representation to perform learning and classification tasks. This article presents the results of a comparative experimental study between these two approaches for computing similarity scores using a classification-based method. In particular, we use the Support Vector Machine as a flexible combiner both for a high dimensional feature space and …

Computer science, business.industry, Feature vector, Pattern recognition, Machine learning, computer.software_genre, Distance measures, Support vector machine, Artificial Intelligence, Feature based, Artificial intelligence, business, Image retrieval, computer, Classifier (UML), Similarity learning, Distance based, Applied Artificial Intelligence

Interactive Image Retrieval Using Smoothed Nearest Neighbor Estimates

2010

Relevance feedback has been adopted by most recent Content Based Image Retrieval systems to reduce the semantic gap that exists between the subjective similarity among images and the similarity measures computed in a given feature space. Distance-based relevance feedback using nearest neighbors has been recently presented as a good tradeoff between simplicity and performance. In this paper, we analyse some shortcomings of this technique and propose alternatives that help improve the efficiency of the method in terms of the retrieval precision achieved. The resulting method has been evaluated on several repositories which use different feature sets. The results have been compared to those obt…

Computer science, business.industry, Feature vector, Relevance feedback, Pattern recognition, Content-based image retrieval, computer.software_genre, k-nearest neighbors algorithm, Similarity (network science), Feature (computer vision), Visual Word, Artificial intelligence, Data mining, business, Image retrieval, computer
