Search results for "Abstract data type"

showing 10 items of 1140 documents

Temporal aggregation in chain graph models

2005

The dependence structure of an observed process induced by temporal aggregation of a time evolving hidden spatial phenomenon is addressed. Data are described by means of chain graph models and an algorithm to compute the chain graph resulting from the temporal aggregation of a directed acyclic graph is provided. This chain graph is the best graph which covers the independencies of the resulting process within the chain graph class. A sufficient condition that produces a memory loss of the observed process with respect to its hidden origin is analyzed. Some examples are used for illustrating algorithms and results.

Statistics and Probability; Applied Mathematics; Voltage graph; Directed graph; Strength of a graph; Topology; Graph (abstract data type); Statistics Probability and Uncertainty; Null graph; Graph property; Algorithm; Complement graph; MathematicsofComputing_DISCRETEMATHEMATICS; Moral graph; Mathematics; Journal of Statistical Planning and Inference
researchProduct

A penalized approach for the bivariate ordered logistic model with applications to social and medical data

2018

Bivariate ordered logistic models (BOLMs) are appealing to jointly model the marginal distribution of two ordered responses and their association, given a set of covariates. When the number of categories of the responses increases, the number of global odds ratios to be estimated also increases, and estimation gets problematic. In this work we propose a non-parametric approach for the maximum likelihood (ML) estimation of a BOLM, wherein penalties to the differences between adjacent row and column effects are applied. Our proposal is then compared to the Goodman and Dale models. Some simulation results as well as analyses of two real data sets are presented and discussed.

Statistics and Probability; Association (object-oriented programming); 05 social sciences; Dale model; Bivariate analysis; Logistic regression; 01 natural sciences; bivariate ordered logistic model; Set (abstract data type); 010104 statistics & probability; ordinal association; penalized maximum likelihood estimation; 0502 economics and business; Statistics; Covariate; Dale model bivariate ordered logistic model penalized maximum likelihood estimation ordinal association; Settore SECS-S/05 - Statistica Sociale; 0101 mathematics; Statistics Probability and Uncertainty; Marginal distribution; Settore SECS-S/01 - Statistica; 050205 econometrics; Mathematics; Ordinal association
researchProduct

A model-based approach to Spotify data analysis: a Beta GLMM

2020

Digital music distribution is increasingly powered by automated mechanisms that continuously capture, sort and analyze large amounts of Web-based data. This paper deals with the management of songs' audio features from a statistical point of view. In particular, it explores the data-capture mechanisms enabled by the Spotify Web API and suggests statistical tools for the analysis of these data. Special attention is devoted to song popularity, and a Beta model including random effects is proposed in order to give a first answer to questions like: what are the determinants of popularity? The identification of a model able to describe this relationship, the determination within the set of char…

Statistics and Probability; Beta GLMM; Distribution (number theory); Computer science; Application Notes; 0211 other engineering and technologies; 02 engineering and technology; computer.software_genre; Web API; 01 natural sciences; Set (abstract data type); 010104 statistics & probability; Spotify Web API audio features Popularity Index Beta GLMM; sort; Spotify Web API; 0101 mathematics; Digital audio; 021103 operations research; Point (typography); Random effects model; Data science; Popularity; Identification (information); Popularity Index; Data mining; Statistics Probability and Uncertainty; computer; audio feature
researchProduct

Multiple testing in candidate gene situations: a comparison of classical, discrete, and resampling-based procedures.

2011

In candidate gene association studies, usually several elementary hypotheses are tested simultaneously using one particular set of data. The data normally consist of partly correlated SNP information. Every SNP can be tested for association with the disease, e.g., using the Cochran-Armitage test for trend. To account for the multiplicity of the test situation, different types of multiple testing procedures have been proposed. The question arises whether procedures taking into account the discreteness of the situation show a benefit especially in case of correlated data. We empirically evaluate several different multiple testing procedures via simulation studies using simulated correlated SN…

Statistics and Probability; Candidate gene; Contrast (statistics); computer.software_genre; Polymorphism Single Nucleotide; Set (abstract data type); Computational Mathematics; Sample size determination; Resampling; Data Interpretation Statistical; Sample Size; Statistics; Multiple comparisons problem; Genetics; Cochran–Armitage test for trend; Range (statistics); Humans; Computer Simulation; Disease; Data mining; Molecular Biology; computer; Genetic Association Studies; Mathematics; Statistical applications in genetics and molecular biology
researchProduct

Consensus among preference rankings: a new weighted correlation coefficient for linear and weak orderings

2021

Preference data are a particular type of ranking data in which subjects (voters, judges, ...) express their preferences over a set of alternatives (items). In most real-life cases, some items receive the same preference from a judge, giving rise to a ranking with ties. An important issue involving rankings concerns the aggregation of the preferences into a “consensus”. The purpose of this paper is to investigate the consensus between rankings with ties, taking into account the importance of swapping elements belonging to the top (or to the bottom) of the ordering (position weights). By combining the structure of $\tau_x$ proposed by Emond and Mason (J Multi-Criteria Decis…

Statistics and Probability; Class (set theory); Correlation coefficient; Applied Mathematics; 02 engineering and technology; Type (model theory); 01 natural sciences; Computer Science Applications; Set (abstract data type); 010104 statistics & probability; Ranking; Position (vector); Statistics; Weighted Rank correlation coefficient Weighted Kemeny distance Position weights; Ties; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; 0101 mathematics; Settore SECS-S/01 - Statistica; Preference (economics); Mathematics; Rank correlation; Advances in Data Analysis and Classification
researchProduct

Community detection algorithm evaluation with ground-truth data

2018

Community structure is of paramount importance for the understanding of complex networks. Consequently, there is a tremendous effort to develop efficient community detection algorithms. Unfortunately, the fair assessment of these algorithms remains an open question. If the ground-truth community structure is available, various clustering-based metrics are used to compare it with the one discovered by these algorithms. However, these metrics defined at the node level are fairly insensitive to the variation of the overall community structure. To overcome these limitations, we propose to exploit the topological features of the ‘communit…

Statistics and Probability; Computer science; ‘Community-graph’; Community structure; Variation (game tree); [INFO.INFO-RO]Computer Science [cs]/Operations Research [cs.RO]; Complex network; Condensed Matter Physics; 01 natural sciences; Graph; 010305 fluids & plasmas; Community structure; Set (abstract data type); 0103 physical sciences; Network analysis; 010306 general physics; Cluster analysis; Algorithm; Network analysis
researchProduct

MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study

2019

Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…

Statistics and Probability; Contig; Computer science; Robustness (evolution); Computational biology; Original Papers; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Computational Theory and Mathematics; Metagenomics; Reference genes; Gene family; Human virome; Cluster analysis; Molecular Biology; Bioinformatics
researchProduct

Stochastic Learning for SAT-Encoded Graph Coloring Problems

2010

The graph coloring problem (GCP) is a widely studied combinatorial optimization problem due to its numerous applications in many areas, including timetabling, frequency assignment, and register allocation. The need for more efficient algorithms has led to the development of several GC solvers. In this paper, the authors introduce a team of Finite Learning Automata, combined with the random walk algorithm, using Boolean satisfiability encoding for the GCP. The authors present an experimental analysis of the new algorithm’s performance compared to the random walk technique, using a benchmark set containing SAT-encoded graph coloring test sets.

Statistics and Probability; Discrete mathematics; Control and Optimization; Theoretical computer science; Comparability graph; Computer Science Applications; Greedy coloring; Computational Mathematics; Edge coloring; Computational Theory and Mathematics; Modeling and Simulation; Graph (abstract data type); Decision Sciences (miscellaneous); Graph coloring; Fractional coloring; Graph factorization; List coloring; Mathematics; International Journal of Applied Metaheuristic Computing
researchProduct

Analyzing Temperature Effects on Mortality Within the R Environment: The Constrained Segmented Distributed Lag Parameterization

2010

Here we present and discuss the R package modTempEff, which includes a set of functions aimed at modelling temperature effects on mortality with time series data. The functions fit a particular log-linear model that captures the two main features of mortality-temperature relationships: nonlinearity and distributed lag effect. Penalized splines and segmented regression constitute the core of the modelling framework. We briefly review the model and illustrate the functions through a simulated dataset.

Statistics and Probability; Distributed lag; temperature effects segmented relationship break point P-splines R; Mathematical optimization; Computer science; P-splines; R; segmented relationship; Set (abstract data type); R package; Nonlinear system; Break point; Applied mathematics; Log-linear model; break point; Statistics Probability and Uncertainty; Segmented regression; Time series; Settore SECS-S/01 - Statistica; temperature effects; lcsh:Statistics; lcsh:HA1-4737; Software; Journal of Statistical Software
researchProduct

Ranking Scientific Journals Via Latent Class Models for Polytomous Item Response Data

2015

Summary: We propose a model-based strategy for ranking scientific journals starting from a set of observed bibliometric indicators that represent imperfect measures of the unobserved ‘value’ of a journal. After discretizing the available indicators, we estimate an extended latent class model for polytomous item response data and use the estimated model to cluster journals. We illustrate our approach by using the data from the Italian research evaluation exercise that was carried out for the period 2004–2010, focusing on the set of journals that are considered relevant for the subarea statistics and financial mathematics. Using four bibliometric indicators (IF, IF5, AIS and the h-index), some…

Statistics and Probability; Economics and Econometric; Economics and Econometrics; Class (set theory); Research evaluation; Clustering; Set (abstract data type); Valutazione della Qualità delle Ricerca; Covariate; Statistics; Econometrics; Finite mixture models; Cluster analysis; Finite mixture model; Mathematics; Graded response model; Mathematical finance; Item response theory models; Item response theory model; Probability and statistics; Latent class model; Ranking; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Social Sciences (miscellaneous); Journal of the Royal Statistical Society Series A: Statistics in Society
researchProduct