Search results for "Tree"
showing 10 items of 1841 documents
WOODIV, a database of occurrences, functional traits, and phylogenetic data for all Euro-Mediterranean trees
2021
Trees play a key role in the structure and function of many ecosystems worldwide. In the Mediterranean Basin, forests cover approximately 22% of the total land area hosting a large number of endemics (46 species). Despite its particularities and vulnerability, the biodiversity of Mediterranean trees is not well known at the taxonomic, spatial, functional, and genetic levels required for conservation applications. The WOODIV database fills this gap by providing reliable occurrences, four functional traits (plant height, seed mass, wood density, and specific leaf area), and sequences from three DNA-regions (rbcL, matK, and trnH-psbA), together with modelled occurrences and a phylogeny for all…
Weighted distance-based trees for ranking data
2017
Within the framework of preference rankings, the interest can lie in finding which predictors and which interactions are able to explain the observed preference structures, because preference decisions will usually depend on the characteristics of both the judges and the objects being judged. This work proposes the use of a univariate decision tree for ranking data based on the weighted distances for complete and incomplete rankings, and considers the area under the ROC curve both for pruning and model assessment. Two real and well-known datasets, the SUSHI preference data and the University ranking data, are used to display the performance of the methodology.
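As a rough illustration of what a distance between rankings looks like, the sketch below computes the plain (unweighted) Kendall distance between two complete rankings. This is a simplified stand-in rather than the weighted distance the abstract refers to, and the function name and the dictionary layout (item mapped to rank position) are assumptions made for the example.

```python
from itertools import combinations


def kendall_distance(rank_a, rank_b):
    """Count the item pairs that two complete rankings order differently.

    rank_a, rank_b: dicts mapping item -> rank position (1 = most preferred),
    with no ties. This is the unweighted Kendall distance, used here only as
    an illustrative stand-in for a weighted distance.
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must cover the same items")
    discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0:
            discordant += 1
    return discordant


# Hypothetical judges ranking three sushi items.
judge_1 = {"tuna": 1, "eel": 2, "squid": 3}
judge_2 = {"tuna": 2, "eel": 1, "squid": 3}
print(kendall_distance(judge_1, judge_2))
```

A tree for ranking data would use a distance of this kind to measure how homogeneous the rankings in a candidate node are; the weighting scheme and the handling of incomplete rankings are not shown here.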
Reducing the effect of the data order in algorithms for constructing phylogenetic trees.
1988
Metagenomics reveals our incomplete knowledge of global diversity
2008
Metagenomic sequencing obtains huge amounts of sequences from environmental and clinical samples, thus providing a glimpse of the global prokaryotic diversity of both species and genes in these sources. The current trend in metagenomic analysis follows the so-called gene-centric approach, focused on describing the environments by the study of the functional roles of the proteins encoded in the sequenced genes. In this way, it is clear that metagenomic analysis relies heavily on the accurate knowledge of the universe of proteins stored in the databases. Nevertheless, it is known that some biases exist in the composition of databases (which are rich in sequences from common, cultivable and ea…
Statistically validated hierarchical clustering: Nested partitions in hierarchical trees
2022
We develop an algorithm that is fast and scalable in the detection of a nested partition extracted from a dendrogram that is obtained from hierarchical clustering of a multivariate series. Our algorithm provides a p-value for each clade observed in the hierarchical tree. The p-value is obtained by computing many bootstrap replicas of the dissimilarity matrix and by performing a statistical test on each difference between the dissimilarity associated with a given clade and the dissimilarity of the clade of its parent node. We prove the efficacy of our algorithm with a set of benchmarks generated by a hierarchically nested factor model. We compare results obtained by our algorithm with those of…
Local bandwidth selection for kernel density estimation in a bifurcating Markov chain model
2020
We propose an adaptive estimator for the stationary distribution of a bifurcating Markov chain on $\mathbb{R}^d$. Bifurcating Markov chains (BMC for short) are a class of stochastic processes indexed by regular binary trees. A kernel estimator is proposed whose bandwidths are selected by a method inspired by the works of Goldenshluger and Lepski [(2011), 'Bandwidth Selection in Kernel Density Estimation: Oracle Inequalities and Adaptive Minimax Optimality', The Annals of Statistics 3: 1608-1632]. Drawing inspiration from dimension jump methods for model selection, we also provide an algorithm to select the best constant in the penalty. Finally, we investigate the performance of the…
A geostatistical approach for dynamic life tables: The effect of mortality on remaining lifetime and annuities
2010
Dynamic life tables arise as an alternative to the standard (static) life table, with the aim of incorporating the evolution of mortality over time. The parametric model introduced by Lee and Carter in 1992 for projected mortality rates in the US is one of the most outstanding and has been used a great deal since then. Different versions of the model have been developed but all of them, together with other parametric models, consider the observed mortality rates as independent observations. This is a difficult hypothesis to justify when looking at the graph of the residuals obtained with any of these methods. Methods of adjustment and prediction based on geostatistical techniques which expl…
Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices
1999
We describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices. The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. Given an $n \times n$ text matrix $A$, we build our data structures in $O(\log n)$ time with $n^2$ processors on a CRCW PRAM, so that we can quickly process $A$ in parallel as follows: (i) report some statistical information about $A$, e.g., find the largest repeated square submatrices that appear at least twice in $A$ or determine, for each position in $A$, the smallest submatrix that occurs only there; (ii) given, on-line, an $m \times m$ pattern matrix $PAT$, check whether it occurs i…
Classification trees for multivariate ordinal response: an application to Student Evaluation Teaching
2016
Data from multiple items on an ordinal scale are commonly collected when qualitative variables, such as feelings, attitudes and many other behavioral and health-related variables, are observed. In this paper we introduce a method to derive a distance-based tree for multivariate ordinal response that makes it possible, when subject-specific characteristics are available, to derive common profiles for respondents giving the same or similar multivariate ratings. Special attention will be paid to the performance comparison in terms of AUC for three different distances used as splitting criteria. Simulated data and a dataset from a Student Evaluation of Teaching survey will be used as illustrative examples. Th…
Degree stability of a minimum spanning tree of price return and volatility
2002
We investigate the time series of the degree of minimum spanning trees obtained by a correlation-based clustering procedure starting from (i) asset return and (ii) volatility time series. The minimum spanning tree is obtained at different times by computing the correlation among time series over a time window of fixed length $T$. We find that the minimum spanning tree of asset returns is characterized by stock degree values that are more stable in time than those obtained from a minimum spanning tree computed from volatility time series. Our analysis also shows that the degree of stocks has very slow dynamics, with a time scale of several years in both cases.
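The procedure described in this abstract can be sketched for a single time window as follows. This is a minimal sketch under stated assumptions: the abstract does not say how correlations are turned into distances, so the code assumes the commonly used transformation d = sqrt(2(1 - rho)), and it builds the tree with Kruskal's algorithm. The function names and the toy series are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mst_degrees(series):
    """Build a minimum spanning tree over the series in one time window.

    series: dict mapping a name to its values inside the window.
    Returns (tree_edges, degree), where degree[name] is the node degree.
    """
    names = list(series)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            rho = correlation(series[names[i]], series[names[j]])
            # Assumed distance: d = sqrt(2 * (1 - rho)); not stated in the abstract.
            dist = math.sqrt(max(0.0, 2.0 * (1.0 - rho)))
            edges.append((dist, names[i], names[j]))
    edges.sort()

    parent = {name: name for name in names}

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    degree = {name: 0 for name in names}
    tree = []
    for _, u, v in edges:  # Kruskal: add the shortest edge that joins two components
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            degree[u] += 1
            degree[v] += 1
            tree.append((u, v))
    return tree, degree


# Toy window with four hypothetical series.
window = {
    "a": [1, 2, 3, 4, 5],
    "b": [1.1, 2.0, 3.2, 3.9, 5.1],
    "c": [5, 4, 3, 2, 1],
    "d": [1, 3, 2, 5, 4],
}
tree, degree = mst_degrees(window)
print(tree, degree)
```

Repeating this over successive windows of length $T$ would give a time series of degrees for each stock, which is the quantity whose stability the abstract discusses.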