Search results for "High-dimension"
showing 10 items of 43 documents
A fast and recursive algorithm for clustering large datasets with k-medians
2012
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
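As a rough illustration of the recursive idea in this abstract, the following minimal sketch processes observations one at a time, moves the nearest center a small step along the unit direction toward the new point (the stochastic gradient of the $k$-medians loss), and keeps a running average of each center's iterates. The function name `recursive_k_medians`, the step-size rule `c / n**alpha`, and the default values of `c` and `alpha` are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def recursive_k_medians(stream, init_centers, c=1.0, alpha=0.75):
    """Sequential k-medians sketch (hypothetical helper, not the authors' code).

    stream       : iterable of points (sequences of floats)
    init_centers : starting centers, one per cluster
    c, alpha     : assumed step-size constants; step = c / n**alpha, where n is
                   the number of points assigned so far to the updated center
    """
    centers = [list(x) for x in init_centers]
    averaged = [list(x) for x in init_centers]  # running mean of the iterates
    counts = [0] * len(centers)

    for z in stream:
        # Assign the new observation to its nearest current center.
        dists = [math.dist(z, x) for x in centers]
        r = min(range(len(centers)), key=dists.__getitem__)
        counts[r] += 1
        d = dists[r]

        # Gradient step: move by `step` along the unit vector toward z.
        # Skipped when z coincides with the center (direction undefined).
        if d > 0.0:
            step = c / counts[r] ** alpha
            centers[r] = [x + step * (zi - x) / d for x, zi in zip(centers[r], z)]

        # Update the running mean over the initial center and all iterates so far.
        n = counts[r]
        averaged[r] = [a + (x - a) / (n + 1) for a, x in zip(averaged[r], centers[r])]

    return centers, averaged
```

Each observation is touched once and only one center is updated per step, so the cost per observation is linear in the number of centers and the dimension, which is what makes this kind of scheme suitable for data arriving sequentially.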
The conditional censored graphical lasso estimator
2020
In many applied fields, such as genomics, different types of data are collected on the same system, and it is not uncommon that some of these datasets are subject to censoring as a result of the measurement technologies used, such as data generated by polymerase chain reactions and flow cytometers. When the overall objective is that of network inference, at possibly different levels of a system, information coming from different sources and/or different steps of the analysis can be integrated into one model with the use of conditional graphical models. In this paper, we develop a doubly penalized inferential procedure for…
Extended differential geometric LARS for high-dimensional GLMs with general dispersion parameter
2018
A large class of modeling and prediction problems involves outcomes that belong to an exponential family distribution. Generalized linear models (GLMs) are a standard way of dealing with such situations. Even in high-dimensional feature spaces, GLMs can be extended to deal with such situations. Penalized inference approaches, such as the $\ell_1$ or SCAD penalties, or extensions of least angle regression, such as dgLARS, have been proposed to deal with GLMs with high-dimensional feature spaces. Although the theory underlying these methods is in principle generic, the implementation has remained restricted to dispersion-free models, such as the Poisson and logistic regression models. The aim of this…
Multivariate nonparametric tests of independence
2005
New test statistics are proposed for testing whether two random vectors are independent. Gieser and Randles, as well as Taskinen, Kankainen, and Oja have introduced and discussed multivariate extensions of the quadrant test of Blomqvist. This article serves as a sequel to this work and presents new multivariate extensions of Kendall's tau and Spearman's rho statistics. Two different approaches are discussed. First, interdirection proportions are used to estimate the cosines of angles between centered observation vectors and between differences of observation vectors. Second, covariances between affine-equivariant multivariate signs and ranks are used. The test statistics arising from these …
A Software Tool For Sparse Estimation Of A General Class Of High-dimensional GLMs
2022
Generalized linear models are the workhorse of many inferential problems. Even in the modern era of high-dimensional settings, such models have proven to be effective exploratory tools. Most attention has been paid to Gaussian, binomial and Poisson settings, which have efficient computational implementations and where the dispersion parameter is either largely irrelevant or absent. However, general GLMs have dispersion parameters φ that affect the value of the log-likelihood. This, in turn, affects the value of various information criteria such as AIC and BIC, and has a considerable impact on the computation and selection of the optimal model. The R-package dglars is one of the standa…
cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values
2023
Sparse graphical models have revolutionized multivariate inference. With the advent of high-dimensional multivariate data in many applied fields, these methods are able to detect a much lower-dimensional structure, often represented via a sparse conditional independence graph. There have been numerous extensions of such methods in the past decade. Many practical applications have additional covariates or suffer from missing or censored data. Despite the development of these extensions of sparse inference methods for graphical models, there have so far been no implementations for, e.g., conditional graphical models. Here we present the general-purpose package cglasso for estimating sparse co…
Structure of equilibrium states on self-affine sets and strict monotonicity of affinity dimension
2016
A fundamental problem in the dimension theory of self‐affine sets is the construction of high‐dimensional measures which yield sharp lower bounds for the Hausdorff dimension of the set. A natural strategy for the construction of such high‐dimensional measures is to investigate measures of maximal Lyapunov dimension; these measures can be alternatively interpreted as equilibrium states of the singular value function introduced by Falconer. While the existence of these equilibrium states has been well known for some years, their structure has remained elusive, particularly in dimensions higher than two. In this article we give a complete description of the equilibrium states of the singular va…
Using Differential Geometry for Sparse High-Dimensional Risk Regression Models
2023
With the introduction of high-throughput technologies in clinical and epidemiological studies, the need for inferential tools that are able to deal with fat data structures, i.e., a relatively small number of observations compared to the number of features, is becoming more prominent. In this paper we propose an extension of the dgLARS method to high-dimensional risk regression models. The main idea of the proposed method is to use the differential geometric structure of the partial likelihood function in order to select the optimal subset of covariates.
Investigation, Realization, and Entanglement Characterization of Complex Optical Quantum States
2020
Meeting the Challenges of High-Dimensional Single-Cell Data Analysis in Immunology
2019
Recent advances in cytometry have radically altered the fate of single-cell proteomics by allowing a more accurate understanding of complex biological systems. Mass cytometry (CyTOF) provides simultaneous single-cell measurements that are crucial to understand cellular heterogeneity and identify novel cellular subsets. High-dimensional CyTOF data were traditionally analyzed by gating on bivariate dot plots, an approach that is not only laborious, given the quadratic increase of complexity with dimension, but also biased through manual gating. This review aims to discuss the impact of new analysis techniques for in-depth insights into the dynamics of immune regulation obtained from static snapshot …