Search results for "sparse"

showing 10 items of 75 documents

Do Country Stereotypes Exist in PISA? A Clustering Approach for Large, Sparse, and Weighted Data

2015

Certain stereotypes can be associated with people from different countries. For example, the Italians are expected to be emotional, the Germans functional, and the Chinese hard-working. In this study, we cluster all 15-year-old students representing the 68 different nations and territories that participated in the latest Programme for International Student Assessment (PISA 2012). The hypothesis is that the students will start to form their own country groups when clustered according to the scale indices that summarize many of the students’ characteristics. In order to meet PISA data analysis requirements, we use a novel combination of our previously published algorithmic components to reali…

Sparse Cluster Indices; Country Stereotype; PISA; Weighted Clustering
researchProduct

Analysing Student Performance using Sparse Data of Core Bachelor Courses

2015

Curricula for Computer Science (CS) degrees are characterized by the strong occupational orientation of the discipline. In the BSc degree structure, with clearly separate CS core studies, the learning skills for these and other required courses may vary a lot, which is shown in students' overall performance. To analyze this situation, we apply nonstandard educational data mining techniques on a preprocessed log file of the passed courses. The joint variation in the course grades is studied through correlation analysis while intrinsic groups of students are created and analyzed using a robust clustering technique. Since not all students attended all courses, there is a nonstructured sparsity…

Sparse Educational Data; curricula refinement; Correlation Analysis; kolmiomittaus; sparse educational data; correlation analysis; Robust Clustering; ComputingMilieux_COMPUTERSANDEDUCATION; multilayer perceptron; Multilayer Perceptron; triangulation; Curricula Refinement; robust clustering
researchProduct

CUDA-enabled Sparse Matrix–Vector Multiplication on GPUs using atomic operations

2013

We propose the Sliced Coordinate Format (SCOO) for Sparse Matrix-Vector Multiplication on GPUs. An associated CUDA implementation which takes advantage of atomic operations is presented. We propose partitioning methods to transform a given sparse matrix into SCOO format. An efficient Dual-GPU implementation which overlaps computation and communication is described. Extensive performance comparisons of SCOO compared to other formats on GPUs and CPUs are provided. Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU are outperforming their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA i…

Speedup; Computer Networks and Communications; Computer science; Sparse matrix-vector multiplication; Parallel computing; Computer Graphics and Computer-Aided Design; Theoretical Computer Science; Matrix (mathematics); CUDA; Artificial Intelligence; Hardware and Architecture; Benchmark (computing); Multiplication; General-purpose computing on graphics processing units; Software; Sparse matrix; Parallel Computing
researchProduct
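
The abstract above concerns sparse matrix-vector multiplication with a coordinate-style format and atomic operations. As a rough illustration only, the sketch below multiplies a matrix stored in plain coordinate (COO) format by a vector in sequential Python. It is not the SCOO format or the authors' CUDA implementation; the function name `spmv_coo` and the example matrix are assumptions made for this sketch.

```python
def spmv_coo(n_rows, rows, cols, vals, x):
    """Compute y = A @ x for a matrix A stored as COO triplets (hypothetical helper)."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        # Several nonzeros can target the same output row. In a parallel
        # version with one thread per nonzero, this accumulation is the
        # step that would need an atomic add.
        y[r] += v * x[c]
    return y


# Example matrix (assumed for illustration):
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
rows = [0, 0, 1, 2, 2]
cols = [0, 2, 1, 0, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_coo(3, rows, cols, vals, x)
print(y)
```

Each stored nonzero contributes one multiply-add, so the work is proportional to the number of nonzeros rather than to the full matrix size.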

Extending graphical models for applications: on covariates, missingness and normality

2021

The authors of the paper “Bayesian Graphical Models for Modern Biological Applications” have put forward an important framework for making graphical models more useful in applied settings. In this discussion paper, we give a number of suggestions for making this framework even more suitable for practical scenarios. Firstly, we show that an alternative and simplified definition of covariate might make the framework more manageable in high-dimensional settings. Secondly, we point out that the inclusion of missing variables is important for practical data analysis. Finally, we comment on the effect that the Gaussianity assumption has in identifying the underlying conditional independence graph…

Statistics and Probability; Computer science; media_common.quotation_subject; Missing data; Conditional graphical models; Copula graphical models; Covariate; Econometrics; Sparse inference; Graphical model; Statistics, Probability and Uncertainty; Normality; media_common
researchProduct

dglars: An R Package to Estimate Sparse Generalized Linear Models

2014

dglars is a publicly available R package that implements the method proposed in Augugliaro, Mineo, and Wit (2013), developed to study the sparse structure of a generalized linear model. This method, called dgLARS, is based on a differential geometrical extension of the least angle regression method proposed in Efron, Hastie, Johnstone, and Tibshirani (2004). The core of the dglars package consists of two algorithms implemented in Fortran 90 to efficiently compute the solution curve: a predictor-corrector algorithm, proposed in Augugliaro et al. (2013), and a cyclic coordinate descent algorithm, proposed in Augugliaro, Mineo, and Wit (2012). The latter algorithm, as shown here, is significan…

Statistics and Probability; Generalized linear model; EXPRESSION; Mathematical optimization; TISSUES; Fortran; cyclic coordinate descent algorithm; dgLARS; Feature selection; DANTZIG SELECTOR; predictor-corrector algorithm; LIKELIHOOD; LEAST ANGLE REGRESSION; sparse models; Differential (infinitesimal); differential geometry; lcsh:Statistics; lcsh:HA1-4737; computer.programming_language; Mathematics; Least-angle regression; Extension (predicate logic); Expression (computer science); generalized linear models; BREAST-CANCER RISK; VARIABLE SELECTION; Differential geometry; differential geometry generalized linear models dgLARS predictor-corrector algorithm cyclic coordinate descent algorithm sparse models variable selection.; MARKER; SHRINKAGE; Statistics, Probability and Uncertainty; HAPLOTYPES; Settore SECS-S/01 - Statistica; computer; Algorithm; Software
researchProduct

Differential geometric least angle regression: a differential geometric approach to sparse generalized linear models

2013

Summary Sparsity is an essential feature of many contemporary data problems. Remote sensing, various forms of automated screening and other high throughput measurement devices collect a large amount of information, typically about few independent statistical subjects or units. In certain cases it is reasonable to assume that the underlying process generating the data is itself sparse, in the sense that only a few of the measured variables are involved in the process. We propose an explicit method of monotonically decreasing sparsity for outcomes that can be modelled by an exponential family. In our approach we generalize the equiangular condition in a generalized linear model. Although the …

Statistics and Probability; Generalized linear model; Sparse model; Mathematical optimization; Generalized linear models; Variable selection; Path following algorithm; Equiangular polygon; LASSO; DANTZIG SELECTOR; symbols.namesake; Exponential family; Lasso (statistics); Sparse models; Differential geometry; Information geometry; COORDINATE DESCENT; Fisher information; ERROR; Mathematics; Least-angle regression; Least angle regression; Generalized degrees of freedom; symbols; SHRINKAGE; Statistics, Probability and Uncertainty; Simple linear regression; Settore SECS-S/01 - Statistica; Algorithm; Covariance penalty theory
researchProduct

The smallest singular value of a shifted $d$-regular random square matrix

2017

We derive a lower bound on the smallest singular value of a random d-regular matrix, that is, the adjacency matrix of a random d-regular directed graph. Specifically, let $C_1 < d < c n/\log^2 n$ and let $\mathcal{M}_{n,d}$ be the set of all $n\times n$ square matrices with 0/1 entries, such that each row and each column of every matrix in $\mathcal{M}_{n,d}$ has exactly $d$ ones. Let $M$ be a random matrix uniformly distributed on $\mathcal{M}_{n,d}$. Then the smallest singular value $s_n(M)$ of $M$ is greater than $n^{-6}$ with probability at least $1 - C_2\log^2 d/\sqrt{d}$, where $c$, $C_1$, and $C_2$ are absolute positive constants independent of any other parameter…

Statistics and Probability; Identity matrix; Adjacency matrices; 01 natural sciences; Square matrix; Combinatorics; 010104 statistics & probability; Matrix (mathematics); Mathematics::Algebraic Geometry; FOS: Mathematics; Mathematics - Combinatorics; 60B20 15B52 46B06 05C80; Adjacency matrix; 0101 mathematics; Condition number; Mathematics; Random graphs; Random graph; Littlewood–Offord theory; Singularity; 010102 general mathematics; Probability (math.PR); Invertibility; Regular graphs; Singular value; Smallest singular value; Anti-concentration; Singular probability; Sparse matrices; Combinatorics (math.CO); Statistics, Probability and Uncertainty; Random matrices; Random matrix; Mathematics - Probability; Analysis
researchProduct
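
The bound stated in the abstract above can be collected into a single display. This only restates the abstract's own symbols ($d$, $n$, $\mathcal{M}_{n,d}$, $s_n(M)$, $c$, $C_1$, $C_2$); the notation $\mathrm{Unif}$ for the uniform distribution is an assumption added here, and the truncated remainder of the abstract is not reconstructed.

```latex
\[
  C_1 < d < \frac{c\,n}{\log^{2} n},
  \quad M \sim \mathrm{Unif}\bigl(\mathcal{M}_{n,d}\bigr)
  \;\Longrightarrow\;
  \mathbb{P}\bigl( s_n(M) > n^{-6} \bigr)
  \;\ge\; 1 - \frac{C_2 \log^{2} d}{\sqrt{d}} .
\]
```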

The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.

2020

This paper focuses on hypothesis testing in lasso regression, when one is interested in judging the statistical significance of the regression coefficients in a regression equation involving many covariates. To get reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing, which makes it relatively easy to obtain an appropriate covariance matrix and Wald statistic. Some simulation experiments reveal that our approach performs well when compared with recent inferential tools in the lasso framework. Two real data analyses are presented to illustrate the proposed framework in practice.

Statistics and Probability; Statistics::Theory; Induced smoothing; Epidemiology; Computer science; Feature selection; Wald test; 01 natural sciences; asthma research; Statistics::Machine Learning; 010104 statistics & probability; 03 medical and health sciences; Health Information Management; Lasso (statistics); Linear regression; sparse models; Statistics::Methodology; Computer Simulation; 0101 mathematics; sandwich formula; 030304 developmental biology; Statistical hypothesis testing; 0303 health sciences; Covariance matrix; lung function; Regression analysis; Statistics::Computation; sparse model; Research Design; Algorithm; Smoothing; variable selection; Statistical methods in medical research
researchProduct

Adaptive sparse representation of continuous input for Tsetlin Machines based on stochastic searching on the line

2021

This paper introduces a novel approach to representing continuous inputs in Tsetlin Machines (TMs). Instead of using one Tsetlin Automaton (TA) for every unique threshold found when Booleanizing continuous input, we employ two Stochastic Searching on the Line (SSL) automata to learn discriminative lower and upper bounds. The two resulting Boolean features are adapted to the rest of the clause by equipping each clause with its own team of SSLs, which update the bounds during the learning process. Two standard TAs finally decide whether to include the resulting features as part of the clause. In this way, only four automata altogether represent one continuous feature (instead of potentially h…

Stochastic Searching on the Line automaton; Boosting (machine learning); decision support system; TK7800-8360; Computer Networks and Communications; Computer science; Discriminative model; Feature (machine learning); Electrical and Electronic Engineering; Artificial neural network; rule-based learning; interpretable machine learning; interpretable AI; Sparse approximation; Automaton; Random forest; Support vector machine; VDP::Teknologi: 500; Tsetlin Machine; XAI; Hardware and Architecture; Control and Systems Engineering; Signal Processing; Electronics; Tsetlin automata; Algorithm
researchProduct
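
The abstract above contrasts two ways of turning a continuous input into Boolean features: one feature per unique threshold, versus two features defined by a lower and an upper bound. The sketch below illustrates only that difference in feature count. The function names, the comparison directions (`<=`, `>`), and the fixed bound values are assumptions for illustration; in the paper the bounds are learned by Stochastic Searching on the Line automata, which this sketch does not implement.

```python
def booleanize_per_threshold(x, thresholds):
    """One Boolean feature per threshold (assumed encoding: x <= t)."""
    return [x <= t for t in thresholds]


def booleanize_with_bounds(x, lower, upper):
    """Two Boolean features from a lower and an upper bound.

    The bounds are passed in as fixed numbers here; learning them is
    outside the scope of this sketch.
    """
    return [x > lower, x <= upper]


# Hypothetical thresholds and bounds, chosen only for the example.
thresholds = [1.5, 2.0, 3.7, 5.1]
x = 3.0

per_threshold = booleanize_per_threshold(x, thresholds)
with_bounds = booleanize_with_bounds(x, lower=2.0, upper=3.7)

print(len(per_threshold), per_threshold)
print(len(with_bounds), with_bounds)
```

The per-threshold encoding grows with the number of distinct thresholds, while the bound-based encoding always yields two features per continuous input.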

High performance algorithms based on a new wavelet expansion for time dependent acoustics obstacle scattering

2007

This paper presents a highly parallelizable numerical method to solve time dependent acoustic obstacle scattering problems. The method proposed is a generalization of the "operator expansion method" developed by Recchioni and Zirilli [SIAM J. Sci. Comput., 25 (2003), 1158-1186]. The numerical method proposed reduces, via a perturbative approach, the solution of the scattering problem to the solution of a sequence of systems of first kind integral equations. The numerical solution of these systems of integral equations is challenging when scattering problems involving realistic obstacles and small wavelengths are solved. A computational method has been developed to solve these challenging p…

Time dependent acoustic scattering; Helmholtz equation; integral equation methods; wavelet bases; sparse linear systems
researchProduct