Search results for "Data mining"

showing 10 items of 907 documents

Opportunities and challenges of combined effect measures based on prioritized outcomes

2013

Many authors have proposed different approaches to combine multiple endpoints in a univariate outcome measure in the literature. In case of binary or time-to-event variables, composite endpoints, which combine several event types within a single event or time-to-first-event analysis are often used to assess the overall treatment effect. A main drawback of this approach is that the interpretation of the composite effect can be difficult as a negative effect in one component can be masked by a positive effect in another. Recently, some authors proposed more general approaches based on a priority ranking of outcomes, which moreover allow to combine outcome variables of different scale levels. …

Statistics and ProbabilityClinical Trials as TopicEpidemiologyUnivariatecomputer.software_genreOutcome (game theory)Treatment OutcomeRankingScale (social sciences)Component (UML)Outcome Assessment Health CareMultiple comparisons problemHumansComputer SimulationData miningcomputerProportional Hazards ModelsMathematicsStatistical hypothesis testingEvent (probability theory)Statistics in Medicine
researchProduct

An interest rates cluster analysis

2004

An empirical analysis of interest rates in money and capital markets is performed. We investigate a set of 34 different weekly interest rate time series during a time period of 16 years between 1982 and 1997. Our study is focused on the collective behavior of the stochastic fluctuations of these time-series which is investigated by using a clustering linkage procedure. Without any a priori assumption, we individuate a meaningful separation in 6 main clusters organized in a hierarchical structure.

Statistics and ProbabilityCollective behaviormedia_common.quotation_subjectFOS: Physical sciencesLinkage (mechanical)computer.software_genrelaw.inventionFOS: Economics and businesslawEconometricsCluster (physics)Cluster analysisCondensed Matter - Statistical Mechanicsmedia_commonStatistical Finance (q-fin.ST)Statistical Mechanics (cond-mat.stat-mech)EconophysicsSeries (mathematics)Quantitative Finance - Statistical FinanceCondensed Matter PhysicsInterest rateCondensed Matter - Other Condensed MatterData miningCapital marketcomputerOther Condensed Matter (cond-mat.other)
researchProduct

An overview of robust Bayesian analysis

1994

Robust Bayesian analysis is the study of the sensitivity of Bayesian answers to uncertain inputs. This paper seeks to provide an overview of the subject, one that is accessible to statisticians outside the field. Recent developments in the area are also reviewed, though with very uneven emphasis. © 1994 SEIO.

Statistics and ProbabilityComputer scienceBayesian probabilitycomputer.software_genreData scienceField (computer science)Bayesian robustnessN/ARobust Bayesian analysisPrior probabilityData miningSensitivity (control systems)Statistics Probability and Uncertaintycomputer
researchProduct

A new mathematical approach for the estimation of the AUC and its variability under different experimental designs in preclinical studies

2011

The aim of the present work was to develop a new mathematical method for estimating the area under the curve (AUC) and its variability that could be applied in different preclinical experimental designs and amenable to be implemented in standard calculation worksheets. In order to assess the usefulness of the new approach, different experimental scenarios were studied and the results were compared with those obtained with commonly used software: WinNonlin® and Phoenix WinNonlin®. The results do not show statistical differences among the AUC values obtained by both procedures, but the new method appears to be a better estimator of the AUC standard error, measured as the coverage of 95% confi…

Statistics and ProbabilityComputer scienceDrug Evaluation PreclinicalAdministration Oralcomputer.software_genreSoftwareCiprofloxacinArea under curveVariance estimationAnimalsPharmacology (medical)Rats WistarPharmacologyModels Statisticalbusiness.industryDesign of experimentsEstimatorModels TheoreticalConfidence intervalRatsStandard errorResearch DesignArea Under CurveData miningbusinesscomputerSoftwarePharmaceutical Statistics
researchProduct

Pathway analysis of high-throughput biological data within a Bayesian network framework

2011

Abstract Motivation: Most current approaches to high-throughput biological data (HTBD) analysis either perform individual gene/protein analysis or, gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: Proposed method takes into account the connectivity and relatedness between nodes of the p…

Statistics and ProbabilityComputer scienceHigh-throughput screeningGene regulatory networkcomputer.software_genreModels BiologicalBiochemistrySynthetic dataBiological pathwayBayes' theoremHumansGene Regulatory NetworksCarcinoma Renal CellMolecular BiologyGeneBiological dataMicroarray analysis techniquesGene Expression ProfilingBayesian networkRobustness (evolution)Bayes TheoremPathway analysisKidney NeoplasmsHigh-Throughput Screening AssaysComputer Science ApplicationsGene expression profilingComputational MathematicsComputational Theory and MathematicsCausal inferenceData miningcomputerAlgorithmsSoftwareBioinformatics
researchProduct

Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods

2014

Abstract Motivation: Protein–protein interaction (PPI) networks are powerful models to represent the pairwise protein interactions of the organisms. Clustering PPI networks can be useful for isolating groups of interacting proteins that participate in the same biological processes or that perform together specific biological functions. Evolutionary orthologies can be inferred this way, as well as functions and properties of yet uncharacterized proteins. Results: We present an overview of the main state-of-the-art clustering methods that have been applied to PPI networks over the past decade. We distinguish five specific categories of approaches, describe and compare their main features and …

Statistics and ProbabilityComputer sciencePopulationPopulation basedMachine learningcomputer.software_genreBiochemistryProtein protein interaction networkgenetic algorithmsProtein–protein interactionBioinformatics Clustering Biological NetworksPPI networkscomplex detectionProtein Interaction MappingAnimalsCluster AnalysisHumanseducationCluster analysisMolecular BiologyTopology (chemistry)Class (computer programming)education.field_of_studybusiness.industryfood and beveragesProteinsComputer Science ApplicationsComputational MathematicsComputational Theory and MathematicsArtificial intelligenceData miningbusinessFocus (optics)computerAlgorithms
researchProduct

Anthropometry: An R Package for Analysis of Anthropometric Data

2017

The development of powerful new 3D scanning techniques has enabled the generation of large up-to-date anthropometric databases which provide highly valued data to improve the ergonomic design of products adapted to the user population. As a consequence, Ergonomics and Anthropometry are two increasingly quantitative fields, so advanced statistical methodologies and modern software tools are required to get the maximum benefit from anthropometric data. This paper presents a new R package, called Anthropometry, which is available on the Comprehensive R Archive Network. It brings together some statistical methodologies concerning clustering, statistical shape analysis, statistical archetypal an…

Statistics and ProbabilityComputer sciencePopulationstatistical shape analysis02 engineering and technologycomputer.software_genre01 natural sciences010104 statistics & probabilitySoftware0202 electrical engineering electronic engineering information engineeringR; anthropometric data; clustering; statistical shape analysis; archetypal analysis; data depth0101 mathematicsarchetypal analysisCluster analysiseducationlcsh:Statisticslcsh:HA1-4737education.field_of_studyAnthropometric databusiness.industryStatistical shape analysisRHuman factors and ergonomicsAnthropometryanthropometric dataVignette020201 artificial intelligence & image processingData miningStatistics Probability and Uncertaintydata depthbusinesscomputerSoftwareclusteringJournal of Statistical Software
researchProduct

DySC: software for greedy clustering of 16S rRNA reads.

2012

Abstract Summary: Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the normalized mutual information (NMI) criterion show that DySC produces higher quality clusters than UCLUST and CD-HIT at a comparable runtime. Availability and implementation: DySC, implemented in C, is available at http://code.google.com/p/dysc/ under GNU GPL license. Contact:  bertil.schmidt@uni-mainz.de Sup…

Statistics and ProbabilityComputer sciencebusiness.industrySequence Analysis RNA16S ribosomal RNAcomputer.software_genreBiochemistryComputer Science ApplicationsComputational MathematicsSoftwareComputational Theory and MathematicsRNA Ribosomal 16SCluster AnalysisMetagenomeData miningCluster analysisbusinessMolecular BiologycomputerSoftwareBioinformatics (Oxford, England)
researchProduct

Recurrence Plots in Nonlinear Time Series Analysis: Free Software

2002

Recurrence plots are graphical devices specially suited to detect hidden dynamical patterns and nonlinearities in data. However, there are few programs available to apply such a mehodology. This paper reviews one of the best free programs to apply nonlinear time series analysis: Visual Recurrence Analysis (VRA). This program is targeted to recurrence analysis and the so-called Recurrence Quantitative Analysis (RQA, the quantitative counterpart of recurrence plots), although it includes many procedures in a friendly visual environment. Comparisons with alternative programs are performed.

Statistics and ProbabilityComputer sciencebusiness.industrycomputer.software_genreNonlinear time series analysisSoftwareQuantitative analysis (finance)StatisticsData miningStatistics Probability and Uncertaintybusinesslcsh:Statisticslcsh:HA1-4737computerSoftwareJournal of Statistical Software
researchProduct

Visualizing the flow of evidence in network meta-analysis and characterizing mixed treatment comparisons

2013

Network meta-analysis techniques allow for pooling evidence from different studies with only partially overlapping designs for getting a broader basis for decision support. The results are network-based effect estimates that take indirect evidence into account for all pairs of treatments. The results critically depend on homogeneity and consistency assumptions, which are sometimes difficult to investigate. To support such evaluation, we propose a display of the flow of evidence and introduce new measures that characterize the structure of a mixed treatment comparison. Specifically, a linear fixed effects model for network meta-analysis is considered, where the network estimates for two trea…

Statistics and ProbabilityDecision support systemEpidemiologyComputer scienceHomogeneity (statistics)PoolingLinear modelFixed effects modelDirected acyclic graphcomputer.software_genrePath lengthData miningLinear combinationcomputerStatistics in Medicine
researchProduct