Search results for "Database"

showing 10 items of 2136 documents

A parallel and sensitive software tool for methylation analysis on multicore platforms.

2015

Abstract Motivation: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently with either the size of the dataset or the length of the reads to be analyzed. As sequencers are expected to provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. Results: We present a new software tool, called HPG-Methyl, which efficiently maps bis…

Statistics and Probability; Mutation rate; Time Factors; Computer science; Real-time computing; Bisulfite sequencing; Molecular Sequence Data; Genomics; Parallel computing; computer.software_genre; medicine.disease_cause; Biochemistry; Genome; Bottleneck; chemistry.chemical_compound; Software; Mutation Rate; Databases Genetic; medicine; Humans; Sulfites; Molecular Biology; Mutation; Multi-core processor; Genome; Base Sequence; business.industry; High-Throughput Nucleotide Sequencing; Methylation; Genomics; DNA Methylation; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; chemistry; DNA methylation; Scalability; Mutation; Compiler; business; computer; Sequence Analysis; DNA; Algorithms; Software; Bioinformatics (Oxford, England)


Global stability of protein folding from an empirical free energy function

2013

The principles governing protein folding stand as one of the biggest challenges of Biophysics. Modeling the global stability of proteins and predicting their tertiary structure are hard tasks, due in part to the variety and large number of forces involved and the difficulty of describing them with sufficient accuracy. We have developed a fast, physics-based empirical potential, intended to be used in global structure prediction methods. This model considers four main contributions: two entropic factors, the hydrophobic effect and configurational entropy, and two terms resulting from a decomposition of close-packing interactions, namely the balance of the dispersive interactions of folded an…

Statistics and Probability; Protein Folding; Empirical potential for proteins; Configuration entropy; PROTCAL; Bioinformatics; General Biochemistry Genetics and Molecular Biology; Force field (chemistry); Protein structure; Statistical physics; Databases Protein; Quantitative Biology::Biomolecules; Models Statistical; FoldX; General Immunology and Microbiology; Applied Mathematics; Proteins; Reproducibility of Results; General Medicine; Protein tertiary structure; Protein Structure Tertiary; Prediction of protein folding stability; Modeling and Simulation; Linear Models; Thermodynamics; Protein folding; General Agricultural and Biological Sciences; Statistical potential; Algorithms; Software; Test data; Journal of Theoretical Biology


Bayesian hierarchical models in manufacturing bulk service queues

2006

In this paper, Queueing Theory and Bayesian statistical tools are used to analyze the congestion of various manufacturing bulk service queues with the same characteristics that are working independently of one another and in equilibrium. Hierarchical models are discussed in order to develop the whole inferential process for the parameters governing the system. Markov Chain Monte Carlo methods and numerical inversion of transforms are addressed to compute the posterior predictive distributions of the usual measures of performance in practice.

Statistics and Probability; Queueing theory; Mathematical optimization; Applied Mathematics; Bayesian probability; Posterior probability; Inversion (meteorology); Markov chain Monte Carlo; Hierarchical database model; symbols.namesake; symbols; Econometrics; Statistics Probability and Uncertainty; Queue; Mcmc algorithm; Mathematics; Journal of Statistical Planning and Inference


The relation between theory and application in statistics

1995

General comments on the relation between theory and application in statistics are made and emphasis placed on issues and principles of model formulation. Three examples are described in outline. Criteria for the choice of models are discussed.

Statistics and Probability; Relation (database); Statistics; Statistics Probability and Uncertainty; Mathematics; Test


Probabilistic small area risk assessment using GIS-based data: a case study on Finnish childhood diabetes

2000

A Bayesian hierarchical spatial model is constructed to describe the regional incidence of insulin-dependent diabetes mellitus (IDDM) among the under 15-year-olds in Finland. The model exploits aggregated pixel-wise locations for both the cases and the population at risk. Typically such data arise from combining geographic information systems (GIS) with large databases. The dates of diagnosis and locations of the cases are observed from 1987 to 1996. The population-at-risk counts are available for every second year during the same period. A hierarchical model is suggested for the pixel-wise case counts, including a population model to account for the uncertainty of the population at risk ov…

Statistics and Probability; Risk analysis; education.field_of_study; Geographic information system; Epidemiology; business.industry; Bayesian probability; Population; Statistical model; Hierarchical database model; 3. Good health; Geography; Population model; Risk assessment; education; business; Cartography; Demography; Statistics in Medicine


Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences

2015

Abstract Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. Results: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are inc…

Statistics and Probability; Similarity (geometry); Computer science; Sequence analysis; Antimicrobial peptides; Peptaibol; Peptide; computer.software_genre; Procedures; Biochemistry; Set (abstract data type); chemistry.chemical_compound; Protein methods; Sequence Analysis Protein; Redundancy (engineering); Humans; Databases Protein; Molecular Biology; Antimicrobial cationic peptides; chemistry.chemical_classification; Sequence; Antimicrobial cationic peptide; Database; Sequence database; Sequence analysis; Computer Science Applications; Algorithm; Computational Mathematics; Chemistry; Protein database; Computational Theory and Mathematics; chemistry; Data mining; Nucleic acid database; Databases Nucleic Acid; computer; Software; Algorithms; Human


Design-based estimation for geometric quantiles with application to outlier detection

2010

Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…

Statistics and Probability; Statistics::Theory; TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES; Statistics::Applications; ComputingMethodologies_SIMULATIONANDMODELING; Applied Mathematics; MathematicsofComputing_NUMERICALANALYSIS; Univariate; InformationSystems_DATABASEMANAGEMENT; Estimator; Statistics::Computation; Quantile regression; Horvitz–Thompson estimator; Computational Mathematics; Delta method; Computational Theory and Mathematics; TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY; Outlier; Consistent estimator; Statistics; Statistics::Methodology; Mathematics; Quantile; Computational Statistics & Data Analysis


RNA-Seq Atlas—a reference database for gene expression profiling in normal tissue by next-generation sequencing

2012

Abstract Motivation: Next-generation sequencing technology enables an entirely new perspective for clinical research and will speed up personalized medicine. In contrast to microarray-based approaches, RNA-Seq analysis provides a much more comprehensive and unbiased view of gene expression. Although the perspective is clear and the long-term success of this new technology obvious, bioinformatics resources making these data easily available especially to the biomedical research community are still evolving. Results: We have generated RNA-Seq Atlas, a web-based repository of RNA-Seq gene expression profiles and query tools. The website offers open and easy access to RNA-Seq gene expression pr…

Statistics and Probability; Systems biology; RNA-Seq; Computational biology; Biology; computer.software_genre; Biochemistry; Neoplasms; Gene expression; Humans; Microarray databases; Molecular Biology; Gene; Oligonucleotide Array Sequence Analysis; Internet; Sequence Analysis RNA; business.industry; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Computer Science Applications; Gene expression profiling; Computational Mathematics; Computational Theory and Mathematics; Gene chip analysis; Data mining; Personalized medicine; Databases Nucleic Acid; business; computer; Software; Bioinformatics


Bayesian Design of “Successful” Replications

2002

Replication of experiments is common in applied research. However, systematic studies of the goals and motivations of a “replication” are rare. As a consequence, there does not seem to be a precise notion of what “success” means when replicating. This article discusses some of the possible goals for replication; this leads to different (but precise) notions of “success” when replicating. Bayesian hierarchical models allow for a flexible and explicit incorporation of the assumed relationship among the experiments. Bayesian predictive distributions are a natural tool to compute the probability of the replication being successful, and hence to design the replication so that the probability of…

Statistics and Probability; Theoretical computer science; General Mathematics; Bayesian probability; Hierarchical database model; Bayesian design; Probability of success; Noncentral t-distribution; Replication (statistics); Applied research; Statistics Probability and Uncertainty; Algorithm; Mathematics; Statistical hypothesis testing; The American Statistician


ARC A computerized system for urban garbage collection

1993

In this paper we present ARC, a computerized system developed for urban garbage collection. The package is intended to help planners design efficient collection routes and to facilitate the study and evaluation of alternatives concerning issues such as the type and number of vehicles, the frequency of collection, and the type and location of refuse containers. The final product is a “user friendly” system designed to be used by the planners without outside assistance.

Statistics and Probability; User Friendly; Information Systems and Management; Database; Computer science; business.industry; Final product; Management Science and Operations Research; computer.software_genre; Arc (geometry); Modeling and Simulation; Embedded system; Vehicle routing problem; Discrete Mathematics and Combinatorics; Computerized system; Heuristics; business; computer; Garbage collection; Top
