Search results for "Database"
showing 10 items of 2136 documents
A parallel and sensitive software tool for methylation analysis on multicore platforms.
2015
Abstract Motivation: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently with either the size of the dataset or the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. Results: We present a new software tool, called HPG-Methyl, which efficiently maps bis…
Global stability of protein folding from an empirical free energy function
2013
The principles governing protein folding stand as one of the biggest challenges in biophysics. Modeling the global stability of proteins and predicting their tertiary structure are hard tasks, due in part to the variety and large number of forces involved and the difficulty of describing them with sufficient accuracy. We have developed a fast, physics-based empirical potential, intended to be used in global structure prediction methods. This model considers four main contributions: two entropic factors, the hydrophobic effect and configurational entropy, and two terms resulting from a decomposition of close-packing interactions, namely the balance of the dispersive interactions of folded an…
Bayesian hierarchical models in manufacturing bulk service queues
2006
In this paper, queueing theory and Bayesian statistical tools are used to analyze the congestion of various manufacturing bulk service queues with the same characteristics that work independently of one another and are in equilibrium. Hierarchical models are discussed in order to develop the whole inferential process for the parameters governing the system. Markov Chain Monte Carlo methods and numerical inversion of transforms are used to compute the posterior predictive distributions of the performance measures commonly used in practice.
The relation between theory and application in statistics
1995
General comments on the relation between theory and application in statistics are made, and emphasis is placed on issues and principles of model formulation. Three examples are described in outline. Criteria for the choice of models are discussed.
Probabilistic small area risk assessment using GIS-based data: a case study on Finnish childhood diabetes
2000
A Bayesian hierarchical spatial model is constructed to describe the regional incidence of insulin-dependent diabetes mellitus (IDDM) among the under 15-year-olds in Finland. The model exploits aggregated pixel-wise locations for both the cases and the population at risk. Typically such data arise from combining geographic information systems (GIS) with large databases. The dates of diagnosis and locations of the cases are observed from 1987 to 1996. The population-at-risk counts are available for every second year during the same period. A hierarchical model is suggested for the pixel-wise case counts, including a population model to account for the uncertainty of the population at risk ov…
Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences
2015
Abstract Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date is characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. Results: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are inc…
Design-based estimation for geometric quantiles with application to outlier detection
2010
Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…
RNA-Seq Atlas—a reference database for gene expression profiling in normal tissue by next-generation sequencing
2012
Abstract Motivation: Next-generation sequencing technology enables an entirely new perspective for clinical research and will speed up personalized medicine. In contrast to microarray-based approaches, RNA-Seq analysis provides a much more comprehensive and unbiased view of gene expression. Although the perspective is clear and the long-term success of this new technology obvious, bioinformatics resources that make these data easily available, especially to the biomedical research community, are still evolving. Results: We have generated RNA-Seq Atlas, a web-based repository of RNA-Seq gene expression profiles and query tools. The website offers open and easy access to RNA-Seq gene expression pr…
Bayesian Design of “Successful” Replications
2002
Replication of experiments is common in applied research. However, systematic studies of the goals and motivations of a “replication” are rare. As a consequence, there does not seem to be a precise notion of what “success” means when replicating. This article discusses some of the possible goals for replication; this leads to different (but precise) notions of “success” when replicating. Bayesian hierarchical models allow for a flexible and explicit incorporation of the assumed relationship among the experiments. Bayesian predictive distributions are a natural tool to compute the probability of the replication being successful, and hence to design the replication so that the probability of…
ARC: A computerized system for urban garbage collection
1993
In this paper we present ARC, a computerized system developed for urban garbage collection. The package is intended to help planners design efficient collection routes and to facilitate the study and evaluation of alternatives concerning issues such as the type and number of vehicles, the frequency of collection, and the type and location of refuse containers. The final product is a “user friendly” system designed to be used by planners without outside assistance.