Search results for "Software"
showing 10 items of 7396 documents
Sequential Monte Carlo methods in Bayesian joint models for longitudinal and time-to-event data
2020
The statistical analysis of the information generated by medical follow-up is a very important challenge in the field of personalized medicine. As the evolutionary course of a patient's disease progresses, his/her medical follow-up generates more and more information that should be processed immediately in order to review and update his/her prognosis and treatment. Hence, we focus on this update process through sequential inference methods for joint models of longitudinal and time-to-event data from a Bayesian perspective. More specifically, we propose the use of sequential Monte Carlo (SMC) methods for static parameter joint models with the intention of reducing computational time in each…
DySC: software for greedy clustering of 16S rRNA reads.
2012
Abstract Summary: Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the normalized mutual information (NMI) criterion show that DySC produces higher quality clusters than UCLUST and CD-HIT at a comparable runtime. Availability and implementation: DySC, implemented in C, is available at http://code.google.com/p/dysc/ under GNU GPL license. Contact: bertil.schmidt@uni-mainz.de Sup…
Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data
2012
Abstract Motivation: The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools either score high in terms of recall or precision but not consistently high in terms of both measures. Results: In this article, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We use the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-s…
Recurrence Plots in Nonlinear Time Series Analysis: Free Software
2002
Recurrence plots are graphical devices specially suited to detect hidden dynamical patterns and nonlinearities in data. However, there are few programs available to apply such a methodology. This paper reviews one of the best free programs to apply nonlinear time series analysis: Visual Recurrence Analysis (VRA). This program is targeted to recurrence analysis and the so-called Recurrence Quantitative Analysis (RQA, the quantitative counterpart of recurrence plots), although it includes many procedures in a friendly visual environment. Comparisons with alternative programs are performed.
Testing Goodness-of-Fit with the Kernel Density Estimator: GoFKernel
2015
To assess the goodness-of-fit of a sample to a continuous random distribution, the most popular approach has been based on measuring, using either L∞- or L2-norms, the distance between the null hypothesis cumulative distribution function and the empirical cumulative distribution function. Indeed, as far as I know, almost all the tests currently available in R related to this issue (ks.test in package stats, ad.test in package ADGofTest, and ad.test, ad2.test, ks.test, v.test and w2.test in package truncgof) use one of these two distances on cumulative distribution functions. This paper (i) proposes dgeometric.test, a new implementation of the test that measures the discrepancy between a s…
Spanish electoral archive. SEA database
2021
This paper introduces the SEA database (acronym for Spanish Electoral Archive). SEA brings together the most complete public repository available to date on Spanish election outcomes. SEA holds all the results recorded from the electoral processes of General (1979–2019), Regional (1989–2021), Local (1979–2019) and European Parliamentary (1987–2019) elections held in Spain since the restoration of democracy in the late 70s, in addition to other data sets with electoral content. The data are offered for free and are presented in a homogeneous and friendly format. Most of the databases are available for download with data from various electoral levels, including from the ballot box level. This…
A database for the monitoring of thermal anomalies over the Amazon forest and adjacent intertropical oceans
2015
Abstract Advances in information technologies and accessibility to climate and satellite data in recent years have favored the development of web-based tools with user-friendly interfaces in order to facilitate the dissemination of geo/biophysical products. These products are useful for the analysis of the impact of global warming over different biomes. In particular, the study of the Amazon forest responses to drought has recently received attention from the scientific community due to the occurrence of two extreme droughts and sustained warming over the last decade. Thermal Amazoni@ is a web-based platform for the visualization and download of surface thermal anomalies products over the Ama…
Galaxy LIMS for next-generation sequencing.
2013
Abstract Summary: We have developed a laboratory information management system (LIMS) for a next-generation sequencing (NGS) laboratory within the existing Galaxy platform. The system provides lab technicians standard and customizable sample information forms, barcoded submission forms, tracking of input sample quality, multiplex-capable automatic flow cell design and automatically generated sample sheets to aid physical flow cell preparation. In addition, the platform provides the researcher with a user-friendly interface to create a request, submit accompanying samples, upload sample quality measurements and access to the sequencing results. As the LIMS is within the Galaxy platform, the …
Textual data compression in computational biology: a synopsis.
2009
Abstract Motivation: Textual data compression, and the associated techniques coming from information theory, are often perceived as being of interest for data communication and storage. However, they are also deeply related to classification and data mining and analysis. In recent years, a substantial effort has been made for the application of textual data compression techniques to various computational biology tasks, ranging from storage and indexing of large datasets to comparison and reverse engineering of biological networks. Results: The main focus of this review is on a systematic presentation of the key areas of bioinformatics and computational biology where compression has been use…
Visualizing the flow of evidence in network meta-analysis and characterizing mixed treatment comparisons
2013
Network meta-analysis techniques allow for pooling evidence from different studies with only partially overlapping designs for getting a broader basis for decision support. The results are network-based effect estimates that take indirect evidence into account for all pairs of treatments. The results critically depend on homogeneity and consistency assumptions, which are sometimes difficult to investigate. To support such evaluation, we propose a display of the flow of evidence and introduce new measures that characterize the structure of a mixed treatment comparison. Specifically, a linear fixed effects model for network meta-analysis is considered, where the network estimates for two trea…