Search results for "Algorithms"
showing 10 items of 1716 documents
Distinctive Histogenesis and Immunological Microenvironment Based on Transcriptional Profiles of Follicular Dendritic Cell Sarcomas
2017
Follicular dendritic cell (FDC) sarcomas are rare mesenchymal tumors with variable clinical, morphologic, and phenotypic characteristics. Transcriptome analysis was performed on multiple FDC sarcomas and compared with other mesenchymal tumors, microdissected Castleman FDCs, and normal fibroblasts. Using unsupervised analysis, FDC sarcomas clustered with microdissected FDCs, distinct from other mesenchymal tumors and fibroblasts. The specific endowment of FDC-related gene expression programs in FDC sarcomas emerged by applying a gene signature of differentially expressed genes (n = 1,289) between microdissected FDCs and fibroblasts. Supervised analysis comparing FDC sarcomas with mi…
Recentrifuge: Robust comparative analysis and contamination removal for metagenomics
2017
Metagenomic sequencing is becoming widespread in biomedical and environmental research, and the pace is increasing even more thanks to nanopore sequencing. With a rising number of samples and data per sample, the challenge of efficiently comparing results within a specimen and between specimens arises. Reagents, laboratory, and host related contaminants complicate such analysis. Contamination is particularly critical in low microbial biomass body sites and environments, where it can comprise most of a sample if not all. Recentrifuge implements a robust method for the removal of negative-control and crossover taxa from the rest of samples. With Recentrifuge, researchers can analyze results f…
Efficient Algorithms for Sequence Analysis with Entropic Profiles
2017
Entropy, being closely related to repetitiveness and compressibility, is a widely used information-related measure to assess the degree of predictability of a sequence. Entropic profiles are based on information theory principles, and can be used to study the under-/over-representation of subwords, by also providing information about the scale of conserved DNA regions. Here, we focus on the algorithmic aspects related to entropic profiles. In particular, we propose linear time algorithms for their computation that rely on suffix-based data structures, more specifically on the truncated suffix tree (TST) and on the enhanced suffix array (ESA). We performed an extensive experimental campaign …
Next-generation sequencing: big data meets high performance computing
2017
The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and t…
miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal
2017
Hundreds of bioinformatics tools have been developed for microRNA (miRNA) investigations including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and …
SpCLUST: Towards a fast and reliable clustering for potentially divergent biological sequences
2019
This paper presents SpCLUST, a new C++ package that takes a list of sequences as input, aligns them with MUSCLE, computes their similarity matrix in parallel and then performs the clustering. SpCLUST extends previously released software by integrating additional scoring matrices, which enables it to cover the clustering of amino-acid sequences. The similarity matrix is now computed in parallel according to the master/slave distributed architecture, using MPI. Performance analysis, conducted on two real datasets of 100 nucleotide sequences and 1049 amino-acid sequences, shows that the resulting library substantially outperforms the original Python package. The proposed pac…
Reactome pathway analysis: a high-performance in-memory approach
2016
Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; one of the main problems, from the point of view of analysis method performance, is the constantly increasing size of the data samples. Here, we present a new high-performance in-memory implementation of the well-established over-representation analysis method. To achieve the target, the over-representation analysis method is divided into four different steps and, for each of them, specific data st…
A framework for data-driven adaptive GUI generation based on DICOM
2018
Computer applications for diagnostic medical imaging generally provide a wide range of tools to support physicians in their daily diagnostic activities. Unfortunately, some functionalities are specialized for specific diseases or imaging modalities, while others are useless for the images under investigation. Nevertheless, the corresponding Graphical User Interface (GUI) widgets are still present on the screen, reducing the image visualization area. As a consequence, the physician may be affected by cognitive overload and visual stress, causing a degradation of performance, mainly due to unneeded widgets. In clinical environments, a GUI must represent a sequence of steps for image investi…
Comparison between iMSD and 2D-pCF analysis for molecular motion studies on in vivo cells: The case of the epidermal growth factor receptor.
2018
Image correlation analysis has evolved into a valuable method for analyzing the diffusional motion of molecules at every point of a live cell. Here we compare the iMSD and the 2D-pCF approaches, which provide complementary information. The iMSD method provides the law of diffusion, and it requires spatial averaging over a small region of the cell. The 2D-pCF does not require spatial averaging, and it gives information about obstacles to diffusion at pixel resolution. We show the analysis of the same set of data by the two methods to emphasize that both methods could be needed for a comprehensive understanding of the molecular diffusional flow in a live cell.
Measuring spectrally-resolved information transfer.
2020
Information transfer, measured by transfer entropy, is a key component of distributed computation. It is therefore important to understand the pattern of information transfer in order to unravel the distributed computational algorithms of a system. Since in many natural systems distributed computation is thought to rely on rhythmic processes, a frequency-resolved measure of information transfer is highly desirable. Here, we present a novel algorithm, and its efficient implementation, to separately identify the frequencies sending and receiving information in a network. Our approach relies on the invertible maximum overlap discrete wavelet transform (MODWT) for the creation of surrogate data in t…