Search results for "Theory"
showing 10 items of 24627 documents
Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM
2019
Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. We have tested STREAM on several synthetic and real datasets generated with different single-cell techno…
Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms
2018
Motivation: Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are also becoming very valuable for epigenomic and metagenomic studies. The kernel of those methods is the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands a resort to parallel and distributed computing. Indeed, algorithms of this type have been developed to collect k-mer statistics in…
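The k-mer statistics described in this abstract can be illustrated with a minimal single-machine sketch. It is not the Hadoop algorithm from the paper; the function name `count_kmers` and the choice to skip windows containing characters outside A, C, G, T are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count how many times each k-mer over {A,C,G,T} occurs in a sequence.

    Hypothetical helper: windows containing any other character (for
    example N) are skipped, which is an assumption, not the paper's rule.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    # Slide a window of width k over the sequence, one position at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


counts = count_kmers("ACGTACGT", 2)
```

A sequence of length n contains n - k + 1 windows, so this runs in O(n·k) time on one machine; the distributed setting in the abstract addresses collections too large for that.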
A DFT study on the chiral synthesis of R-phenylacetyl carbinol within the quantum chemical cluster approach
2017
The reaction pathway leading to R-phenylacetyl carbinol within the quantum chemical cluster approach is addressed by means of density functional theory (DFT) calculations. The study includes calculation of Fukui functions, activation free energies, and potential energy surface scans, in both the gas and solution phases. The protonation states of the nitrogen atoms of the pyrimidine moiety are determined. The reaction appears to be slightly exergonic (ΔG⁰ = −5.6 and −4.0 kcal/mol for the gas and solution phases, respectively), following a concerted synchronous mechanism with activation free energy barriers of 16.2 and 13.3 kcal/mol in the gas and solution phases, respectively.
FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications
2017
Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and because of the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…
The colored longest common prefix array computed via sequential scans
2018
Due to the increased availability of large datasets of biological sequences, the tools for sequence comparison now rely to a greater extent on efficient alignment-free approaches. Most alignment-free approaches require the computation of statistics of the sequences in the dataset. Such computations become impractical in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that allows several problems to be tackled efficiently with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-exter…
Alignment-free sequence comparison using absent words
2018
Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…
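The idea of comparing sequences by what does not occur in them can be sketched with a naive enumeration. This assumes the truncated definition above means a word that does not occur as a substring of the sequence. The names `absent_words` and `absent_word_difference` are hypothetical, the brute-force search is exponential in the word length, and the symmetric-difference comparison is only an illustrative stand-in, since the excerpt does not state the paper's actual measure.

```python
from itertools import product


def absent_words(sequence: str, length: int, alphabet: str = "ACGT") -> list[str]:
    """List all words of a given length over the alphabet that never occur
    as a substring of the sequence (naive enumeration, for illustration)."""
    # Collect every substring of the requested length that is present.
    present = {sequence[i:i + length] for i in range(len(sequence) - length + 1)}
    # Enumerate all candidate words in lexicographic order and keep the missing ones.
    candidates = ("".join(letters) for letters in product(alphabet, repeat=length))
    return [word for word in candidates if word not in present]


def absent_word_difference(x: str, y: str, length: int) -> int:
    """Hypothetical dissimilarity: the number of words of the given length
    that are absent from exactly one of the two sequences."""
    return len(set(absent_words(x, length)) ^ set(absent_words(y, length)))


missing = absent_words("ACGT", 2)
difference = absent_word_difference("ACGT", "ACGA", 2)
```

Here "ACGT" contains only the 2-mers AC, CG and GT, so the other 13 of the 16 possible words of length 2 are absent from it.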
Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.
2020
Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations including structural variations (SVs) is key to our understanding. Conventional short-reads whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from high false discovery rate (FDR). Linked-reads sequencing (10XWGS) utilizes long-range information by linkage of short-reads originating from the same large DNA molecule. This can mitigate alignment-based artefacts especially in repetitive regions and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, w…
On finite groups with many supersoluble subgroups
2017
The solubility of a finite group with fewer than 6 non-supersoluble subgroups is confirmed in the paper. Moreover, we prove that a finite insoluble group has exactly 6 non-supersoluble subgroups if and only if it is isomorphic to A5 or SL2(5). Furthermore, it is shown that a finite insoluble group has exactly 22 non-nilpotent subgroups if and only if it is isomorphic to A5 or SL2(5). This confirms a conjecture of Zarrin (Arch Math (Basel) 99:201–206, 2012).
Health/Nutrition food claims and low-fat food purchase: Projected personality influence in young consumers
2017
Health/nutrition food claims are increasingly used in the food industry, but firms still require deeper research to develop a better understanding of consumers in the low-fat food market. In pursuit of this goal, this paper analyses the influence of projected consumer personality on healthy claim credibility, perceived product health, and physical appearance, and its repercussions on attitudes (overall attitude to the product) and behaviours (purchase intention). With a sample of 300 young consumers (15–25 years old) and through PLS techniques, our results show that projected personality influences the credibility of claims about healthiness and physical appearance. Both concepts play a sig…
2016
We determine knotting probabilities and typical sizes of knots in double-stranded DNA for chains of up to half a million base pairs with computer simulations of a coarse-grained bead-stick model. Single trefoil knots and composite knots which include at least one trefoil as a prime factor are shown to be common in DNA chains exceeding 250,000 base pairs, assuming physiologically relevant salt conditions. The analysis is motivated by the emergence of DNA nanopore sequencing technology, as knots are a potential cause of erroneous nucleotide reads in nanopore sequencing devices and may severely limit read lengths in the foreseeable future. Even though our coarse-grained model is only based on …