Search results for "Theory"

showing 10 items of 24627 documents

Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM

2019

Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. We have tested STREAM on several synthetic and real datasets generated with different single-cell techno…

0301 basic medicine; Epigenomics; Multifactor Dimensionality Reduction; Computer science; General Physics and Astronomy; 02 engineering and technology; Omics data; Myoblasts; Mice; Single-cell analysis; GATA1 Transcription Factor; Myeloid Cells; Lymphocytes; lcsh:Science; Data processing; Multidisciplinary; Q; Gene Expression Regulation Developmental; RNA sequencing; Cell Differentiation; Genomics; 021001 nanoscience & nanotechnology; Data processing; DNA-Binding Proteins; Interferon Regulatory Factors; Single-Cell Analysis; 0210 nano-technology; Algorithms; Omics technologies; Signal Transduction; Lineage differentiation; Science; Computational biology; General Biochemistry Genetics and Molecular Biology; Article; 03 medical and health sciences; Erythroid Cells; Animals; Cell Lineage; General Chemistry; developmental trajectories visualization; Hematopoietic Stem Cells; Pipeline (software); Visualization; 030104 developmental biology; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; Cellular heterogeneity; Single cell analysi; lcsh:Q; Gene expression; Transcriptome; Transcription Factors; Nature Communications
researchProduct

Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms

2018

Motivation: Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are also becoming very valuable for epigenomic and metagenomic studies. The kernel of those methods is the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications calls for parallel and distributed computing. Indeed, algorithms of this type have been developed to collect k-mer statistics in…

0301 basic medicine; Epigenomics; genomic analysis; hadoop; distributed computing; Statistics and Probability; Computer science; Big data; Sequence assembly; Genome; Biochemistry; Domain (software engineering); Set (abstract data type); 03 medical and health sciences; distributed computing; Software; Computational Theory and Mathematic; Animals; Cluster Analysis; Humans; A-DNA; k-mer counting distributed computing hadoop map reduce; Molecular Biology; Epigenomics; Bacteria; business.industry; k-mer counting; Eukaryota; Linguistics; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; 030104 developmental biology; map reduce; Computational Theory and Mathematics; Distributed algorithm; genomic analysis; Kernel (statistics); Metagenome; hadoop; business; Algorithm; Algorithms; Software
researchProduct
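
The k-mer statistics mentioned in the abstract above can be illustrated on a single machine with a minimal sketch. This is not the paper's Hadoop algorithm; the function name `count_kmers` and the choice to skip windows containing characters outside {A,C,G,T} are assumptions made only for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count how many times each k-mer over {A,C,G,T} occurs in `sequence`.

    Hypothetical helper for illustration; windows containing any other
    character (e.g. N) are skipped, which is an assumption of this sketch.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    alphabet = set("ACGT")
    seq = sequence.upper()
    counts = Counter()
    # Slide a window of length k over the sequence, one position at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


counts = count_kmers("ACGTACGT", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

A sequence of length n yields at most n − k + 1 windows, so the scan is linear in n for fixed k; the difficulty the abstract points to is the volume of input data, not the per-sequence computation.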

A DFT study on the chiral synthesis of R-phenylacetyl carbinol within the quantum chemical cluster approach

2017

The reaction pathway leading to R-phenylacetyl carbinol within the quantum chemical cluster approach is addressed by means of density functional theory (DFT) calculations. The study includes calculation of Fukui functions, activation free energies, and potential energy surface scans, in both gas and solution phase. The protonation states of the nitrogen atoms of the pyrimidine moiety are determined. The reaction appears to be slightly exergonic (ΔG⁰ = −5.6 and −4.0 kcal/mol for gas and solution phase, respectively), following a concerted synchronous mechanism with activation free energy barriers of 16.2 and 13.3 kcal/mol in gas phase and solution phase, respectively.

0301 basic medicine; Exergonic reaction; 030102 biochemistry & molecular biology; Pyrimidine; Enantioselective synthesis; General Physics and Astronomy; Protonation; 03 medical and health sciences; chemistry.chemical_compound; 030104 developmental biology; chemistry; Computational chemistry; Potential energy surface; Cluster (physics); Moiety; Density functional theory; Physical and Theoretical Chemistry; Chemical Physics Letters
researchProduct

FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications

2017

Summary: MapReduce Hadoop bioinformatics applications require special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, developing these routines is not easy, both because of the diversity of these formats and because of the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…

0301 basic medicine; FASTQ format; Statistics and Probability; Computer science; Sequence analysis; media_common.quotation_subject; Information Storage and Retrieval; Bioinformatics; computer.software_genre; Genome; Biochemistry; Domain (software engineering); 03 medical and health sciences; Computational Theory and Mathematic; Humans; Genomic library; Quality (business); DNA sequencing; FASTQ; NGS; FASTQ; DNA sequencing; Molecular Biology; media_common; Gene Library; Sequence; Database; Settore INF/01 - Informatica; Genome Human; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; FASTQ; File format; Computer Science Applications; Statistics and Probability; Biochemistry; Molecular Biology; Computer Science Applications1707 Computer Vision and Pattern Recognition; Computational Theory and Mathematics; Computational Mathematics; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; NGS; Database Management Systems; computer
researchProduct

The colored longest common prefix array computed via sequential scans

2018

Due to the increased availability of large datasets of biological sequences, tools for sequence comparison now rely to a greater extent on efficient alignment-free approaches. Most alignment-free approaches require the computation of statistics of the sequences in the dataset. Such computations become impractical in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that allows several problems to be tackled efficiently with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-exter…

0301 basic medicine; FOS: Computer and information sciences; Alignment-free methods; Burrows–Wheeler transform; Computer science; Computation; Average common substring; 0206 medical engineering; Matching statistics; Scale (descriptive set theory); 02 engineering and technology; Theoretical Computer Science; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Burrows-wheeler transform; String (computer science); Computer Science (all); LCP array; Matching statistic; Data structure; Substring; 030104 developmental biology; Alignment-free methods; Average common substring; Burrows-wheeler transform; Longest common prefix; Matching statistics; Theoretical Computer Science; Computer Science (all); Pairwise comparison; Longest common prefix; Algorithm; 020602 bioinformatics; Alignment-free method
researchProduct

Alignment-free sequence comparison using absent words

2018

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…

0301 basic medicine; FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Sequence alignment; Information System; 0102 computer and information sciences; Circular word; Absent words; 01 natural sciences; Upper and lower bounds; Sequence comparison; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Absent word; Circular words; Mathematics; Sequence; Settore INF/01 - Informatica; Process (computing); q-gram; Computer Science Applications1707 Computer Vision and Pattern Recognition; q-grams; Composition (combinatorics); Computer Science Applications; 030104 developmental biology; Computational Theory and Mathematics; Forbidden words; 010201 computation theory & mathematics; Focus (optics); Forbidden word; Word (computer architecture); Information Systems; Integer (computer science)
researchProduct

Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.

2020

Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations, including structural variations (SVs), is key to our understanding. Conventional short-read whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from a high false discovery rate (FDR). Linked-read sequencing (10XWGS) utilizes long-range information by linkage of short reads originating from the same large DNA molecule. This can mitigate alignment-based artefacts, especially in repetitive regions, and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, w…

0301 basic medicine; False discovery rate; Computer science; Artificial Gene Amplification and Extension; Polymerase Chain Reaction; Database and Informatics Methods; Sequencing techniques; 0302 clinical medicine; Breast Tumors; Basic Cancer Research; Medicine and Health Sciences; DNA sequencing; Biology (General); Ecology; High-Throughput Nucleotide Sequencing; Genomics; DNA Neoplasm; 3. Good health; Identification (information); Oncology; Computational Theory and Mathematics; Modeling and Simulation; MCF-7 Cells; Female; Sequence Analysis; Research Article; Bioinformatics; QH301-705.5; Breast Neoplasms; Genomics; Computational biology; Research and Analysis Methods; Human Genomics; 03 medical and health sciences; Cellular and Molecular Neuroscience; Cancer Genomics; Genomic Medicine; Breast Cancer; Genetics; DNA Barcoding Taxonomic; Humans; Molecular Biology Techniques; Molecular Biology; Ecology Evolution Behavior and Systematics; Whole genome sequencing; Linkage (software); Whole Genome Sequencing; Genome Human; Dideoxy DNA sequencing; Genetic Diseases Inborn; Cancers and Neoplasms; Biology and Life Sciences; Computational Biology; Statistical model; Sequence Analysis DNA; Repetitive Regions; Logistic Models; 030104 developmental biology; Genomic Structural Variation; Human genome; Sequence Alignment; 030217 neurology & neurosurgery; PLoS Computational Biology
researchProduct

On finite groups with many supersoluble subgroups

2017

The solubility of a finite group with fewer than 6 non-supersoluble subgroups is confirmed in the paper. Moreover, we prove that a finite insoluble group has exactly 6 non-supersoluble subgroups if and only if it is isomorphic to A₅ or SL₂(5). Furthermore, it is shown that a finite insoluble group has exactly 22 non-nilpotent subgroups if and only if it is isomorphic to A₅ or SL₂(5). This confirms a conjecture of Zarrin (Arch Math (Basel) 99:201–206, 2012).

0301 basic medicine; Finite group; Conjecture; Soluble group; Group (mathematics); General Mathematics; 010102 general mathematics; Grups Teoria de; 01 natural sciences; Combinatorics; Mathematics::Group Theory; 03 medical and health sciences; 030104 developmental biology; Locally finite group; Supersoluble subgroup; 0101 mathematics; Finite group; Mathematics::Representation Theory; MATEMATICA APLICADA; Matemàtica; Mathematics
researchProduct

Health/Nutrition food claims and low-fat food purchase: Projected personality influence in young consumers

2017

Health/nutrition food claims are increasingly used in the food industry, but firms still require deeper research to develop a better understanding of consumers in the low-fat food market. In pursuit of this goal, this paper analyses the influence of projected consumer personality on healthy claim credibility, perceived product health and physical appearance, and its repercussions on attitudes (overall attitude to the product) and behaviours (purchase intention). With a sample of 300 young consumers (15–25 years old) and through PLS techniques, our results show that projected personality influences the credibility of claims about healthiness and physical appearance. Both concepts play a sig…

0301 basic medicine; Food industry; media_common.quotation_subject; Medicine (miscellaneous); Sample (statistics); Human physical appearance; Global attitude; Purchase intention; 03 medical and health sciences; 0502 economics and business; Credibility; Personality; TX341-641; Product (category theory); Food market; media_common; Young consumers; Product category; 030109 nutrition & dietetics; Nutrition and Dietetics; business.industry; Nutrition. Foods and food supply; Health/nutrition claims; 05 social sciences; Advertising; 050211 marketing; Projected personality; Low-fat food; business; Psychology; Food Science; Journal of Functional Foods
researchProduct

2016

We determine knotting probabilities and typical sizes of knots in double-stranded DNA for chains of up to half a million base pairs with computer simulations of a coarse-grained bead-stick model: Single trefoil knots and composite knots which include at least one trefoil as a prime factor are shown to be common in DNA chains exceeding 250,000 base pairs, assuming physiologically relevant salt conditions. The analysis is motivated by the emergence of DNA nanopore sequencing technology, as knots are a potential cause of erroneous nucleotide reads in nanopore sequencing devices and may severely limit read lengths in the foreseeable future. Even though our coarse-grained model is only based on …

0301 basic medicine; Gel electrophoresis of nucleic acids; Base pair; Monte Carlo method; Biology; Bioinformatics; 01 natural sciences; 03 medical and health sciences; Cellular and Molecular Neuroscience; chemistry.chemical_compound; stomatognathic system; 0103 physical sciences; Genetics; Statistical physics; 010306 general physics; Molecular Biology; Trefoil; Ecology Evolution Behavior and Systematics; Persistence length; Quantitative Biology::Biomolecules; Ecology; food and beverages; Mathematics::Geometric Topology; Nanopore; surgical procedures operative; 030104 developmental biology; Computational Theory and Mathematics; chemistry; Modeling and Simulation; Nanopore sequencing; DNA; PLOS Computational Biology
researchProduct