Search results for "genomics"

Showing 10 of 1255 documents

FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications

2017

MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…

0301 basic medicine; FASTQ format; Statistics and Probability; Computer science; Sequence analysis; media_common.quotation_subject; Information Storage and Retrieval; Bioinformatics; computer.software_genre; Genome; Biochemistry; Domain (software engineering); 03 medical and health sciences; Computational Theory and Mathematic; Humans; Genomic library; Quality (business); DNA sequencing; FASTQ; NGS; FASTQ; DNA sequencing; Molecular Biology; media_common; Gene Library; Sequence; Database; Settore INF/01 - Informatica; Genome Human; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; FASTQ; File format; Computer Science Applications; Statistics and Probability; Biochemistry; Molecular Biology; Computer Science Applications1707 Computer Vision and Pattern Recognition; Computational Theory and Mathematics; Computational Mathematics; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; NGS; Database Management Systems; computer

researchProduct

Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.

2020

Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations including structural variations (SVs) is key to our understanding. Conventional short-reads whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from high false discovery rate (FDR). Linked-reads sequencing (10XWGS) utilizes long-range information by linkage of short-reads originating from the same large DNA molecule. This can mitigate alignment-based artefacts especially in repetitive regions and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, w…

0301 basic medicine; False discovery rate; Computer science; Artificial Gene Amplification and Extension; Polymerase Chain Reaction; Database and Informatics Methods; Sequencing techniques; 0302 clinical medicine; Breast Tumors; Basic Cancer Research; Medicine and Health Sciences; DNA sequencing; Biology (General); Ecology; High-Throughput Nucleotide Sequencing; Genomics; DNA Neoplasm; 3. Good health; Identification (information); Oncology; Computational Theory and Mathematics; Modeling and Simulation; MCF-7 Cells; Female; Sequence Analysis; Research Article; Bioinformatics; QH301-705.5; Breast Neoplasms; Genomics; Computational biology; Research and Analysis Methods; Human Genomics; 03 medical and health sciences; Cellular and Molecular Neuroscience; Cancer Genomics; Genomic Medicine; Breast Cancer; Genetics; DNA Barcoding Taxonomic; Humans; Molecular Biology Techniques; Molecular Biology; Ecology Evolution Behavior and Systematics; Whole genome sequencing; Linkage (software); Whole Genome Sequencing; Genome Human; Dideoxy DNA sequencing; Genetic Diseases Inborn; Cancers and Neoplasms; Biology and Life Sciences; Computational Biology; Statistical model; Sequence Analysis DNA; Repetitive Regions; Logistic Models; 030104 developmental biology; Genomic Structural Variation; Human genome; Sequence Alignment; 030217 neurology & neurosurgery; PLoS Computational Biology

researchProduct

Shell palaeoproteomics: first application of peptide mass fingerprinting for the rapid identification of mollusc shells in archaeology.

2020

Molluscs were one of the most widely-used natural resources in the past, and their shells are abundant among archaeological findings. However, our knowledge of the variety of shells that were circulating in prehistoric times (and thus their socio-economic and cultural value) is scarce due to the difficulty of achieving taxonomic determination of fragmented and/or worked remains. This study aims to obtain molecular barcodes based on peptide mass fingerprints (PMFs) of intracrystalline proteins, in order to obtain shell identification. Palaeoproteomic applications on shells are challenging, due to low concentration of molluscan proteins and an incomplete unde…

0301 basic medicine; Freshwater bivalve; [SHS.ARCHEO]Humanities and Social Sciences/Archaeology and Prehistory; Biophysics; Shell (structure); Biology; Biochemistry; Peptide Mapping; 03 medical and health sciences; Peptide mass fingerprinting; Animal Shells; [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Mollusc shell; Mollusc shell; Animals; Peptide mass fingerprint; Peptide-mass fingerprint; Phylogeny; Shellomics; 030102 biochemistry & molecular biology; Phylogenetic tree; MALDI-TOF mass spectrometry; Mollusc shell; Palaeoproteomics; Peptide mass fingerprint; Shellomics; MALDI-TOF mass spectrometry; Palaeoproteomics; Archaeology; Bivalvia; 030104 developmental biology; Taxon; Archaeology; Identification (biology); Peptides

researchProduct

Use of deep learning methods to translate drug-induced gene expression changes from rat to human primary hepatocytes

2020

In clinical trials, animal and cell line models are often used to evaluate the potential toxic effects of a novel compound or candidate drug before progressing to human trials. However, relating the results of animal and in vitro model exposures to relevant clinical outcomes in the human in vivo system still proves challenging, relying on often putative orthologs. In recent years, multiple studies have demonstrated that the repeated dose rodent bioassay, the current gold standard in the field, lacks sufficient sensitivity and specificity in predicting toxic effects of pharmaceuticals in humans. In this study, we evaluate the potential of deep learning techniques to translate the pattern of …

0301 basic medicine; Gene Expression; Gene Expression Regulation/drug effects; Pathology and Laboratory Medicine; Convolutional neural network; TOXICITY; Machine Learning; Voeding Metabolisme en Genomica; Time Measurement; 0302 clinical medicine; Gene expression; Medicine and Health Sciences; Measurement; Clinical Trials as Topic; Multidisciplinary; Artificial neural network; Pharmaceutics; Q; R; Metabolism and Genomics; TOXICOGENOMICS; 030220 oncology & carcinogenesis; Metabolisme en Genomica; Medicine; Engineering and Technology; Nutrition Metabolism and Genomics; Hepatocytes/drug effects; Algorithms; Research Article; Computer and Information Sciences; Clinical Trials as Topic/statistics & numerical data; Neural Networks; Genetic Toxicology; TOXICOLOGY; Science; Predictive Toxicology; Computational biology; Biology; Computer; 03 medical and health sciences; Dose Prediction Methods; Deep Learning; Voeding; Artificial Intelligence; In vivo; Genetics; Life Science; Animals; Humans; Gene; Nutrition; business.industry; Deep learning; Biology and Life Sciences; Gold standard (test); REPRESENTATIONS; Rats; 030104 developmental biology; Gene Expression Regulation; Hepatocytes; Artificial intelligence; Neural Networks Computer; Toxicogenomics; business; Neuroscience

researchProduct

"Islands of divergence" in the Atlantic cod genome represent polymorphic chromosomal rearrangements

2016

In several species genetic differentiation across environmental gradients or between geographically separate populations has been reported to center at “genomic islands of divergence,” resulting in heterogeneous differentiation patterns across genomes. Here, genomic regions of elevated divergence were observed on three chromosomes of the highly mobile fish Atlantic cod (Gadus morhua) within geographically fine-scaled coastal areas. The “genomic islands” extended at least 5, 9.5, and 13 megabases on linkage groups 2, 7, and 12, respectively, and coincided with large blocks of linkage disequilibrium. For each of these three chromosomes, pairs of segregating, highly divergent alleles were id…

0301 basic medicine; Gene Flow; Linkage disequilibrium; population genomics; Genome; Polymorphism Single Nucleotide; Chromosomes; Linkage Disequilibrium; Divergence; Gene flow; Population genomics; 03 medical and health sciences; ecological adaptation; VDP::Genetikk og genomikk: 474; VDP::Genetics and genomics: 474; Genetics; Gadus; Animals; Allele; :Genetikk og genomikk: 474 [VDP]; Ecology Evolution Behavior and Systematics; chromosomal rearrangements; Chromosomal inversion; Genetics; marine organisms; Genome; biology; structural polymorphisms; biology.organism_classification; Adaptation Physiological; 030104 developmental biology; Gadus morhua; Chromosome Inversion; Metagenomics; :Genetics and genomics: 474 [VDP]; Research Article

researchProduct

MiasDB: A Database of Molecular Interactions Associated with Alternative Splicing of Human Pre-mRNAs.

2016

Alternative splicing (AS) is pervasive in human multi-exon genes and is a major contributor to expansion of the transcriptome and proteome diversity. The accurate recognition of alternative splice sites is regulated by information contained in networks of protein-protein and protein-RNA interactions. However, the mechanisms leading to splice site selection are not fully understood. Although numerous databases have been built to describe AS, molecular interaction databases associated with AS have only recently emerged. In this study, we present a new database, MiasDB, that provides a description of molecular interactions associated with human AS events. This database covers 938 interactions …

0301 basic medicine; Gene regulatory network; lcsh:Medicine; RNA-binding protein; RNA-binding proteins; computer.software_genre; Biochemistry; Histones; Exon; Database and Informatics Methods; Databases Genetic; Protein Interaction Mapping; RNA Precursors; Gene Regulatory Networks; Database Searching; lcsh:Science; Multidisciplinary; Database; Exons; Genomics; Genomic Databases; Nucleic acids; RNA splicing; Proteome; Sequence Analysis; Research Article; Sequence Databases; Biology; Response Elements; Research and Analysis Methods; Genome Complexity; 03 medical and health sciences; Genetics; Humans; Molecular Biology Techniques; Sequencing Techniques; Protein Interactions; Gene; Molecular Biology; Internet; lcsh:R; Alternative splicing; Intron; Biology and Life Sciences; Computational Biology; Proteins; Genome Analysis; Introns; Alternative Splicing; 030104 developmental biology; Biological Databases; RNA processing; RNA; lcsh:Q; RNA Splice Sites; Gene expression; computer; Protein Kinases; Transcription Factors; PloS one

researchProduct

Measuring the clustering effect of BWT via RLE

2017

The Burrows–Wheeler Transform (BWT) is a reversible transformation on which several text compressors and many other tools used in Bioinformatics and Computational Biology are based. The BWT is not actually a compressor, but a transformation that performs a context-dependent permutation of the letters of the input text that often creates runs of equal letters (clusters) longer than the ones in the original text, usually referred to as the “clustering effect” of BWT. In particular, from a combinatorial point of view, great attention has been given to the case in which the BWT produces the fewest number of clusters (cf. [5], [16], [21], [23]). In this paper we are concerned about t…

0301 basic medicine; General Computer Science; Permutation; Computer Science (all); Binary number; 0102 computer and information sciences; Quantitative Biology::Genomics; 01 natural sciences; Upper and lower bounds; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Permutation; 030104 developmental biology; Transformation (function); BWT; 010201 computation theory & mathematics; Run-length encoding; Computer Science::Data Structures and Algorithms; Cluster analysis; Primitive root modulo n; BWT; Permutation; Run-length encoding; Theoretical Computer Science; Computer Science (all); Word (computer architecture); Run-length encoding; Mathematics

researchProduct

Genomics of speciation and introgression in Princess cichlid fishes from Lake Tanganyika.

2016

How variation in the genome translates into biological diversity and new species originate has endured as the mystery of mysteries in evolutionary biology. African cichlid fishes are prime model systems to address speciation-related questions for their remarkable taxonomic and phenotypic diversity, and the possible role of gene flow in this process. Here, we capitalize on genome sequencing and phylogenomic analyses to address the relative impacts of incomplete lineage sorting, introgression and hybrid speciation in the Neolamprologus savoryi-complex (the 'Princess cichlids') from Lake Tanganyika. We present a time-calibrated species tree based on whole-genome sequences and provide strong ev…

0301 basic medicine; Genetic Speciation; Introgression; Genomics; Biology; Tanzania; Nucleotide diversity; Coalescent theory; 03 medical and health sciences; Cichlid; Genetics; Animals; Ecology Evolution Behavior and Systematics; Genetics; Genome; Cichlids; Genomics; biology.organism_classification; Lakes; 030104 developmental biology; Genetic Speciation; Phenotype; Evolutionary biology; Hybrid speciation; Neolamprologus; human activities; Molecular ecology

researchProduct

Epigenetic regulation of DNA repair genes and implications for tumor therapy

2017

DNA repair represents the first barrier against genotoxic stress causing metabolic changes, inflammation and cancer. Besides its role in preventing cancer, DNA repair needs also to be considered during cancer treatment with radiation and DNA damaging drugs as it impacts therapy outcome. The DNA repair capacity is mainly governed by the expression level of repair genes. Alterations in the expression of repair genes can occur due to mutations in their coding or promoter region, changes in the expression of transcription factors activating or repressing these genes, and/or epigenetic factors changing histone modifications and CpG promoter methylation or demethylation levels. In this review we …

0301 basic medicine; Genetics; DNA Repair; DNA repair; Health Toxicology and Mutagenesis; DNA Methylation; Biology; Epigenesis Genetic; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Epigenetics of physical exercise; Neoplasms; 030220 oncology & carcinogenesis; DNA Repair Protein; DNA methylation; Genetics; Cancer research; Animals; Humans; CpG Islands; DNA mismatch repair; Epigenetics; Cancer epigenetics; Epigenomics; Mutation Research/Reviews in Mutation Research

researchProduct

Evaluation of the RYR1 gene genetic diversity in the Latvian White pig breed

2016

The ryanodine receptor 1 (RYR1) is a calcium ion channel in the sarcoplasmic reticulum of skeletal muscle. Multiple polymorphic loci have been identified in the RYR1 gene in human and animals and some of them are associated with certain phenotypes. However, there are still few data on the RYR1 genetic variability in pig and only the missense mutation Arg615Cys, associated with the malignant hyperthermia, porcine stress syndrome and meat quality, has been studied in several commercial and local breeds. Aim. To genotype the rs344435545 (C1972T, Arg615Cys), rs196953058 (T8434C, Phe2769Leu) and rs323041392 (G12484A, Asp4119Asn) in the Latvian local pig breed Latvian White and to evaluate the ev…

0301 basic medicine; Genetics; pig; Genetic diversity; Animal breeding; biology; QH301-705.5; genetic diversity; QH426-470; biology.organism_classification; General Biochemistry Genetics and Molecular Biology; Breed; polymorphism; 03 medical and health sciences; 030104 developmental biology; Genetic variation; Genotype; RYR1; Genetics; Genomics Transcriptomics and Proteomics; Restriction fragment length polymorphism; Allele; Biology (General); Latvian White pig; Biopolymers and Cell

researchProduct