Search results for "DATA MINING"

showing 10 items of 907 documents

Feasibility of sample size calculation for RNA-seq studies

2017

Sample size calculation is a crucial step in study design but is not yet fully established for RNA sequencing (RNA-seq) analyses. To evaluate feasibility and provide guidance, we evaluated RNA-seq sample size tools identified from a systematic search. The focus was on whether real pilot data would be needed for reliable results and on identifying tools that would perform well in scenarios with different levels of biological heterogeneity and fold changes (FCs) between conditions. We used simulations based on real data for tool evaluation. In all settings, the six evaluated tools provided widely different answers, which were strongly affected by FC. Although all tools failed for small FCs, s…

0301 basic medicine; Fold (higher-order function); Sequence Analysis RNA; Computer science; High-Throughput Nucleotide Sequencing; RNA-Seq; computer.software_genre; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Research Design; Sample size determination; Sample Size; Feasibility Studies; Humans; Data mining; Molecular Biology; computer; Software; 030217 neurology & neurosurgery; Information Systems; Systematic search; Briefings in Bioinformatics

researchProduct

Common Hits Approach: Combining Pharmacophore Modeling and Molecular Dynamics Simulations.

2017

We present a new approach that incorporates flexibility based on extensive MD simulations of protein-ligand complexes into structure-based pharmacophore modeling and virtual screening. The approach uses the multiple coordinate sets saved during the MD simulations and generates for each frame a pharmacophore model. Pharmacophore models with the same pharmacophore features are pooled. In this way the high number of pharmacophore models that results from the MD simulation is reduced to only a few hundred representative pharmacophore models. Virtual screening runs are performed with every representative pharmacophore model; the screening results are combined and rescored to generate a single hi…

0301 basic medicine; General Chemical Engineering; Drug Evaluation Preclinical; Library and Information Sciences; Molecular Dynamics Simulation; computer.software_genre; Ligands; LigandScout; Common Hits Approach (CHA); 03 medical and health sciences; Molecular dynamics; User-Computer Interface; Computational chemistry; Pharmacophore Modeling; Flexibility (engineering); Virtual screening; Chemistry; Frame (networking); Proteins; General Chemistry; Into-structure; Settore CHIM/08 - Chimica Farmaceutica; Computer Science Applications; 030104 developmental biology; Data mining; Pharmacophore; computer; Journal of chemical information and modeling

researchProduct

CLOVE: classification of genomic fusions into structural variation events

2017

Background: A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than …

0301 basic medicine; Genomics; Biology; computer.software_genre; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Chromosomes; DNA sequencing; Set (abstract data type); Structural variation; User-Computer Interface; 03 medical and health sciences; Structural Biology; Escherichia coli; Humans; Copy-number variation; Molecular Biology; lcsh:QH301-705.5; Internet; Methodology Article; Applied Mathematics; Breakpoint; Genomic rearrangements; DNA; Genomics; Structural variations; Computer Science Applications; Identification (information); 030104 developmental biology; lcsh:Biology (General); Nucleic Acid Conformation; Graph (abstract data type); lcsh:R858-859.7; Data mining; computer; Algorithms; BMC Bioinformatics

researchProduct

FragClust and TestClust, two informatics tools for chemical structure hierarchical clustering analysis applied to lipidomics. The example of Alzheime…

2016

Lipidomic analysis is able to measure simultaneously thousands of compounds belonging to a few lipid classes. In each lipid class, compounds differ only by the acyl radical, ranging between C10:0 (capric acid) and C24:0 (lignoceric acid). Although some metabolites have a peculiar pathological role, more often compounds belonging to a single lipid class exert the same biological effect. Here, we present a lipidomics workflow that extracts the tandem mass spectrometry data from individual files and uses them to group compounds into structurally homogeneous clusters by chemical structure hierarchical clustering analysis (CHCA). The case-to-control peak area ratios of the metabolites are then a…

0301 basic medicine; High-resolution mass spectrometry; Settore MED/09 - Medicina Interna; Chemical structure; Computational biology; Plasma biomarkers; 01 natural sciences; Triglyceride; Biochemistry; Homogeneous clusters; Analytical Chemistry; Ceramide; 03 medical and health sciences; Alzheimer Disease; Tandem Mass Spectrometry; Health informatics tools; Lipidomics; Humans; Statistical analysis; Data mining; Chromatography High Pressure Liquid; Aged; Aged 80 and over; Molecular Structure; Chemistry; 010401 analytical chemistry; Lipids; 0104 chemical sciences; Hierarchical clustering; Phospholipid; 030104 developmental biology; Workflow; Biochemistry; Case-Control Studies; Settore MED/26 - Neurologia

researchProduct

A multicenter study benchmarks software tools for label-free proteome quantification

2016

The consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from SWATH-MS (sequential window acquisition of all theoretical fragment ion spectra), a method that uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test datasets from hybrid proteome samples of defined quantitative composition acquired on two different MS instrument…

0301 basic medicine; Internationality; Proteome; Computer science; media_common.quotation_subject; Software tool; Quantitative proteomics; Biomedical Engineering; Bioengineering; computer.software_genre; Bioinformatics; Sensitivity and Specificity; Applied Microbiology and Biotechnology; Article; Mass Spectrometry; 03 medical and health sciences; Software; Quality (business); media_common; Label free; Staining and Labeling; 030102 biochemistry & molecular biology; business.industry; Reproducibility of Results; Benchmarking; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; Multicenter study; Proteome; Benchmark (computing); Molecular Medicine; Data mining; business; computer; Algorithms; Software; Biotechnology; Nature Biotechnology

researchProduct

The predictive value of microbiological findings on teeth, internal and external implant portions in clinical decision making

2017

Aim: The primary aim of this study was to evaluate 23 pathogens associated with peri-implantitis at the inner part of implant connections, in peri-implant and periodontal pockets between patients suffering from peri-implantitis and participants with healthy peri-implant tissues; the secondary aim was to estimate the predictive value of the microbiological profile in patients wearing dental implants using data mining methods. Material and Methods: Fifty participants included in the present case-control study were scheduled for collection of plaque samples from the peri-implant pockets, internal connection, and periodontal pocket. Real-time polymerase chain reaction was performed to…

0301 basic medicine; Male; Peri-implantitis; Gingival and periodontal pocket; Alveolar Bone Loss; Dentistry; 0302 clinical medicine; Radiography Dental; Medicine; Peri-implantitis; periodontitis; [SDV.MHEP.RSOA] Life Sciences [q-bio]/Human health and pathology/Rhumatology and musculoskeletal system; [SDV.MHEP.GEG] Life Sciences [q-bio]/Human health and pathology/Geriatry and gerontology; [SDV.MHEP.GEG]Life Sciences [q-bio]/Human health and pathology/Geriatry and gerontology; Middle Aged; Predictive value; 3. Good health; Dental Implantation; [SDV.MHEP.RSOA]Life Sciences [q-bio]/Human health and pathology/Rhumatology and musculoskeletal system; Female; Oral Surgery; Infection; peri-implantitis; Adult; Decision trees; Clinical Decision-Making; Dental Plaque; Dental plaque; Real-Time Polymerase Chain Reaction; 03 medical and health sciences; Humans; Periodontal Pocket; Periodontitis; Parvimonas micra; Data mining; Aged; Periodontitis; Dental Implants; decision trees; business.industry; Case-control study; data mining; decision trees; infection; peri-implantitis; periodontitis; 030206 dentistry; data mining; medicine.disease; infection; 030104 developmental biology; Case-Control Studies; Implant; business; Tooth

researchProduct

Conf-VLKA: A structure-based revisitation of the Virtual Lock-and-key Approach

2016

In a previous work, we developed the in-house Virtual Lock-and-Key Approach (VLKA) in order to evaluate target assignment starting from molecular descriptors calculated on known inhibitors used as an information source. This protocol was able to predict the correct biological target for the whole dataset with a good degree of reliability (80%), and was proved experimentally, which was useful for the target fishing of unknown compounds. In this paper, we tried to remodel the previously developed in-house VLKA into a more sophisticated one in order to evaluate the influence of the 3D conformation of ligands on the accuracy of the prediction. We applied the same previous algorithm of scoring and ranking b…

0301 basic medicine; Materials Chemistry; 2506 Metals and Alloys; Inhibitor; Structure-based; Computer science; Protein Conformation; Protein Data Bank (RCSB PDB); Molecular Conformation; Target fishing; Molecular Dynamics Simulation; computer.software_genre; Ligands; 01 natural sciences; Docking; Vlka; 03 medical and health sciences; Molecular descriptor; Materials Chemistry; Humans; Physical and Theoretical Chemistry; Cluster analysis; Databases Protein; Simulation; Spectroscopy; Binding Sites; Proteins; computer.file_format; Descriptor; Protein Data Bank; Computer Graphics and Computer-Aided Design; 0104 chemical sciences; Molecular Docking Simulation; 010404 medicinal & biomolecular chemistry; 030104 developmental biology; Protein–ligand docking; Biological target; Docking (molecular); Biological target; Structure based; Ligand-based; Data mining; computer; Algorithms; Software; Protein Binding

researchProduct

The macroecology of cancer incidences in humans is associated with large-scale assemblages of endemic infections.

2018

It is now well supported that 20% of human cancers have an infectious causation (i.e., oncogenic agents). Accumulating evidence suggests that aside from this direct role, other infectious agents may also indirectly affect cancer epidemiology through interactions with the oncogenic agents within the wider infection community. Here, we address this hypothesis via analysis of large-scale global data to identify associations between human cancer incidence and assemblages of neglected infectious agents. We focus on a gradient of three widely-distributed cancers with an infectious cause: bladder (~2% of recorded cancer cases are due to Schistosoma haematobium), liv…

0301 basic medicine; Microbiology (medical); Endemic Diseases; [SDV.CAN]Life Sciences [q-bio]/Cancer; Microbiology; Biomes; Helicobacter Infections; [ SDV.CAN ] Life Sciences [q-bio]/Cancer; 03 medical and health sciences; Schistosomiasis haematobia; Environmental health; Neoplasms; Pathogen-cancer interactions; Epidemiology of cancer; Genetics; medicine; [ SDV.EE.IEO ] Life Sciences [q-bio]/Ecology environment/Symbiosis; Animals; Humans; Stomach cancer; Molecular Biology; Data mining; Ecology Evolution Behavior and Systematics; Human cancer incidences; Bladder cancer; Cancer prevention; biology; Incidence; Cancer; Helicobacter pylori; Hepatitis B; medicine.disease; biology.organism_classification; Hepatitis B; Hepatitis C; 3. Good health; 030104 developmental biology; Infectious Diseases; Neglected diseases; Host-Pathogen Interactions; Female; Public Health; Public health strategies; Liver cancer; [SDV.EE.IEO]Life Sciences [q-bio]/Ecology environment/Symbiosis

researchProduct

Toward a direct and scalable identification of reduced models for categorical processes.

2017

The applicability of many computational approaches depends on the identification of reduced models defined on a small set of collective variables (colvars). A methodology for scalable probability-preserving identification of reduced models and colvars directly from the data is derived, not relying on the availability of the full relation matrices at any stage of the resulting algorithm, allowing for a robust quantification of reduced model uncertainty and allowing us to impose a priori available physical information. We show two applications of the methodology: (i) to obtain a reduced dynamical model for polypeptide dynamics in water and (ii) to identify diagnostic rules from a standar…

0301 basic medicine; Multidisciplinary; business.industry; Computer science; Dimensionality reduction; Bayesian inference; Machine learning; computer.software_genre; 01 natural sciences; Reduction (complexity); 010104 statistics & probability; 03 medical and health sciences; Identification (information); 030104 developmental biology; Physical information; Physical Sciences; A priori and a posteriori; Artificial intelligence; Data mining; 0101 mathematics; Cluster analysis; business; Categorical variable; computer; Proceedings of the National Academy of Sciences of the United States of America

researchProduct

The Anemonia viridis Venom: Coupling Biochemical Purification and RNA-Seq for Translational Research

2018

Blue biotechnologies implement marine bio-resources for addressing practical concerns. The isolation of biologically active molecules from marine animals is one of the main ways this field develops. Strikingly, cnidaria are considered sustainable resources for this purpose, as they possess unique cells for attack and protection, producing an articulated cocktail of bioactive substances. The Mediterranean sea anemone Anemonia viridis has been studied extensively for years. In this short review, we summarize advances in bioprospecting of the A. viridis toxin arsenal. A. viridis RNA datasets and toxin data mining approaches are briefly described. Analysis reveals the major pool of neurotoxi…

0301 basic medicine; Neurotoxins; Pharmaceutical Science; RNA-Seq; Venom; Review; Computational biology; Cnidarian Venom; Anemonia; Translational Research Biomedical; transcriptomics; 03 medical and health sciences; computational biology; Cnidarian Venoms; Drug Discovery; Animals; Data Mining; Marine Toxin; Translational Medical Research; lcsh:QH301-705.5; Pharmacology Toxicology and Pharmaceutics (miscellaneous); Sea Anemone; Bioprospecting; biology; Animal; Sequence Analysis RNA; Sustainable resources; Drug Discovery; 3003 Pharmaceutical Science; RNA; Anemone; bio-prospecting; biology.organism_classification; Sea Anemones; 030104 developmental biology; Transcriptomic; lcsh:Biology (General); RNA; Marine Toxins; Neurotoxin; Marine toxin; Marine Drugs

researchProduct