Search results for "Database"
showing 10 items of 2136 documents
Computational annotation of genes differentially expressed along olive fruit development
2009
Abstract Background Olea europaea L. is a traditional tree crop of the Mediterranean basin with a high worldwide economic impact. Unlike for other fruit tree species, little is known about the physiological and molecular basis of olive fruit development, and only a few gene and gene product sequences are available for olive in public databases. This study deals with the identification of large sets of differentially expressed genes in developing olive fruits and their subsequent computational annotation by means of different software. Results mRNA from fruits of the cv. Leccino sampled at three different stages [i.e., initial fruit set (stage 1), completed pit hardening (stage 2) a…
gcType: a high-quality type strain genome database for microbial phylogenetic and functional research
2020
Abstract Taxonomic and functional research of microorganisms has increasingly relied upon genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, Global Catalogue of Type Strain (gcType) has published 1049 type strain genomes sequenced by the GCM 10K project which are preserved in global culture collections with a valid published status. Additionally, the information provided through gcType includes >12 000 publicly available type strain genome sequences from GenBank incorporated using quality control criteria and standard data annotation pipelines to form a high-quality reference database. This …
Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics
2019
Abstract Background Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …
Preventive strategies and factors associated with surgically treated necrotising enterocolitis in extremely preterm infants: an international unit su…
2019
Objectives To compare necrotising enterocolitis (NEC) prevention practices and NEC associated factors between units from eight countries of the International Network for Evaluation of Outcomes of Neonates, and to assess their association with surgical NEC rates. Design Prospective unit-level survey combined with retrospective cohort study. Setting Neonatal intensive care units in Australia/New Zealand, Canada, Finland, Israel, Spain, Sweden, Switzerland and Tuscany (Italy). Patients Extremely preterm infants born between 24+0 to 28+6 weeks’ gestation, with birth weights <1500 g, and admitted between 2014–2015. Exposures NEC prevention practices (probiotics, feeding, donor milk) using responses of an o…
Modeling crowd dynamics through coarse-grained data analysis
2018
Understanding and predicting the collective behaviour of crowds is essential to improve the efficiency of pedestrian flows in urban areas and minimize the risks of accidents at mass events. We advocate for the development of crowd traffic management systems, whereby observations of crowds can be coupled to fast and reliable models to produce rapid predictions of the crowd movement and eventually help crowd managers choose between tailored optimization strategies. Here, we propose a Bi-directional Macroscopic (BM) model as the core of such a system. Its key input is the fundamental diagram for bi-directional flows, i.e. the relation between the pedestrian fluxes and d…
IMI – Oral biopharmaceutics tools project – Evaluation of bottom-up PBPK prediction success part 4: Prediction accuracy and software comparisons with…
2020
Oral drug absorption is a complex process depending on many factors, including the physicochemical properties of the drug, formulation characteristics and their interplay with gastrointestinal physiology and biology. Physiological-based pharmacokinetic (PBPK) models integrate all available information on gastro-intestinal system with drug and formulation data to predict oral drug absorption. The latter together with in vitro-in vivo extrapolation and other preclinical data on drug disposition can be used to predict plasma concentration-time profiles in silico. Despite recent successes of PBPK in many areas of drug development, an improvement in their utility for evaluating oral absorption i…
Probabilistic techniques for bridging the semantic gap in schema alignment
Connecting pieces of information from heterogeneous sources sharing the same domain is an open challenge in the Semantic Web, Big Data and business communities. The main problem in this research area is to bridge the expressiveness gap between relational databases and ontologies. In general, an ontology is more expressive and captures more semantic information behind data than a relational database does. On the other hand, databases are the most commonly used persistent storage systems and they grant benefits such as security and data integrity, but they need to be managed by expert users. The problem is quite significant above all when enterprise or corporate ontologies are used to share information…
Network reconstruction for trans acting genetic loci using multi-omics data and prior information.
2022
Background: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view on biological systems and their integration is crucial for gaining insights in complex regulatory processes. These data can be used to explain disease related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors…
BioTIME: A database of biodiversity time series for the Anthropocene
2018
Abstract Motivation The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time. These data enable users to calculate temporal trends in biodiversity within and amongst assemblages using a broad range of metrics. BioTIME is being developed as a community-led open-source database of biodiversity time series. Our goal is to accelerate and facilitate quantitative analysis of temporal patterns of biodiversity in the Anthropocene. Main types of variables included The database contains 8,777,413 species abundance records, from assemblages consistently sampled for a minimum of 2 years, which need not necessarily be consecutive. In addition, th…
Controlling false match rates in record linkage using extreme value theory
2011
Abstract Cleansing data from synonyms and homonyms is a relevant task in fields where high quality of data is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus our attention on the case of homonym errors (in the following denoted as ‘false matches’), in which records belonging to different entities are wrongly classified as equal. Synonym errors (‘false non-matches’) occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because in our application domain they are not as crucial as false matches. Fa…