Search results for "Informatics"
showing 10 items of 2542 documents
Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics
2019
Abstract Background Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …
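As a rough illustration of the MapReduce-style k-mer counting this abstract refers to, the sketch below counts k-mers per sequence (the "map" step) and merges the partial tables (the "reduce" step). It runs in a single process with the Python standard library only; the function names `map_kmers`, `reduce_counts` and `kmer_statistics` are hypothetical and are not taken from the paper, which uses Hadoop and Spark.

```python
from collections import Counter
from functools import reduce


def map_kmers(sequence, k):
    """Map step (hypothetical helper): count the k-mers of one sequence."""
    seq = sequence.upper()
    # Slide a window of width k over the sequence; yields nothing if len(seq) < k.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def reduce_counts(left, right):
    """Reduce step (hypothetical helper): merge two partial count tables."""
    return left + right


def kmer_statistics(sequences, k):
    """Combine the per-sequence counts into one global k-mer count table."""
    partials = (map_kmers(s, k) for s in sequences)
    return reduce(reduce_counts, partials, Counter())


# Two toy reads standing in for a large sequence collection.
counts = kmer_statistics(["ACGTAC", "GTACGT"], k=3)
```

In a real distributed setting the map step would run on separate partitions of the input and the reduce step would merge counts by key across workers; this sketch only mirrors that structure.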
Modeling crowd dynamics through coarse-grained data analysis
2018
Understanding and predicting the collective behaviour of crowds is essential to improve the efficiency of pedestrian flows in urban areas and minimize the risks of accidents at mass events. We advocate for the development of crowd traffic management systems, whereby observations of crowds can be coupled to fast and reliable models to produce rapid predictions of the crowd movement and eventually help crowd managers choose between tailored optimization strategies. Here, we propose a Bi-directional Macroscopic (BM) model as the core of such a system. Its key input is the fundamental diagram for bi-directional flows, i.e. the relation between the pedestrian fluxes and d…
Controlling false match rates in record linkage using extreme value theory
2011
Abstract Cleansing data from synonyms and homonyms is a relevant task in fields where high quality of data is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus our attention on the case of homonym errors (in the following denoted as ‘false matches’), in which records belonging to different entities are wrongly classified as equal. Synonym errors (‘false non-matches’) occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because in our application domain they are not as crucial as false matches. Fa…
EHRtemporalVariability
2020
Functions to delineate temporal dataset shifts in Electronic Health Records through the projection and visualization of dissimilarities among data temporal batches. This is done through the estimation of data statistical distributions over time and their projection in non-parametric statistical manifolds, uncovering the patterns of the data latent temporal variability. EHRtemporalVariability is particularly suitable for multi-modal data and categorical variables with a high number of values, common features of biomedical data where traditional statistical process control or time-series methods may not be appropriate. EHRtemporalVariability allows you to explore and identify dataset shifts t…
VegItaly: Technical features, crucial issues and some solutions
2012
VegItaly is at present the largest Italian vegetation database. It is the result of a collaborative project aspiring to represent a major reference for the Italian vegetation scientists. The paper emphasizes its benefits for phytosociological data management and describes the solutions adopted to solve several technical problems, like the treatment of different vegetation stratification systems, the conversion of vegetation cover values, taxonomic and syntaxonomic issues, data import and access. The structure of the taxonomic list produced to support the storing of data is described. It allows an easy management of synonymic relationships and is constantly updated according to new publicati…
Technology-Supported Guidance Models Stimulating the Development of Critical Thinking in Clinical Practice: Protocol for a Mixed Methods Systematic R…
2020
Background Critical thinking is an essential skill that nursing students need to develop. Technological tools have opened new avenues for technology-supported guidance models, but the challenges and facilitators of such guidance models, as well as how they stimulate the development of critical thinking, remain unclear. Objective We developed a protocol for a mixed methods systematic review to investigate the use of technology-supported guidance models that stimulate the development of critical thinking in nursing education clinical practice. Methods A convergent integrated design following the Joanna Briggs Institute Manual for Evidence Synthesis will be employed. A pair of authors will select t…
Assessing the format and content of journal published and non-journal published rapid review reports: A comparative study
2020
Background As production of rapid reviews (RRs) increases in healthcare, knowing how to efficiently convey RR evidence to various end-users is important given they are often intended to directly inform decision-making. Little is known about how often RRs are produced in the published or unpublished domains, and what and how information is structured. Objectives To compare and contrast report format and content features of journal-published (JP) and non-journal published (NJP) RRs. Methods JP RRs were identified from key databases, and NJP RRs were identified from a grey literature search of 148 RR producing organizations and were sampled proportionate to cluster size by organization and pro…
PyCellBase, an efficient python package for easy retrieval of biological data from heterogeneous sources.
2019
Background Biological databases and repositories are increasing in diversity and complexity over the years. This rapid expansion of current and new sources of biological knowledge raises serious problems of data accessibility and integration. To handle the growing need for unification, CellBase was created as an integrative solution. CellBase provides a centralized NoSQL database containing biological information from different and heterogeneous sources. This information is accessed through a RESTful web service API, which provides an efficient interface to the data. Results In this work we present PyCellBase, a Python package that provides programmatic access to the rich RESTfu…
Analysis of Lipid Experiments (ALEX): A Software Framework for Analysis of High-Resolution Shotgun Lipidomics Data
2013
Global lipidomics analysis across large sample sizes produces high-content datasets that require dedicated software tools supporting lipid identification and quantification, efficient data management and lipidome visualization. Here we present a novel software-based platform for streamlined data processing, management and visualization of shotgun lipidomics data acquired using high-resolution Orbitrap mass spectrometry. The platform features the ALEX framework designed for automated identification and export of lipid species intensity directly from proprietary mass spectral data files, and an auxiliary workflow using database exploration tools for integration of sample information, computat…
Convolutional Neural Network With Shape Prior Applied to Cardiac MRI Segmentation.
2019
In this paper, we present a novel convolutional neural network architecture to segment images from a series of short-axis cardiac magnetic resonance slices (CMRI). The proposed model is an extension of the U-net that embeds a cardiac shape prior and involves a loss function tailored to the cardiac anatomy. Since the shape prior is computed offline only once, the execution of our model is not limited by its calculation. Our system takes as input raw magnetic resonance images, requires no manual preprocessing or image cropping and is trained to segment the endocardium and epicardium of the left ventricle, the endocardium of the right ventricle, as well as the center of the left ventricle. Wit…