Search results for "database"
showing 10 items of 684 documents
Population geocoding for healthcare management. Technical challenges and quality issues
2015
This work describes the main issues related to population geocoding for healthcare management. Some of the available procedures for geocoding multiple addresses are described, and an indicator of the quality of the geocoded addresses is proposed. As a case study, the geocoding of population addresses in a set of 9 Sicilian municipalities is described, and the results of two different methods are compared in terms of quality. Finally, some potential applications of population geocoding in healthcare management are discussed.
Streamlining distributed Deep Learning I/O with ad hoc file systems
2021
With evolving techniques to parallelize Deep Learning (DL) and the growing amount of training data and model complexity, High-Performance Computing (HPC) has become increasingly important for machine learning engineers. Although many compute clusters already use learning accelerators or GPUs, HPC storage systems are not suitable for the I/O requirements of DL workflows. Therefore, users typically copy the whole training data to the worker nodes or distribute partitions. Because DL depends on randomized input data, prior work stated that partitioning impacts DL accuracy. Their solutions focused mainly on training I/O performance on a high-speed network but did not cover the data stage-in pro…
VegItaly: Technical features, crucial issues and some solutions
2012
VegItaly is at present the largest Italian vegetation database. It is the result of a collaborative project that aims to become a major reference for Italian vegetation scientists. The paper emphasizes its benefits for phytosociological data management and describes the solutions adopted to solve several technical problems, such as the treatment of different vegetation stratification systems, the conversion of vegetation cover values, taxonomic and syntaxonomic issues, and data import and access. The structure of the taxonomic list produced to support data storage is described. It allows easy management of synonymic relationships and is constantly updated according to new publicati…
Distributed Real-Time Sentiment Analysis for Big Data Social Streams
2014
The big data trend has pushed data-centric systems to handle continuous, fast data streams. In recent years, real-time analytics on stream data has emerged as a new research field, which aims to answer queries about "what-is-happening-now" with negligible delay. The real challenge of real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are used. To perform real-time analytics, data should be pre-processed so that only a short summary of the stream is stored in main memory. In addition, due to the high arrival rate, average processing time for each instance of data should be in such a way that…
Effectively and efficiently supporting crowd-enabled databases via NoSQL paradigms
2013
In this paper we provide an overview of the Hints From the Crowd (HFC) project, whose main goal is to build a NoSQL database system for large collections of product reviews; the database is queried by expressing a natural language sentence; the result is a list of products ranked based on the relevance of reviews w.r.t. the natural language sentence. The best ranked products in the result list can be seen as the best hints for the user based on crowd opinions (the reviews). The HFC prototype has been developed as a web application, independent of the particular application domain of the collected product reviews. Queries are performed by evaluating a text-based ranking metric for sets of re…
Migration of Relational Database to Document-Oriented Database: Structure Denormalization and Data Transformation
2015
Relational databases remain the leading data storage technology. Nevertheless, many companies want to reduce operating expenses and build scalable applications that use cloud computing technologies. Using a NoSQL database is one possible solution, and the NoSQL market is forecast to grow at a CAGR of approximately 50 percent over the next five years. The paper offers a solution for quick data migration from a relational database into a document-oriented database. We semi-automatically create two logical levels over the physical data. Users can refine the generated logical data model and configure a data migration template for each needed document. Data migration features…
PDB: A pictorial database oriented to data analysis
1993
The paper describes a new pictorial database oriented to image analysis, implemented inside the MIDAS data analysis system. Pictorial databases need expressive data structures in order to represent a wide class of information from the numerical to the visual. The model of the database is relational; however, a full normalization is not achievable, owing to the complexity of the visual information. The paper reports the general design and notes on the software implementation. Preliminary experiments show the performance of the pictorial database. Copyright © 1993 John Wiley & Sons, Ltd
PyCellBase, an efficient python package for easy retrieval of biological data from heterogeneous sources.
2019
Background: Biological databases and repositories have been increasing in diversity and complexity over the years. This rapid expansion of current and new sources of biological knowledge raises serious problems of data accessibility and integration. To handle the growing need for unification, CellBase was created as an integrative solution. CellBase provides a centralized NoSQL database containing biological information from different and heterogeneous sources. This information is accessed through a RESTful web service API, which provides an efficient interface to the data. Results: In this work we present PyCellBase, a Python package that provides programmatic access to the rich RESTfu…
Increase in norovirus activity reported in Europe.
2006
A large increase in norovirus outbreaks in Hungary and Germany was reported to European national health authorities via the Foodborne Viruses in Europe network.
A brief history of the formation of DNA databases in forensic science within Europe.
2001
The introduction of DNA analysis to forensic science brought with it a number of choices for analysis, not all of which were compatible. As laboratories throughout Europe were eager to use the new technology, different systems became routine in different laboratories, and consequently there was no basis for the exchange of results. A period of co-operation then started in which a nucleus of forensic scientists agreed on a uniform system. This collaboration spread to incorporate most of the established forensic science laboratories in Europe and continued through two major changes in the technology. At each step agreement was reached on which systems to use. From the beginning it was realise…