Search results for "Databases"
showing 10 items of 140 documents
A summary of genomic databases: overview and discussion
2009
In the last few years, both the amount of electronically stored biological data and the number of biological data repositories have grown significantly (today, more than eight hundred repositories can be counted). In spite of the enormous amount of available resources, a user may be disoriented when searching for specific data. Thus, an accurate analysis of biological data and repositories is useful for obtaining a systematic view of biological database structures, tools and contents and, eventually, for facilitating the access and recovery of such data. In this chapter, we propose an analysis of genomic databases, which are databases of fundamental importance for the research in bioinfo…
The rise of the middle author: Investigating collaboration and division of labor in biomedical research using partial alphabetical authorship
2017
Contemporary biomedical research is performed by increasingly large teams. Consequently, an increasingly large number of individuals are being listed as authors in the bylines, which complicates the proper attribution of credit and responsibility to individual authors. Typically, more importance is given to the first and last authors, while it is assumed that the others (the middle authors) have made smaller contributions. However, this may not properly reflect the actual division of labor because some authors other than the first and last may have made major contributions. In practice, research teams may differentiate the main contributors from the rest by using partial alphabetical author…
MEDLEM database, a data collection on large Elasmobranchs in the Mediterranean and Black seas
2020
The Mediterranean Large Elasmobranchs Monitoring (MEDLEM) database contains more than 3,000 records (with more than 4,000 individuals) of large elasmobranch species from 21 different countries around the Mediterranean and Black seas, observed from 1666 to 2017. The principal species included in the archive are the devil ray (1,868 individuals), the basking shark (935 individuals), the blue shark (622 individuals), and the great white shark (342 individuals). In the last decades, other species such as the thresher shark (187 individuals), the shortfin mako (180 individuals), and the spiny butterfly ray (138) were reported with increasing frequency. This was possibly due to increased public a…
Quality, comparability and methods of analysis of data on childhood cancer in Europe (1978-1997): report from the Automated Childhood Cancer Informat…
2006
In collaboration with 62 population-based cancer registries contributing to the Automated Childhood Cancer Information System (ACCIS), we built a database to study incidence and survival of children and adolescents with cancer in Europe. We describe the methods and evaluate the quality and internal comparability of the database, by geographical region, period of registration, type of registry and other characteristics. Data on 88,465 childhood and 15,369 adolescent tumours registered during 1978-1997 were available. Geographical differences in incidence are caused partly by differences in definition of eligible cases. The observed increase in incidence rates cannot b…
The partition sum of methane at high temperature
2008
11 pages, 4 tables, 3 figures. Computer code online at http://icb.u-bourgogne.fr/JSP/TIPS.jsp. The total internal partition function of methane is revisited to provide reliable values at high temperature. A multi-resolution approach is used to perform a direct summation over all the rovibrational energy levels up to the dissociation limit. A computer code, executable online at http://icb.u-bourgogne.fr/JSP/TIPS.jsp, allows the calculation of the partition sum of methane at temperatures up to 3000 K. It also provides detailed information on the density of states in the relevant spectral ranges. The recommended values include uncertainty estimates. It is …
The miniJPAS survey: a preview of the Universe in 56 colours
2021
Full list of authors: Bonoli, S.; Marín-Franch, A.; Varela, J.; Vázquez Ramió, H.; Abramo, L. R.; Cenarro, A. J.; Dupke, R. A.; Vílchez, J. M.; Cristóbal-Hornillos, D.; González Delgado, R. M.; Hernández-Monteagudo, C.; López-Sanjuan, C.; Muniesa, D. J.; Civera, T.; Ederoclite, A.; Hernán-Caballero, A.; Marra, V.; Baqui, P. O.; Cortesi, A.; Cypriano, E. S.; Daflon, S.; de Amorim, A. L.; Díaz-García, L. A.; Diego, J. M.; Martínez-Solaeche, G.; Pérez, E.; Placco, V. M.; Prada, F.; Queiroz, C.; Alcaniz, J.; Alvarez-Candal, A.; Cepa, J.; Maroto, A. L.; Roig, F.; Siffert, B. B.; Taylor, K.; Benitez, N.; Moles, M.; Sodré, L.; Carneiro, S.; Mendes de Oliveira, C.; Abdalla, E.; Angulo, R. E.; Apari…
Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics
2019
Background: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …
BioTIME: A database of biodiversity time series for the Anthropocene
2018
Motivation: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time. These data enable users to calculate temporal trends in biodiversity within and amongst assemblages using a broad range of metrics. BioTIME is being developed as a community-led open-source database of biodiversity time series. Our goal is to accelerate and facilitate quantitative analysis of temporal patterns of biodiversity in the Anthropocene. Main types of variables included: The database contains 8,777,413 species abundance records, from assemblages consistently sampled for a minimum of 2 years, which need not necessarily be consecutive. In addition, th…
Population geocoding for healthcare management. Technical challenges and quality issues
2015
The present work describes the main issues related to population geocoding for healthcare management. Some of the available procedures for geocoding multiple addresses are described and an indicator of the quality of the geocoded addresses is proposed. As a case study, the geocoding of population addresses for a set of 9 Sicilian municipalities is described, and the results obtained with two different methods are compared in terms of quality. Finally, some potential applications of population geocoding in healthcare management are discussed.
Distributed Real-Time Sentiment Analysis for Big Data Social Streams
2014
The big data trend has forced data-centric systems to handle continuous, fast data streams. In recent years, real-time analytics on stream data has emerged as a new research field, which aims to answer queries about "what-is-happening-now" with a negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are utilized. To perform real-time analytics, pre-processing of data should be performed in such a way that only a short summary of the stream is stored in main memory. In addition, due to the high speed of arrival, the average processing time for each instance of data should be in such a way that…