Search results for "Dataset"
showing 7 items of 37 documents
Comparing the use of ERA5 reanalysis dataset and ground-based agrometeorological data under different climates and topography in Italy
2022
Study region: The study region is represented by seven irrigation districts distributed under different climate and topography conditions in Italy. Study focus: This study explores the reliability and consistency of the global ERA5 single levels and ERA5-Land reanalysis datasets in predicting the main agrometeorological estimates commonly used for crop water requirements calculation. In particular, the reanalysis data was compared, variable-by-variable (e.g., solar radiation, R; air temperature, T; relative humidity, RH; wind speed, u; reference evapotranspiration, ET), with in situ agrometeorological observations obtained from 66 automatic weather stations (2008–2020). In addition, the pre…
Lightweight LCP construction for next-generation sequencing datasets
2012
The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hundreds of millions of DNA sequences are now commonplace in bioinformatics. Knowing the longest common prefix array (LCP) of such a collection would facilitate the rapid computation of maximal exact matches, shortest unique substrings and shortest absent words. CPU-efficient algorithms for computing the LCP of a string have been described in the literature, but require the presence in RAM of large data structures. This prevents such methods from being feasible for NGS datasets. In this paper we propose the first lightweight method that simultaneously computes, via sequential scans, the LCP and B…
Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets
2016
In this article we show how a multi-paradigm framework can fulfil the requirements of tweets analysis and reduce the waiting time for researchers that use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…
A Dataset of Annotated Omnidirectional Videos for Distancing Applications
2021
Omnidirectional (or 360°) cameras are acquisition devices that, in the next few years, could have a big impact on video surveillance applications, research, and industry, as they can record a spherical view of a whole environment from every perspective. This paper presents two new contributions to the research community: the CVIP360 dataset, an annotated dataset of 360° videos for distancing applications, and a new method to estimate the distances of objects in a scene from a single 360° image. The CVIP360 dataset includes 16 videos acquired outdoors and indoors, annotated by adding information about the pedestrians in the scene (bounding boxes) and the distances to the camera of some point…
speedglm: Fitting Linear and Generalized Linear Models to large data sets.
2009
This is an R package to fit (generalized) linear models to large data sets. For data loaded in R memory the fitting is usually fast, especially if R is linked against an optimized BLAS. For data sets larger than R memory, the fitting is done by an updating algorithm.
Selecting significant respondents from large audience datasets: The case of the World Hobbit Project
2016
International projects, online questionnaires, or data mining techniques now allow audience researchers to gather very large and complex datasets. But whilst data collection capacity is hugely growing, qualitative analysis, conversely, becomes increasingly difficult to conduct. In this paper, I suggest a strategy that might allow the researcher to manage this complexity. The World Hobbit Project dataset (36,109 cases), including answers to both closed and open-ended questions, was used for this purpose. The strategy proposed here is based on between-methods sequential triangulation, and tries to combine statistical techniques (k-means clustering) with textual analysis. K-means clustering pe…
The Influence of Religion on Life Satisfaction in Italy
2021
Italy is the cradle of Catholicism and, despite the secularization process, religion continues to be part of its national culture. Although Italian sociologists have investigated the religious paradigm in Italy, some aspects of this phenomenon remain little explored. This paper examines the potential influence religion has on individuals’ life satisfaction. Data from the European Value Study survey provides evidence of a two-way interaction between religion and life satisfaction, with a substantial effect only in the case of public religious forms. This association seems to be driven by the mechanism of social support, and it differs across Italian regions. Results confirm the hypothesis…