Search results for " data"

showing 10 items of 7516 documents

Estimating aggregated nutrient fluxes in four Finnish rivers via Gaussian state space models

2013

Reliable estimates of the nutrient fluxes carried by rivers from land-based sources to the sea are needed for efficient abatement of marine eutrophication. Although nutrient concentrations in rivers generally display large temporal variation, sampling and analysis for nutrients, unlike flow measurements, are rarely performed on a daily basis. The infrequent data call for ways to reliably estimate the nutrient concentrations on the missing days. Here, we use Gaussian state space models with daily water flow as a predictor variable to predict missing nutrient concentrations for four agriculturally impacted Finnish rivers. Via simulation of Gaussian state space models, we are able to esti…

sparse data; PHOSPHORUS LOAD; FINLAND; simulation; SERIES; interpolation; Oceanography Hydrology and Water Resources; Kalman smoother; STREAMS; Kalman filter
researchProduct

Weighted Clustering of Sparse Educational Data

2015

Clustering as an unsupervised technique is predominantly used in unweighted settings. In this paper, we present an efficient version of a robust clustering algorithm for sparse educational data that takes the weights, aligning a sample with the corresponding population, into account. The algorithm is utilized to divide the Finnish student population of PISA 2012 (the latest data from the Programme for International Student Assessment) into groups, according to their attitudes and perceptions towards mathematics, for which one third of the data is missing. Furthermore, necessary modifications of three cluster indices to reveal an appropriate number of groups are proposed and demonstrated. pe…

sparse educational data; PISA; clustering
researchProduct

SparseHC: A Memory-efficient Online Hierarchical Clustering Algorithm

2014

Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space with respect to the number of objects, the design of memory-efficient approaches is of high importance to this research area. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted and possibly sparse distance matrix chunk-by-chunk. Meanwhile, a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The k…

sparse matrix; Clustering high-dimensional data; Theoretical computer science; online algorithms; Computer science; Single-linkage clustering; Complete-linkage clustering; Nearest-neighbor chain algorithm; Consensus clustering; memory-efficient clustering; Cluster analysis; k-medians clustering; General Environmental Science; Engineering::Computer science and engineering [DRNTU]; k-medoids; Dendrogram; Constrained clustering; Hierarchical clustering; Distance matrix; Canopy clustering algorithm; General Earth and Planetary Sciences; FLAME clustering; Hierarchical clustering of networks; Algorithm; Procedia Computer Science
researchProduct

Spatial spillovers in France: a study on individual count data at the city level

2007

Our study aims to measure the effects of spatial R&D spillovers on firms' patent production at the city level. We use an original method to estimate the spatial dimension of spillovers using count data. The method, based on a generalized cross entropy approach, allows us to test spatial auto-correlation. The main result is that when there are local spillovers, their impact on knowledge production is different according to the geographical area and the sector.

spatial auto-correlation; spatial; city level; generalised cross entropy approach; [SHS.ECO] Humanities and Social Sciences/Economics and Finance; knowledge externalities; spatial spillovers; count data; generalized maximum entropy
researchProduct

Detecting clusters in spatially correlated waveforms

2017

Seismic networks often record signals characterized by similar shapes that provide important information according to their geographic positions. We propose an approach to identify homogeneous clusters of seismic waves, combining analysis of waveforms with metadata and spectrogram information. In waveform clustering, cross-correlation measures between signals may present some limitations, so we refer to more recent contributions on data-depth-based clustering analysis. The mechanism for alignment is also an important topic of the analysis: warping (or aligning) procedures identify nuisance effects in phase variation, which, if ignored, may result in a possible loss of information and t…

Seismic waveforms; spatial clustering; functional data analysis; fast Fourier transform; Settore SECS-S/01 - Statistica
researchProduct

Spatial Econometrics and Spatial Data Pooled over Time: Towards an Adapted Modelling Approach

2013

spatial data; Spatial econometrics; [SHS.ECO] Humanities and Social Sciences/Economics and Finance; ComputingMilieux_MISCELLANEOUS
researchProduct

Continuum: A spatiotemporal data model to represent and qualify filiation relationships

2013

This work introduces an ontology-based spatio-temporal data model to represent entities evolving in space and time. A dynamic phenomenon generates a complex relationship network between the entities involved in the process. At the abstract level, the relationships can be identity or topological filiations. The existence of an identity filiation depends on whether the object changes its identity or not. Topological filiations, on the other hand, are based exclusively on the spatial component, as in the case of growth, reduction, merging or splitting. When combining identity and topological filiations, six filiation relationships are obtained, forming a second abstrac…

spatial dynamics; Theoretical computer science; filiation; integrity constraints; Spatio-temporal modeling; spatio-temporal evolution; Computer science; Ontology (information science); Object (computer science); Semantic data model; Consistency (database systems); [INFO.INFO-HC] Computer Science [cs]/Human-Computer Interaction [cs.HC]; Data model; Data integrity; I.2.4 [ARTIFICIAL INTELLIGENCE]: Knowledge Representation Formalisms and Methods - Semantic networks; I.2.3 [ARTIFICIAL INTELLIGENCE]: Deduction and Theorem Proving - Inference engines; Identity (object-oriented programming); semantic; reasoning; Data mining; Semantic Web
researchProduct

Probabilistic and preferential sampling approaches offer integrated perspectives of Italian forest diversity

2023

Aim: Assessing the performances of different sampling approaches for documenting community diversity may help to identify optimal sampling efforts and strategies, and to enhance conservation and monitoring planning. Here, we used two data sets based on probabilistic and preferential sampling schemes of Italian forest vegetation to analyze the multifaceted performances of the two approaches across three major forest types at a large scale. Location: Italy. Methods: We pooled 804 probabilistic and 16,259 preferential forest plots as samples of vascular plant diversity across the country. We balanced the two data sets in terms of sizes, plot size, geographical position, and vegetation types. F…

spatially constrained rarefaction curve; Ecology; co-occurrence data; detrended correspondence analysis; regional survey; temperate forests; vegetation database; indicator species analysis; Plant Science; zonal vegetation; biodiversity; Journal of Vegetation Science
researchProduct

Joint second-order parameter estimation for spatio-temporal log-Gaussian Cox processes

2018

We propose a new fitting method to estimate the set of second-order parameters for the class of homogeneous spatio-temporal log-Gaussian Cox point processes. With simulations, we show that the proposed minimum contrast procedure, based on the spatio-temporal pair correlation function, provides reliable estimates, and we compare the results with the currently available methods. Moreover, the proposed method can be used for both separable and non-separable parametric specifications of the correlation function of the underlying Gaussian random field. We describe earthquake sequences by comparing several Cox model specifications.

spatio-temporal pair correlation function; Environmental Engineering; Gaussian; minimum contrast method; non-separable covariance function; 010502 geochemistry & geophysics; 01 natural sciences; Point process; Gaussian random field; Set (abstract data type); 010104 statistics & probability; Correlation function; Environmental Chemistry; 0101 mathematics; Safety Risk Reliability and Quality; earthquakes; 0105 earth and related environmental sciences; General Environmental Science; Water Science and Technology; Parametric statistics; Mathematics; log-Gaussian Cox processes; Estimation theory; Contrast (statistics); Settore SECS-S/01 - Statistica; Algorithm
researchProduct

The Dawn of the Human-Machine Era: A forecast of new and emerging language technologies

2021

New language technologies are coming, thanks to the huge and competing private investment fuelling rapid progress; we can either understand and foresee their effects, or be taken by surprise and spend our time trying to catch up. This report sketches out some transformative new technologies that are likely to fundamentally change our use of language. Some of these may feel unrealistically futuristic or far-fetched, but a central purpose of this report - and the wider LITHME network - is to illustrate that these are mostly just the logical development and maturation of technologies currently in prototype. But will everyone benefit from all these shiny new gadgets? Throughout this report we …

speaking through technology; machine learning; Computer science; linguistic data; Language technology; human integrated speaking devices; Speech technology; chatbots; Human–machine system; language technologies; Data science
researchProduct