Search results for "Stream"

showing 10 items of 682 documents

Forest of Normalized Trees: Fast and Accurate Density Estimation of Streaming Data

2018

Density estimation of streaming data is a relevant task in numerous domains. In this paper, a novel non-parametric density estimator called FRONT (forest of normalized trees) is introduced. It uses a structure of multiple normalized trees, segments the feature space of the data stream through a periodically updated linear transformation, and is able to adapt to ever-evolving data streams. FRONT provides accurate density estimation and performs favorably compared to existing online density estimators in terms of the average log score on multiple standard data sets. Its low complexity (linear runtime and constant memory usage) makes FRONT suitable by design for large data streams. Final…

Data stream; Computer science; Data stream mining; Feature vector; Estimator; 02 engineering and technology; Density estimation; 01 natural sciences; Data modeling; 010104 statistics & probability; Kernel (statistics); 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; 0101 mathematics; Random variable; Algorithm; 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)
researchProduct

Prototype-based learning on concept-drifting data streams

2014

Data stream mining has gained growing attention due to its wide range of emerging applications, such as target marketing, email filtering and network intrusion detection. In this paper, we propose a prototype-based classification model for evolving data streams, called SyncStream, which dynamically models time-changing concepts and makes predictions in a local fashion. Instead of learning a single model on a sliding window or using ensemble learning, SyncStream captures evolving concepts by dynamically maintaining a set of prototypes in a new data structure called the P-tree. The prototypes are obtained by error-driven representativeness learning and synchronization-inspired constrained clustering. To ide…

Data stream; Concept drift; business.industry; Computer science; Data stream mining; Constrained clustering; computer.software_genre; Data structure; Machine learning; Ensemble learning; Synchronization (computer science); Data mining; Artificial intelligence; business; computer; Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
researchProduct

New results for finding common neighborhoods in massive graphs in the data stream model

2008

We consider the problem of finding pairs of vertices that share large common neighborhoods in massive graphs. We give lower bounds for randomized, two-sided error algorithms that solve this problem in the data-stream model of computation. Our results correct and improve those of Buchsbaum, Giancarlo, and Westbrook [On finding common neighborhoods in massive graphs, Theoretical Computer Science, 299 (1–3) 707–718 (2004)].

Data stream; Discrete mathematics; General Computer Science; Extremal graph theory; Space lower bounds; Model of computation; Communication complexity; Graph theory; Upper and lower bounds; Theoretical Computer Science; Extremal graph theory; Combinatorics; Graph algorithms for data streams; Algorithms; Theoretical Computer Science; Graph algorithms; Communication complexity; Computer Science(all); Mathematics; Theoretical Computer Science
researchProduct

Quantifying Vegetation Biophysical Variables from Imaging Spectroscopy Data: A Review on Retrieval Methods

2019

An unprecedented spectroscopic data stream will soon become available with forthcoming Earth-observing satellite missions equipped with imaging spectroradiometers. This data stream will open up a vast array of opportunities to quantify a diversity of biochemical and structural vegetation properties. The processing requirements for such large data streams require reliable retrieval techniques enabling the spatiotemporally explicit quantification of biophysical variables. With the aim of preparing for this new era of Earth observation, this review summarizes the state-of-the-art retrieval methods that have been applied in experimental imaging spectroscopy studies inferring all kinds of vegeta…

Data stream; Earth observation; 010504 meteorology & atmospheric sciences; Computer science; UT-Hybrid-D; 010502 geochemistry & geophysics; computer.software_genre; Quantitative Biology - Quantitative Methods; 01 natural sciences; Article; Geochemistry and Petrology; FOS: Electrical engineering electronic engineering information engineering; Quantitative Methods (q-bio.QM); 0105 earth and related environmental sciences; Parametric statistics; Data stream mining; Image and Video Processing (eess.IV); Electrical Engineering and Systems Science - Image and Video Processing; 15. Life on land; 22/4 OA procedure; Regression; Imaging spectroscopy; Geophysics; Spectroradiometer; 13. Climate action; Multicollinearity; FOS: Biological sciences; ITC-ISI-JOURNAL-ARTICLE; Data mining; computer; Surveys in Geophysics
researchProduct

Optical remote sensing and the retrieval of terrestrial vegetation bio-geophysical properties – A review

2015

Forthcoming superspectral satellite missions dedicated to land monitoring, as well as planned imaging spectrometers, will unleash an unprecedented data stream. The processing requirements for such large data streams involve processing techniques enabling the spatio-temporally explicit quantification of vegetation properties. Typically, retrieval must be accurate, robust and fast. Hence, there is a strict requirement to identify next-generation bio-geophysical variable retrieval algorithms which can be molded into an operational processing chain. This paper offers a review of state-of-the-art retrieval methods for quantitative terrestrial bio-geophysical variable extraction using op…

Data stream; Economics; Computer science; Operational variable retrieval; computer.software_genre; Laboratory of Geo-information Science and Remote Sensing; Machine learning; Physical; Laboratorium voor Geo-informatiekunde en Remote Sensing; Bio-geophysical variables; Computers in Earth Sciences; Parametric; Engineering (miscellaneous); Parametric statistics; Remote sensing; Data stream mining; Physics; Transparency (human–computer interaction); Vegetation; PE&RC; Non-parametric; Hybrid; Atomic and Molecular Physics and Optics; Computer Science Applications; Variable (computer science); Satellite; Data mining; Engineering sciences. Technology; Retrievability; computer; ISPRS Journal of Photogrammetry and Remote Sensing
researchProduct

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

2014

The big data trend has forced data-centric systems to handle continuous, fast data streams. In recent years, real-time analytics on stream data has developed into a new research field, which aims to answer queries about "what-is-happening-now" with a negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are utilized. To perform real-time analytics, pre-processing of data should be performed in a way that only a short summary of the stream is stored in main memory. In addition, due to the high speed of arrival, average processing time for each instance of data should be in such a way that…

Data stream; FOS: Computer and information sciences; Computer Science - Computation and Language; Computer science; business.industry; Data stream mining; Sentiment analysis; Big data; Machine Learning (stat.ML); Databases (cs.DB); Data structure; computer.software_genre; Field (computer science); Computer Science - Information Retrieval; Tree (data structure); Computer Science - Databases; Computer Science - Distributed Parallel and Cluster Computing; Analytics; Statistics - Machine Learning; Data mining; Distributed Parallel and Cluster Computing (cs.DC); business; computer; Computation and Language (cs.CL); Information Retrieval (cs.IR)
researchProduct

Online Density Estimation of Heterogeneous Data Streams in Higher Dimensions

2016

The joint density of a data stream is suitable for performing data mining tasks without having access to the original data. However, the methods proposed so far only target a small to medium number of variables, since their estimates rely on representing all the interdependencies between the variables of the data. High-dimensional data streams, which are becoming more and more frequent due to increasing numbers of interconnected devices, are, therefore, pushing these methods to their limits. To mitigate these limitations, we present an approach that projects the original data stream into a vector space and uses a set of representatives to provide an estimate. Due to the structure of the est…

Data stream; Mahalanobis distance; Computer science; Data stream mining; business.industry; 02 engineering and technology; Density estimation; computer.software_genre; Set (abstract data type); Software; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Data mining; business; computer; Curse of dimensionality; Vector space
researchProduct

A Selective Change Driven System for High-Speed Motion Analysis.

2016

Vision-based sensing algorithms are computationally demanding tasks due to the large amount of data acquired and processed. Visual sensors deliver large amounts of data even when those data are redundant and give no additional information. A Selective Change Driven (SCD) sensing system is based on a sensor that delivers, ordered by the magnitude of its change, only those pixels that have changed most since the last read-out. This allows the information stream to be adjusted to the computation capabilities. Following this strategy, a new SCD processing architecture for high-speed motion analysis, based on processing pixels instead of full frames, has been developed and implemented into a Field …

Data stream; Motion analysis; Laser scanning; Computer science; Real-time computing; 02 engineering and technology; lcsh:Chemical technology; 01 natural sciences; Biochemistry; Article; Analytical Chemistry; Informatics; 0202 electrical engineering electronic engineering information engineering; lcsh:TP1-1185; data-flow architecture; Electrical and Electronic Engineering; Image sensor; high-speed visual acquisition; Field-programmable gate array; Instrumentation; Dataflow architecture; Pixel; laser scanning; 020208 electrical & electronic engineering; 010401 analytical chemistry; Frame (networking); Computer architecture; Atomic and Molecular Physics and Optics; 0104 chemical sciences; CMOS image sensor; event-based vision; FPGA system; Sensors (Basel, Switzerland)
researchProduct

Sequential Learning with LS-SVM for Large-Scale Data Sets

2006

We present a subspace-based variant of LS-SVMs (i.e. regularization networks) that sequentially processes the data and is hence especially suited for online learning tasks. The algorithm works by selecting from the data set a small subset of basis functions that is subsequently used to approximate the full kernel on arbitrary points. This subset is identified online from the data stream. We improve upon existing approaches (esp. the kernel recursive least squares algorithm) by proposing a new, supervised criterion for the selection of the relevant basis functions that takes into account the approximation error incurred from approximating the kernel as well as the reduction of the cost in th…

Data stream; Support vector machine; Approximation error; Basis function; Sequence learning; Large scale data; Algorithm; Regularization (mathematics); Subspace topology; Mathematics
researchProduct

Twitter trolls: statistical methods for detecting automatically generated content

2016

The bachelor's thesis "Twitter trolls: statistical methods for detecting automatically generated content" studies and compares the use of the social networking site Twitter among different user groups. The aim of the thesis is to study various methods for detecting automatically generated content on Twitter, as well as for detecting other kinds of suspicious Twitter usage. Using publicly available Twitter user data, simple statistical methods are applied in practice to identify suspicious Twitter usage. As a result of the work, several anomalies were discovered in the Twitter user data, indicating that the statistical methods used could be successful in identifying Twitter trolls.

Computer science; Twitter; trolls; Twitter REST API; statistics; Twitter Streaming API
researchProduct