Search results for "Data stream"

Showing 10 of 50 documents

Machine learning information fusion in Earth observation: A comprehensive review of methods, applications and data sources

2020

This paper reviews the most important information fusion data-driven algorithms based on Machine Learning (ML) techniques for problems in Earth observation. Nowadays we observe and model the Earth with a wealth of observations, from a plethora of different sensors, measuring states, fluxes, processes and variables, at unprecedented spatial and temporal resolutions. Earth observation is well equipped with remote sensing systems, mounted on satellites and airborne platforms, but it also involves in-situ observations, numerical models and social media data streams, among other data sources. Data-driven approaches, and ML techniques in particular, are the natural choice to extract significant i…

Keywords: Earth observation; machine learning; computer vision; data stream mining; sensor fusion; information fusion; numerical models; signal processing; artificial intelligence; hardware and architecture
Published in: Information Fusion

A two-armed bandit collective for hierarchical examplar based mining of frequent itemsets with applications to intrusion detection

2014

Published version of a chapter in the book: Transactions on Computational Collective Intelligence XIV. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-662-44509-9_1

In this paper we address the above problem by posing frequent item-set mining as a collection of interrelated two-armed bandit problems. We seek to find itemsets that frequently appear as subsets in a stream of itemsets, with the frequency being constrained to support granularity requirements. Starting from a randomly or manually selected examplar itemset, a collective of Tsetlin automata based two-armed bandit players - one automaton for each item in the examplar - learns which items should be included in …
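The bandit formulation summarized above lends itself to a compact sketch: one two-action automaton per item decides include/exclude, and an automaton is rewarded when forcing its item into the current candidate keeps the itemset frequent in the stream. This is a simplified illustration of the idea, not the paper's exact Tsetlin-automaton scheme; the state count, the feedback rule, and the full-pass support check are assumptions made here for brevity.

```python
import random

random.seed(0)

N = 6  # states per action


class TwoActionAutomaton:
    """Tsetlin-style two-armed bandit player: states 1..N mean 'exclude',
    states N+1..2N mean 'include' the item in the candidate itemset."""

    def __init__(self):
        self.state = random.choice([N, N + 1])  # start at the boundary

    def action(self):
        return "include" if self.state > N else "exclude"

    def reward(self):
        # reinforce the current action by moving deeper into its half
        self.state = min(self.state + 1, 2 * N) if self.state > N else max(self.state - 1, 1)

    def penalize(self):
        # drift one step toward the opposite action
        self.state += -1 if self.state > N else 1


def mine(stream, items, min_support, rounds=500):
    automata = {i: TwoActionAutomaton() for i in items}
    for _ in range(rounds):
        candidate = {i for i, a in automata.items() if a.action() == "include"}
        for i, a in automata.items():
            # would the candidate stay frequent with item i forced in?
            trial = candidate | {i}
            support = sum(trial <= t for t in stream) / len(stream)
            frequent = support >= min_support
            if a.action() == "include":
                a.reward() if frequent else a.penalize()
            else:
                a.penalize() if frequent else a.reward()
    return {i for i, a in automata.items() if a.action() == "include"}


# toy stream: items a and b co-occur in 80% of itemsets, c is rare
stream = [{"a", "b"}] * 8 + [{"c"}] * 2
result = mine(stream, ["a", "b", "c"], min_support=0.5)
```

Each automaton only ever moves one state per round, so a few spurious penalties cannot flip a well-reinforced decision; this is the noise tolerance the bandit framing buys.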

Keywords: finite-state machine; computational complexity theory; data stream mining; nearest neighbor search; search engine indexing; intrusion detection system; cardinality; anomaly detection; data mining

DeCyMo: Decentralized Cyber-physical System for Monitoring and Controlling Industries and Homes

2018

The recent revolution of the Internet of Things has given birth to a series of new technologies and cyber-physical systems to be used in industrial and home scenarios. Cyber-physical systems include physical and software components for providing smart monitoring and control with flexibility and adaptability to the operating context. The IoT paradigm enables the intertwined use of physical and software components through the interconnection of devices that exchange data with each other without direct human interaction in several fields, especially in industrial and home environments. We propose DeCyMo, a decentralized architecture that aims at solving common IoT issues and vulnerabiliti…

Keywords: cyber-physical system; blockchain; IoT; data stream; flexibility; adaptability; extensibility; component-based software engineering; computer security; emerging technologies

Scalable Clustering by Iterative Partitioning and Point Attractor Representation

2016

Clustering very large datasets while preserving cluster quality remains a challenging data-mining task to date. In this paper, we propose an effective scalable clustering algorithm for large datasets that builds upon the concept of synchronization. Inherited from the powerful concept of synchronization, the proposed algorithm, CIPA (Clustering by Iterative Partitioning and Point Attractor Representations), is capable of handling very large datasets by iteratively partitioning them into thousands of subsets and clustering each subset separately. Using dynamic clustering by synchronization, each subset is then represented by a set of point attractors and outliers. Finally, CIPA identifies the…
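The partition-then-summarize strategy described above can be illustrated with a small sketch: split the data into chunks, cluster each chunk, keep the local centers as stand-ins for the paper's point attractors, and then cluster the pooled centers. Plain k-means replaces the synchronization-based dynamic clustering here, so this is only a structural sketch of CIPA's pipeline, not the algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)


def kmeans(X, k, iters=50):
    """Minimal Lloyd's k-means; returns (centers, labels)."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels


def cipa_sketch(X, n_parts=10, k_local=5, k_global=2):
    # 1) partition the data into subsets,
    # 2) cluster each subset and keep its centers as "point attractors",
    # 3) cluster the pooled attractors globally.
    attractors = []
    for part in np.array_split(X, n_parts):
        centers, _ = kmeans(part, k_local)
        attractors.append(centers)
    attractors = np.vstack(attractors)
    global_centers, _ = kmeans(attractors, k_global)
    # final assignment: every point to its nearest global center
    d = np.linalg.norm(X[:, None] - global_centers[None], axis=2)
    return global_centers, d.argmin(axis=1)


# two well-separated Gaussian blobs
X = np.vstack([rng.normal(0, 0.3, (500, 2)), rng.normal(5, 0.3, (500, 2))])
centers, labels = cipa_sketch(X)
```

The payoff is in step 3: the global clustering only ever sees `n_parts * k_local` summary points, however large the original dataset is.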

Keywords: fuzzy clustering; single-linkage clustering; correlation clustering; constrained clustering; data stream clustering; CURE clustering algorithm; canopy clustering algorithm; data mining; cluster analysis
Published in: ACM Transactions on Knowledge Discovery from Data

A Novel Clustering Algorithm based on a Non-parametric "Anti-Bayesian" Paradigm

2015

The problem of clustering, or unsupervised classification, has been solved by a myriad of techniques, all of which depend, either directly or implicitly, on the Bayesian principle of optimal classification. To be more specific, within a Bayesian paradigm, if one is to compare the testing sample with only a single point in the feature space from each class, the optimal Bayesian strategy would be to achieve this based on the distance from the corresponding means or central points in the respective distributions. When this principle is applied in clustering, one would assign an unassigned sample into the cluster whose mean is the closest, and this can be done in either a bottom-up or a top-dow…
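The nearest-mean assignment rule described above, and a quantile-based alternative in the "anti-Bayesian" spirit, can be contrasted on 1-D data. The 25th/75th percentile pair below is an assumed choice for illustration; the paper's actual order statistics may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# two 1-D clusters
c0 = rng.normal(0.0, 1.0, 300)
c1 = rng.normal(4.0, 1.0, 300)
clusters = [c0, c1]


def assign_by_mean(x, clusters):
    """Classical rule: assign to the cluster with the nearest mean."""
    return int(np.argmin([abs(x - c.mean()) for c in clusters]))


def assign_by_quantiles(x, clusters, q=25):
    """'Anti-Bayesian' flavour: compare against a symmetric pair of
    quantiles (here the 25th/75th percentiles) instead of the mean."""
    dists = []
    for c in clusters:
        lo, hi = np.percentile(c, [q, 100 - q])
        dists.append(min(abs(x - lo), abs(x - hi)))
    return int(np.argmin(dists))
```

For well-separated clusters the two rules mostly agree; the quantile rule differs near cluster boundaries and is less sensitive to heavy tails, which is the robustness argument behind the non-parametric paradigm.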

Keywords: fuzzy clustering; correlation clustering; constrained clustering; data stream clustering; CURE clustering algorithm; canopy clustering algorithm; affinity propagation; pattern recognition; artificial intelligence; data mining; cluster analysis

A critical review on the implementation of static data sampling techniques to detect network attacks

2021

Given that the Internet traffic speed and volume are growing at a rapid pace, monitoring the network in a real-time manner has introduced several issues in terms of computing and storage capabilities. Fast processing of traffic data and early warnings on the detected attacks are required while maintaining a single pass over the traffic measurements. To palliate these problems, one can reduce the amount of traffic to be processed by using a sampling technique and detect the attacks based on the sampled traffic. Different parameters have an impact on the efficiency of this process, mainly, the applied sampling policy and sampling ratio. In this paper, we investigate th…
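The interplay of sampling policy and sampling ratio can be made concrete with two standard policies, systematic and random sampling, applied to a toy packet stream. The SYN-rate "detector" here is a hypothetical stand-in for a real intrusion detection system, used only to show that a statistic estimated from the sample can track the full stream.

```python
import random

random.seed(42)


def systematic_sample(packets, ratio):
    """Keep every k-th packet, k = 1/ratio (deterministic, single pass)."""
    k = int(1 / ratio)
    return packets[::k]


def random_sample(packets, ratio):
    """Keep each packet independently with probability `ratio`."""
    return [p for p in packets if random.random() < ratio]


def syn_rate(sample):
    """Fraction of SYN packets - a toy attack indicator."""
    return sample.count("syn") / max(len(sample), 1)


# toy stream: an attack phase raises the fraction of SYN packets
normal = ["data"] * 90 + ["syn"] * 10
attack = ["data"] * 40 + ["syn"] * 60
random.shuffle(normal)
random.shuffle(attack)
stream = normal + attack

# estimate the SYN rate from only 10% of the traffic
sampled = systematic_sample(stream, 0.1)
```

The smaller the ratio, the noisier the estimate; the choice of policy matters too, since systematic sampling can miss periodic attack traffic that random sampling catches on average, which is exactly the trade-off the review examines.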

Keywords: intrusion detection system (IDS); data streams; data sampling; sampling (statistics); statistical analysis; static data; Internet traffic; real-time computing

Higher-Fidelity Frugal and Accurate Quantile Estimation Using a Novel Incremental Discretized Paradigm

2018

Traditional pattern classification works with the moments of the distributions of the features and involves the estimation of the means and variances. As opposed to this, more recently, research has indicated the power of using the quantiles of the distributions because they are more robust and applicable for non-parametric methods. The estimation of the quantiles is even more pertinent when one is mining data streams. However, the complexity of quantile estimation is much higher than the corresponding estimation of the mean and variance, and this increased complexity is more relevant as the size of the data increases. Clearly, in the context of infinite data streams, a computational and sp…
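The kind of constant-memory quantile estimator the abstract motivates can be sketched with the well-known Frugal-1U update rule, a related but distinct algorithm from the paper's discretized estimator: keep a single running value and nudge it up or down with probabilities tied to the target quantile.

```python
import random

random.seed(7)


def frugal_quantile(stream, q, step=1.0):
    """Frugal-1U-style streaming estimate of the q-quantile in O(1) memory:
    move the estimate up with probability q when a larger item arrives,
    down with probability 1-q when a smaller one does."""
    m = 0.0
    for x in stream:
        r = random.random()
        if x > m and r < q:
            m += step
        elif x < m and r < 1 - q:
            m -= step
    return m


# the median of uniform(0, 100) samples should settle near 50
data = [random.uniform(0, 100) for _ in range(20000)]
est = frugal_quantile(data, q=0.5)
```

At equilibrium the expected upward and downward nudges balance exactly when the fraction of larger items is 1-q, i.e. when `m` sits at the q-quantile; the price of the single-variable state is a random-walk fluctuation around that point, which is the accuracy-versus-memory trade-off the paper addresses.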

Keywords: discretization; learning automata; data stream mining; estimator; quantile; robustness
Published in: IEEE Access

EUDAQ - A Data Acquisition Software Framework for Common Beam Telescopes

2019

EUDAQ is a generic data acquisition software developed for use in conjunction with common beam telescopes at charged particle beam lines. Providing high-precision reference tracks for performance studies of new sensors, beam telescopes are essential for the research and development towards future detectors for high-energy physics. As beam time is a highly limited resource, EUDAQ has been designed with reliability and ease-of-use in mind. It enables flexible integration of different independent devices under test via their specific data acquisition systems into a top-level framework. EUDAQ controls all components globally, handles the data flow centrally and synchronises and records the data…

Keywords: data acquisition; beam telescopes; charged particle beams; detector control systems; particle tracking detectors; calorimeters; instrumentation and detectors; data management; data flow; high energy physics experiment

Application of dictionary learning to denoise LIGO’s blip noise transients

2020

Data streams of gravitational-wave detectors are polluted by transient noise features, or "glitches," of instrumental and environmental origin. In this work we investigate the use of total variation methods and learned dictionaries to mitigate the effect of those transients in the data. We focus on a specific type of transient, "blip" glitches, as this is the most common type of glitch present in the LIGO detectors and their waveforms are easy to identify. We randomly select 100 blip glitches scattered in the data from advanced LIGO's O1 run, as provided by the citizen-science project Gravity Spy. Our results show that dictionary-learning methods are a valid approach to model and subtrac…
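Of the two approaches mentioned, total-variation denoising is easy to sketch in 1-D: minimize a data-fidelity term plus a penalty on the signal's total variation. The subgradient-descent solver and the parameter values below are illustrative assumptions, not the paper's method or tuning.

```python
import numpy as np

rng = np.random.default_rng(3)


def tv_denoise(y, lam=0.5, step=0.1, iters=500):
    """1-D total-variation denoising by (sub)gradient descent on
    0.5 * ||u - y||^2 + lam * sum_i |u[i+1] - u[i]|."""
    u = y.copy()
    for _ in range(iters):
        grad = u - y                 # gradient of the fidelity term
        d = np.sign(np.diff(u))      # d[i] = sign(u[i+1] - u[i])
        grad[:-1] -= lam * d         # dTV/du[i] contributes -d[i] ...
        grad[1:] += lam * d          # ... and +d[i-1]
        u -= step * grad
    return u


# a clean step signal with Gaussian noise on top
clean = np.concatenate([np.zeros(100), np.ones(100)])
noisy = clean + rng.normal(0, 0.3, 200)
denoised = tv_denoise(noisy)
```

TV regularization suppresses small oscillations while preserving sharp edges, which is why it is attractive for removing short transients without washing out genuine signal features.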

Keywords: LIGO; glitch; transient noise; data stream mining; general relativity and quantum cosmology; astrophysics instrumentation and methods
Published in: Physical Review D

Grain—A Java data analysis system for Total Data Readout

2008

Grain is a data analysis system developed to be used with the novel Total Data Readout data acquisition system. In Total Data Readout all the electronics channels are read out asynchronously in singles mode and each data item is timestamped. Event building and analysis have to be done entirely in the software post-processing the data stream. A flexible and efficient event parser and the accompanying software system have been written entirely in Java. The design and implementation of the software are discussed along with experiences gained in running real-life experiments.
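The software event building described above can be sketched (in Python rather than Grain's Java, for brevity): walk the timestamped stream in order and close an event whenever the next item falls outside a coincidence window anchored at the event's first timestamp. This window policy is an assumption for illustration; real trigger logic is more involved.

```python
def build_events(items, window):
    """Group timestamped singles into events: items whose timestamps lie
    within `window` of the current event's first item are merged.
    `items` must be sorted by timestamp, as in a Total Data Readout stream."""
    events = []
    current = []
    for ts, payload in items:
        if current and ts - current[0][0] > window:
            events.append(current)   # gap too large: close the event
            current = []
        current.append((ts, payload))
    if current:
        events.append(current)
    return events


# three bursts of channel hits separated by quiet gaps (hypothetical data)
stream = [(0, "ch1"), (2, "ch2"), (50, "ch3"), (51, "ch1"), (120, "ch2")]
events = build_events(stream, window=10)
```

Because the raw stream is timestamped singles, the same data can be re-built into events with a different window offline, which is the flexibility that triggerless readout buys.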

Keywords: data stream; data processing; parsing; Java; event (computing); data acquisition; software system; instrumentation; nuclear and high energy physics
Published in: Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment