Search results for "stream"
showing 10 items of 205 documents
Structural clustering of millions of molecular graphs
2014
We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online, instance-incremental clustering algorithm that delays the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…
The Argument Dependency Model
2015
This chapter summarizes the architecture of the extended Argument Dependency Model (eADM), a model of language comprehension that aspires toward neurobiological plausibility. It combines design principles from neurobiology with insights on cross-linguistic diversity. Like other current models, the eADM posits that auditory language processing proceeds along two distinct streams in the brain emanating from auditory cortex: the antero-ventral and postero-dorsal streams. Both streams are organized hierarchically and information processing takes place in a cascaded fashion. Each stream has functionally unified computational properties congruent with its role in primate audition. While the dorsa…
On the Online Classification of Data Streams Using Weak Estimators
2016
In this paper, we propose a novel online classifier for complex data streams which are generated from non-stationary stochastic properties. Instead of using a single training model and counters to keep important data statistics, the introduced online classifier scheme provides a real-time self-adjusting learning model. The learning model utilizes the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE) at each time instant as a new labeled instance arrives. In this way, the data statistics are updated every time a new element is inserted, without requiring the model to be rebuilt when changes occur in the data distributions. Finally, and most impo…
Moving Learning Machine Towards Fast Real-Time Applications: A High-Speed FPGA-based Implementation of the OS-ELM Training Algorithm
2018
Several emerging online learning applications handle data streams in real time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been successfully used in real-time condition prediction applications because of its good generalization performance at an extreme learning speed, but the number of trainings per second (training frequency) achieved in these continuous learning applications has to be further improved. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm when it is applied to real-time applications. In this case, the natural way of feeding the training of the neural network is one-by-one, i.e., training the neur…
Efficient anomaly detection on sampled data streams with contaminated phase I data
2020
Control chart algorithms aim to monitor a process over time. This process consists of two phases. Phase I, also called the learning phase, estimates the normal process parameters, then in Phase II, anomalies are detected. However, the learning phase itself can contain contaminated data such as outliers. If left undetected, they can jeopardize the accuracy of the whole chart by affecting the computed parameters, which leads to faulty classifications and defective data analysis results. This problem becomes more severe when the analysis is done on a sample of the data rather than the whole data. To avoid such a situation, Phase I quality must be guaranteed. The purpose…
Clustering categorical data: A stability analysis framework
2011
Clustering to identify inherent structure is an important first step in data exploration. The k-means algorithm is a popular choice, but k-means is not generally appropriate for categorical data. A specific extension of k-means for categorical data is the k-modes algorithm. Both of these partition clustering methods are sensitive to the initialization of prototypes, which creates the difficulty of selecting the best solution for a given problem. In addition, selecting the number of clusters can be an issue. Further, the k-modes method is especially prone to instability when presented with ‘noisy’ data, since the calculation of the mode lacks the smoothing effect inherent in the calculation …
Managing sensor data streams in a smart home application
2020
A challenge in developing an ambient activity recognition system for use in elder care is finding a balance between the sophistication of the system and a cost structure that fits within the budgets of public and private sector healthcare organisations. Much activity recognition research in the context of elder care is based on dense networks of sensors and advanced methods, such as supervised machine learning algorithms. This paper presents the data processing aspects of an activity recognition system based on a simpler, knowledge-based unsupervised approach, designed for a sparse network of sensors. By structuring sensor data management as a streaming system, we provide a simple programmi…
Tropical–extratropical interactions related to upper-level troughs at low latitudes
2007
Momentum and kinetic energy fluxes associated with low-latitude transient disturbances at upper-levels play an important role in the general circulation of the atmosphere. They are related to eastward and equatorward propagating, positively tilted wave trains from the extratropics. Theoretical, modelling and observational studies show that this particular kind of tropical–extratropical interaction is most common in regions of mean upper-level westerlies at low latitudes, i.e. over the central and eastern Pacific and Atlantic Oceans during boreal winter and spring. The penetration of an upper-level trough into the Tropics is often associated with enhanced convection and the formatio…
WDM switching employing a hybrid silicon-plasmonic A-MZI
2012
We demonstrate a system-level evaluation of an A-MZI with 60 μm long DLSPP active branches exhibiting more than 14 dB extinction ratio. Error-free switching operation is achieved for a 4×10 Gb/s incoming WDM data stream with only 13.1 mW power consumption.
Integrating LSTMs with Online Density Estimation for the Probabilistic Forecast of Energy Consumption
2019
In machine learning applications in the energy sector, it is often necessary to have both highly accurate predictions and information about the probabilities that certain scenarios occur. We address this challenge by integrating and combining long short-term memory networks (LSTMs) and online density estimation into a real-time data streaming architecture of an energy trader. The online density estimation is done in the MiDEO framework, which estimates joint densities of data streams based on ensembles of chains of Hoeffding trees. One attractive feature of the solution is that queries can be sent to what we call forecast-based point density estimators (FPDE) to derive information from …