0000000000073540

AUTHOR

Hugo Lewi Hammer

Higher-Fidelity Frugal and Accurate Quantile Estimation Using a Novel Incremental <italic>Discretized</italic> Paradigm

Traditional pattern classification works with the moments of the distributions of the features and involves the estimation of the means and variances. As opposed to this, more recently, research has indicated the power of using the quantiles of the distributions because they are more robust and applicable for non-parametric methods. The estimation of the quantiles is even more pertinent when one is mining data streams. However, the complexity of quantile estimation is much higher than the corresponding estimation of the mean and variance, and this increased complexity is more relevant as the size of the data increases. Clearly, in the context of infinite data streams, a computational and sp…

research product

Improving Classification of Tweets Using Linguistic Information from a Large External Corpus

The bag of words representation of documents is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. Improvements might be achieved by expanding the vocabulary with other relevant word, like synonyms.

research product

On the Classification of Dynamical Data Streams Using Novel “Anti–Bayesian” Techniques

The classification of dynamical data streams is among the most complex problems encountered in classification. This is, firstly, because the distribution of the data streams is non-stationary, and it changes without any prior “warning”. Secondly, the manner in which it changes is also unknown. Thirdly, and more interestingly, the model operates with the assumption that the correct classes of previously-classified patterns become available at a juncture after their appearance. This paper pioneers the use of unreported novel schemes that can classify such dynamical data streams by invoking the recently-introduced “Anti- Bayesian” (AB) techniques. Contrary to the Bayesian paradigm, that compar…

research product

On the classification of dynamical data streams using novel “Anti-Bayesian” techniques

Abstract The classification of dynamical data streams is among the most complex problems encountered in classification. This is, firstly, because the distribution of the data streams is non-stationary, and it changes without any prior “warning”. Secondly, the manner in which it changes is also unknown. Thirdly, and more interestingly, the model operates with the assumption that the correct classes of previously-classified patterns become available at a juncture after their appearance. This paper pioneers the use of unreported novel schemes that can classify such dynamical data streams by invoking the recently-introduced “Anti-Bayesian” (AB) techniques. Contrary to the Bayesian paradigm, tha…

research product

Balanced difficulty task finder: an adaptive recommendation method for learning tasks based on the concept of state of flow

An adaptive task difficulty assignment method which we reckon as balanced difficulty task finder (BDTF) is proposed in this paper. The aim is to recommend tasks to a learner using a trade-off between skills of the learner and difficulty of the tasks such that the learner experiences a state of flow during the learning. Flow is a mental state that psychologists refer to when someone is completely immersed in an activity. Flow state is a multidisciplinary field of research and has been studied not only in psychology, but also neuroscience, education, sport, and games. The idea behind this paper is to try to achieve a flow state in a similar way as Elo’s chess skill rating (Glickman in Am Ches…

research product

A Novel Clustering Algorithm based on a Non-parametric "Anti-Bayesian" Paradigm

The problem of clustering, or unsupervised classification, has been solved by a myriad of techniques, all of which depend, either directly or implicitly, on the Bayesian principle of optimal classification. To be more specific, within a Bayesian paradigm, if one is to compare the testing sample with only a single point in the feature space from each class, the optimal Bayesian strategy would be to achieve this based on the distance from the corresponding means or central points in the respective distributions. When this principle is applied in clustering, one would assign an unassigned sample into the cluster whose mean is the closest, and this can be done in either a bottom-up or a top-dow…

research product

Higher-Fidelity Frugal and Accurate Quantile Estimation Using a Novel Incremental Discretized Paradigm

Nivå1

research product

Mitigating DDoS using weight‐based geographical clustering

Distributed denial of service (DDoS) attacks have for the last two decades been among the greatest threats facing the internet infrastructure. Mitigating DDoS attacks is a particularly challenging task as an attacker tries to conceal a huge amount of traffic inside a legitimate traffic flow. This article proposes to use data mining approaches to find unique hidden data structures which are able to characterize the normal traffic flow. This will serve as a mean for filtering illegitimate traffic under DDoS attacks. In this endeavor, we devise three algorithms built on previously uncharted areas within mitigation techniques where clustering techniques are used to create geographical clusters …

research product

On using novel “Anti-Bayesian” techniques for the classification of dynamical data streams

The classification of dynamical data streams is among the most complex problems encountered in classification. This is, firstly, because the distribution of the data streams is non-stationary, and it changes without any prior “warning”. Secondly, the manner in which it changes is also unknown. Thirdly, and more interestingly, the model operates with the assumption that the correct classes of previously-classified patterns become available at a juncture after their appearance. This paper pioneers the use of unreported novel schemes that can classify such dynamical data streams by invoking the recently-introduced “Anti-Bayesian” (AB) techniques. Contrary to the Bayesian paradigm, that compare…

research product

Achieving Fair Load Balancing by Invoking a Learning Automata-Based Two-Time-Scale Separation Paradigm.

Author's accepted manuscript. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. In this article, we consider the problem of load balancing (LB), but, unlike the approaches that have been proposed earlier, we attempt to resolve the problem in a fair manner (or rather, it would probably be more appropriate to describe it as an ε-fair manner because, although the LB…

research product

“Anti-Bayesian” flat and hierarchical clustering using symmetric quantiloids

A myriad of works has been published for achieving data clustering based on the Bayesian paradigm, where the clustering sometimes resorts to Naive-Bayes decisions. Within the domain of clustering, the Bayesian principle corresponds to assigning the unlabelled samples to the cluster whose mean (or centroid) is the closest. Recently, Oommen and his co-authors have proposed a novel, counter-intuitive and pioneering PR scheme that is radically opposed to the Bayesian principle. The rational for this paradigm, referred to as the “Anti-Bayesian” (AB) paradigm, involves classification based on the non-central quantiles of the distributions. The first-reported work to achieve clustering using the A…

research product