Search results for "machine learning."
Showing 10 of 1455 documents
ELM Regularized Method for Classification Problems
2016
Extreme Learning Machine (ELM) is a recently proposed algorithm that is efficient and fast at learning the parameters of single-layer neural structures. One of the main problems of this algorithm is choosing the optimal architecture for a given problem. To overcome this limitation, several solutions have been proposed in the literature, including the regularization of the structure. However, to the best of our knowledge, there are no works where such adjustment is applied to classification problems in the presence of a non-linearity in the output; all published works tackle modelling or regression problems. Our proposal has been applied to a series of standard databases for the evaluation o…
Comparison of machine learning models for gully erosion susceptibility mapping
2020
© 2019 China University of Geosciences (Beijing) and Peking University. Gully erosion is a disruptive phenomenon which extensively affects the Iranian territory, especially in the Northern provinces. A number of studies have been recently undertaken to study this process and to predict it over space and ultimately, in a broader national effort, to limit its negative effects on local communities. We focused on the Bastam watershed where 9.3% of its surface is currently affected by gullying. Machine learning algorithms are currently under the magnifying glass across the geomorphological community for their high predictive ability. However, unlike the bivariate statistical models, their structu…
Prefiltering for pattern recognition using wavelet transform and neural networks
2003
Neural networks are built from simple units interlinked by a set of weighted connections. Generally, these units are organized in layers. Each unit of the first layer (input layer) corresponds to a feature of a pattern that is to be analyzed. The units of the last layer (output layer) produce a decision after the propagation of information. Before the data are fed to a neural network, the signal must undergo preprocessing in order to (1) define the initial transformation to represent the measured signal, (2) retain important features for class discrimination and discard what is irrelevant, and (3) reduce the volume of data to be processed, for example, data …
Online Web Bot Detection Using a Sequential Classification Approach
2019
A significant problem nowadays is the detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…
Efficient on-the-fly Web bot detection
2021
A large fraction of traffic on present-day Web servers is generated by bots, intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…
Application of neural network to predict purchases in online store
2016
A key ability of competitive online stores is effective prediction of customers’ purchase intentions, as it makes it possible to apply a personalized service strategy to convert visitors into buyers and increase sales conversion rates. Data mining and artificial intelligence techniques have proven to be successful in classification and prediction tasks in complex real-time systems, like e-commerce sites. In this paper we propose a back-propagation neural network model aimed at predicting purchases in active user sessions in a Web store. The neural network training and evaluation were performed using a set of user sessions reconstructed from server log data. The proposed neural network was abl…
Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach
2020
Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which relies on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…
Restricted Decontamination for the Imbalanced Training Sample Problem
2003
The problem of imbalanced training data in supervised methods is currently receiving growing attention. Imbalanced data means that one class is much more represented than the others in the training sample. It has been observed that this situation, which arises in several practical domains, may produce an important deterioration of the classification accuracy, in particular with patterns belonging to the less represented classes. In the present paper, we report experimental results that point at the convenience of correctly downsizing the majority class while simultaneously increasing the size of the minority one in order to balance both classes. This is obtained by applying a modification o…
CN2-R: Faster CN2 with randomly generated complexes
2011
Among the rule induction algorithms, the classic CN2 is still one of the most popular; the large number of enhancements and improvements made to it bears witness to this. Despite the growth in computing capacity since the algorithm was proposed, one of its main issues remains resource demand. The proposed modification, CN2-R, substitutes the star concept of the original algorithm with a technique of randomly generated complexes in order to substantially improve running times without significant loss in accuracy.
Distance measures for biological sequences: Some recent approaches
2008
Sequence comparison has become an essential tool in modern molecular biology. In fact, in biomolecular sequences high similarity usually implies significant functional or structural similarity. Traditional approaches use techniques based on sequence alignment that are able to measure character-level differences. However, recent developments in whole-genome sequencing technology give rise to the need for similarity measures able to capture rearrangements involving large segments contained in the sequences. This paper is devoted to illustrating different methods recently introduced for the alignment-free comparison of biological sequences. The goal of the paper is both to highlight t…