Search results for "Machine learning"
showing 10 items of 1464 documents
Prefiltering for pattern recognition using wavelet transform and neural networks
2003
Publisher Summary Neural networks are built from simple units interlinked by a set of weighted connections. Generally, these units are organized in layers. Each unit of the first layer (input layer) corresponds to a feature of a pattern that is to be analyzed. The units of the last layer (output layer) produce a decision after the propagation of information. Before feeding the computational data to neural networks, the signal must undergo preprocessing in order to (1) define the initial transformation to represent the measured signal, (2) retain important features for class discrimination and discard what is irrelevant, and (3) reduce the volume of data to be processed, for example, data …
Online Web Bot Detection Using a Sequential Classification Approach
2019
A significant problem nowadays is the detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…
Efficient on-the-fly Web bot detection
2021
Abstract A large fraction of traffic on present-day Web servers is generated by bots — intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…
Application of neural network to predict purchases in online store
2016
A key ability of competitive online stores is effective prediction of customers’ purchase intentions, as it makes it possible to apply a personalized service strategy to convert visitors into buyers and increase sales conversion rates. Data mining and artificial intelligence techniques have proven to be successful in classification and prediction tasks in complex real-time systems, like e-commerce sites. In this paper we propose a back-propagation neural network model aimed at predicting purchases in active user sessions in a Web store. The neural network training and evaluation were performed using a set of user sessions reconstructed from server log data. The proposed neural network was abl…
Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach
2020
Abstract Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which rely on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…
Restricted Decontamination for the Imbalanced Training Sample Problem
2003
The problem of imbalanced training data in supervised methods is currently receiving growing attention. Imbalanced data means that one class is much more represented than the others in the training sample. It has been observed that this situation, which arises in several practical domains, may produce an important deterioration of the classification accuracy, in particular with patterns belonging to the less represented classes. In the present paper, we report experimental results that point at the convenience of correctly downsizing the majority class while simultaneously increasing the size of the minority one in order to balance both classes. This is obtained by applying a modification o…
CN2-R: Faster CN2 with randomly generated complexes
2011
Among the rule induction algorithms, the classic CN2 is still one of the most popular; the large number of enhancements and improvements to it bears witness to this. Despite the growth in computing capacity since the algorithm was proposed, resource demand remains one of its main issues. The proposed modification, CN2-R, substitutes the star concept of the original algorithm with a technique of randomly generated complexes in order to substantially improve running times without significant loss in accuracy.
Distance measures for biological sequences: Some recent approaches
2008
Abstract Sequence comparison has become an essential tool in modern molecular biology. In fact, in biomolecular sequences high similarity usually implies significant functional or structural similarity. Traditional approaches use techniques based on sequence alignment, which can measure character-level differences. However, recent developments in whole-genome sequencing technology give rise to the need for similarity measures able to capture the rearrangements involving large segments contained in the sequences. This paper illustrates different methods recently introduced for the alignment-free comparison of biological sequences. The goal of the paper is both to highlight t…
Using self-deferral to achieve fairness between Wi-Fi and NR-U in downlink and uplink scenarios
2022
Wireless networks operating in unlicensed bands generally use one of two channel access paradigms: random access (e.g., Wi-Fi) or scheduled access (e.g., LTE License Assisted Access, LTE LAA and New Radio-Unlicensed, NR-U). The coexistence between these two paradigms is based on listen before talk (LBT), which was, however, designed for random access. Meanwhile, scheduled systems require that their transmissions start at the beginning of a slot boundary. Synchronizing this boundary to the end of LBT usually requires transmitting a reservation signal (RS) to block the channel. Since the RS is a waste of channel resources, we investigate an alternative self-deferral approach (gap-based access…
Connections Between Topology and Macroscopic Mechanical Properties of Three-Dimensional Open-Pore Materials
2018
This work addresses a number of fundamental questions regarding the topological description of materials characterized by a highly porous three-dimensional structure with bending as the major deformation mechanism. Highly efficient finite-element beam models were used for generating data on the mechanical behavior of structures with different topologies, ranging from highly coordinated bcc to Gibson–Ashby structures. Random cutting enabled a continuous modification of average coordination numbers ranging from the maximum connectivity to the percolation-cluster transition of the 3D network. The computed macroscopic mechanical properties–Young's modulus, yield strength, and Poisson's ratio–co…