Search results for " Processing"
showing 10 items of 7549 documents
Online Web Bot Detection Using a Sequential Classification Approach
2019
A significant problem nowadays is detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…
Efficient on-the-fly Web bot detection
2021
A large fraction of traffic on present-day Web servers is generated by bots, intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…
Ontology languages for the semantic web: A never completely updated review
2006
This paper gives a never completely updated account of approaches that have been used by the research community for representing knowledge. After underlining the importance of a layered approach and the use of standards, it starts with early efforts by artificial intelligence researchers. Then recent approaches, aimed mainly at the semantic web, are described. Coding examples from the literature are presented in both sections. Finally, the semantic web ontology creation process, as we envision it, is introduced.
Natural Language Processing Agents and Document Clustering in Knowledge Management
2008
While HTML provides the Web with a standard format for information presentation, XML has been made a standard for information structuring on the Web. The mission of the Semantic Web now is to provide meaning to the Web. Apart from building on the existing Web technologies, we need other tools from other areas of science to do that. This chapter shows how natural language processing methods and technologies, together with ontologies and a neural algorithm, can be used to help in the task of adding meaning to the Web, thus making the Web a better platform for knowledge management in general.
Using association rules to assess purchase probability in online stores
2016
The paper addresses the problem of e-customer behavior characterization based on Web server log data. We describe user sessions with a number of session features and aim to identify the features indicating a high probability of making a purchase for two customer groups: traditional customers and innovative customers. We discuss our approach aimed at assessing a purchase probability in a user session depending on categories of viewed products and session features. We apply association rule mining to real online bookstore data. The results show differences in factors indicating a high purchase probability in a session for the two customer types. The discovered association rules allow us to formu…
Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach
2020
Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which relies on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…
Restricted Decontamination for the Imbalanced Training Sample Problem
2003
The problem of imbalanced training data in supervised methods is currently receiving growing attention. Imbalanced data means that one class is much more represented than the others in the training sample. It has been observed that this situation, which arises in several practical domains, may produce an important deterioration of the classification accuracy, in particular with patterns belonging to the less represented classes. In the present paper, we report experimental results that point at the convenience of correctly downsizing the majority class while simultaneously increasing the size of the minority one in order to balance both classes. This is obtained by applying a modification o…
Characterization of the human visual system threshold performance by a weighting function in the Gabor domain
1997
As evidenced by many physiological and psychophysical reports, the receptive fields of the first-stage set of mechanisms of the visual process fit to two-dimensional (2D) compactly supported harmonic functions. The application of this set of band-pass filter functions to the input signal implies that the visual system carries out some kind of conjoint space/spatial frequency transform. Assuming that a conjoint transform is carried out, we present in this paper a new characterization of the visual system performance by means of a weighting function in the conjoint domain. We have called this weighting function (in the particular case of the Gabor transform) the Gabor stimuli Sensiti…
Fast nonstationary preconditioned iterative methods for ill-posed problems, with application to image deblurring
2013
We introduce a new iterative scheme for solving linear ill-posed problems, similar to nonstationary iterated Tikhonov regularization, but with an approximation of the underlying operator to be used for the Tikhonov equations. For image deblurring problems, such an approximation can be a discrete deconvolution that operates entirely in the Fourier domain. We provide a theoretical analysis of the new scheme, using regularization parameters that are chosen by a certain adaptive strategy. The numerical performance of this method turns out to be superior to state-of-the-art iterative methods, including the conjugate gradient iteration for the normal equation, with and without additional precondi…
Rotationally symmetric 1-harmonic flows from D² to S²: Local well-posedness and finite time blowup
2010
The 1-harmonic flow from the disk to the sphere with constant Dirichlet boundary conditions is analyzed in the case of rotational symmetry. Sufficient conditions on the initial datum are given, such that a unique classical solution exists for short times. Also, a sharp criterion on the boundary condition is identified, such that any classical solution will blow up in finite time. Finally, nongeneric examples of finite time blowup are exhibited for any boundary condition.