Search results for "artificial intelligence"
showing 10 items of 6122 documents
HTTP-level e-commerce data based on server access logs for an online store
2020
Web server logs have been extensively used as a source of data on the characteristics of Web traffic and users’ navigational patterns. In particular, Web bot detection and online purchase prediction using methods from artificial intelligence (AI) are currently key areas of research. However, in reality, it is hard to obtain logs from actual online stores and there is no common dataset that can be used across different studies. Moreover, there is a lack of studies exploring Web traffic over a longer period of time, due to the unavailability of long-term data from server logs. The need to develop reliable models of Web traffic, Web user navigation, and e-customer behaviour calls for …
Modeling a non-stationary bots’ arrival process at an e-commerce Web site
2017
The paper concerns the issue of modeling and generating a representative Web workload for Web server performance evaluation through simulation experiments. Web traffic analysis has been carried out for two decades, usually based on Web server log data. However, while the character of the overall Web traffic has been extensively studied and modeled, relatively few studies have been devoted to the analysis of Web traffic generated by Internet robots (Web bots). Moreover, the overwhelming majority of studies concern the traffic on non-e-commerce websites. In this paper we address the problem of modeling a realistic arrival process of bots’ requests on an e-commerce Web server. Based on real…
Online Web Bot Detection Using a Sequential Classification Approach
2019
A significant problem nowadays is the detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…
Efficient on-the-fly Web bot detection
2021
A large fraction of traffic on present-day Web servers is generated by bots, intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…
Ontology languages for the semantic web: A never completely updated review
2006
This paper gives a never completely updated account of approaches that have been used by the research community for representing knowledge. After underlining the importance of a layered approach and the use of standards, it starts with early efforts by artificial intelligence researchers. Then recent approaches, aimed mainly at the semantic web, are described. Coding examples from the literature are presented in both sections. Finally, the semantic web ontology creation process, as we envision it, is introduced.
Natural Language Processing Agents and Document Clustering in Knowledge Management
2008
While HTML provides the Web with a standard format for information presentation, XML has been made a standard for information structuring on the Web. The mission of the Semantic Web now is to provide meaning to the Web. Apart from building on the existing Web technologies, we need other tools from other areas of science to do that. This chapter shows how natural language processing methods and technologies, together with ontologies and a neural algorithm, can be used to help in the task of adding meaning to the Web, thus making the Web a better platform for knowledge management in general.
Application of neural network to predict purchases in online store
2016
A key ability of competitive online stores is effective prediction of customers’ purchase intentions, as it makes it possible to apply a personalized service strategy to convert visitors into buyers and increase sales conversion rates. Data mining and artificial intelligence techniques have proven to be successful in classification and prediction tasks in complex real-time systems, like e-commerce sites. In this paper we propose a back-propagation neural network model aimed at predicting purchases in active user sessions in a Web store. The neural network training and evaluation were performed using a set of user sessions reconstructed from server log data. The proposed neural network was abl…
Using association rules to assess purchase probability in online stores
2016
The paper addresses the problem of e-customer behavior characterization based on Web server log data. We describe user sessions with a number of session features and aim to identify the features indicating a high probability of making a purchase for two customer groups: traditional customers and innovative customers. We discuss our approach aimed at assessing the purchase probability in a user session depending on categories of viewed products and session features. We apply association rule mining to real online bookstore data. The results show differences in factors indicating a high purchase probability in a session for both customer types. The discovered association rules allow us to formu…
Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach
2020
Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which relies on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…
Restricted Decontamination for the Imbalanced Training Sample Problem
2003
The problem of imbalanced training data in supervised methods is currently receiving growing attention. Imbalanced data means that one class is much more heavily represented than the others in the training sample. It has been observed that this situation, which arises in several practical domains, may produce a significant deterioration of the classification accuracy, in particular for patterns belonging to the less represented classes. In the present paper, we report experimental results that point to the benefit of correctly downsizing the majority class while simultaneously increasing the size of the minority one in order to balance both classes. This is obtained by applying a modification o…