Search results for "Web server"
showing 10 items of 45 documents
Efficient on-the-fly Web bot detection
2021
A large fraction of traffic on present-day Web servers is generated by bots, intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…
Feature selection: A multi-objective stochastic optimization approach
2020
The feature subset task can be cast as a multiobjective discrete optimization problem. In this work, we study the search algorithm component of a feature subset selection method. We propose an algorithm based on the threshold accepting method, extended to the multi-objective framework by an appropriate definition of the acceptance rule. The method is used in the task of identifying relevant subsets of features in a Web bot recognition problem, where automated software agents on the Web are identified by analyzing the stream of HTTP requests to a Web server.
Verification of Web traffic burstiness and self-similarity for multiple online stores
2017
Developing realistic Web traffic models is essential for a reliable Web server performance evaluation. Very significant Web traffic properties that have been identified so far include burstiness and self-similarity. Very few relevant studies have been devoted to e-commerce traffic, however. In this paper, we investigate burstiness and self-similarity factors for seven different online stores using their access log data. Our findings show that both features are present in all the analyzed e-commerce datasets. Furthermore, a strong correlation of the Hurst parameter with the average request arrival rate was discovered (0.94). Estimates of the Hurst parameter for the Web traffic in the online …
Practical Aspects of Log File Analysis for E-Commerce
2013
The paper concerns Web server log file analysis to discover knowledge useful for online retailers. Data for one month of the online bookstore operation was analyzed with respect to the probability of making a purchase by e-customers. Key states and characteristics of user sessions were distinguished and their relations to the session state connected with purchase confirmation were analyzed. Results allow identification of factors increasing the probability of making a purchase in a given Web store and thus, determination of user sessions which are more valuable in terms of e-business profitability. Such results may be then applied in practice, e.g. in a method for personalized or prioritize…
Delfos: the Oracle to Predict Next Web User's Accesses
2007
Despite the wide and intensive research efforts focused on Web prediction and prefetching techniques aimed to reduce user's perceived latency, few attempts to implement and use them in real environments have been done, mainly due to their complexity and supposed limitations that low user available bandwidths imposed few years ago. Nevertheless, current user bandwidths open a new scenario for prefetching that becomes again an interesting option to improve web performance. This paper presents Delfos, a framework to perform web predictions and prefetching on a real environment that tries to cover the existing gap between research and praxis. Delfos is integrated in the web architecture without…
Data Stream Clustering for Application-Layer DDoS Detection in Encrypted Traffic
2018
Application-layer distributed denial-of-service attacks have become a serious threat to modern high-speed computer networks and systems. Unlike network-layer attacks, application-layer attacks can be performed using legitimate requests from legitimately connected network machines that make these attacks undetectable by signature-based intrusion detection systems. Moreover, the attacks may utilize protocols that encrypt the data of network connections in the application layer, making it even harder to detect an attacker’s activity without decrypting users’ network traffic, and therefore violating their privacy. In this paper, we present a method that allows us to detect various application-l…
Web Server Support for e-Customer Loyalty through QoS Differentiation
2013
The paper deals with the problem of offering predictive service in e-commerce Web server systems under overload. Due to unpredictability of Web accesses, such systems often fail to effectively handle peak traffic, which results in long delays and incomplete transactions. As a consequence, online retailers miss an opportunity to attract new customers, retain the loyalty of regular customers, and increase profits. We propose a method for priority-based admission control and scheduling of requests at the Web server system in order to differentiate Quality of Service (QoS) with regard to user-perceived delays, i.e., Web page response times provided by the system (as opposed to HTTP request resp…
Improving the quality of e-commerce web service: what is important for the request scheduling algorithm?
2005
The paper concerns a new research area that is Quality of Web Service (QoWS). The need for QoWS is motivated by a still growing number of Internet users, by a steady development and diversification of Web services, and especially by popularization of e-commerce applications. The goal of the paper is a critical analysis of the literature concerning scheduling algorithms for e-commerce Web servers. The paper characterizes factors affecting the load of the Web servers and discusses ways of improving their efficiency. Crucial QoWS requirements of the business Web server are identified: serving requests before their individual deadlines, supporting user session integrity, supporting different cl…
Using association rules to assess purchase probability in online stores
2016
The paper addresses the problem of e-customer behavior characterization based on Web server log data. We describe user sessions with the number of session features and aim to identify the features indicating a high probability of making a purchase for two customer groups: traditional customers and innovative customers. We discuss our approach aimed at assessing a purchase probability in a user session depending on categories of viewed products and session features. We apply association rule mining to real online bookstore data. The results show differences in factors indicating a high purchase probability in session for both customer types. The discovered association rules allow us to formu…
You’ve Got Photos! The design and evaluation of a location-based media-sharing application
2008
PhotoJournal is a novel location-based media sharing application that enables users to build interactive journals that associate multimedia files with locations on maps and share this information with other users. Its underlying information discovery and sharing mechanism is 7DS that runs in either pure peer-to-peer or centralized server-to-client mode, depending on the availability of a server and/or an infrastructure. 7DS-enabled devices act as miniature caches, sharing information with each other. When access to an information server (e.g., web server) is not available, the local 7DS instance running on the device enables the device to search and access information from other p…