Detection of Internet robots using a Bayesian approach

Mariusz Sobkow, Grażyna Suchacka

subject

Web server; Computer science; business.industry; Bayesian probability; computer.software_genre; Euclidean distance; Identification (information); Web traffic; Robot; The Internet; Data mining; Robots exclusion standard; business; computer

description

A large part of Web traffic on e-commerce sites is generated not by human users but by Internet robots: search engine crawlers, shopping bots, hacking bots, etc. In practice, not all robots, especially malicious ones, disclose their identity to a Web server, so methods for detecting and identifying them are needed. This paper proposes applying a Bayesian approach to robot detection based on characteristics of user sessions. The method is applied to Web traffic from a real e-commerce site. Results show that the classification model based on cluster analysis with Ward's method and the weighted Euclidean metric is very effective in robot detection, achieving accuracy above 90%.
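
To make the idea of session-based Bayesian classification concrete, the following is a minimal sketch and not the authors' implementation. The description does not state which session features or which Bayesian model the paper uses, so everything here is an assumption: the three features (requests per session, mean seconds between requests, fraction of image requests), the toy training data, the choice of Gaussian naive Bayes, and all function names are hypothetical. A weighted Euclidean distance helper is included only to illustrate the metric named in the description.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(sessions, labels):
    """Estimate a class prior and per-feature mean/variance for each label."""
    grouped = defaultdict(list)
    for features, label in zip(sessions, labels):
        grouped[label].append(features)

    total = len(sessions)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps variances strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def log_posterior(model, features):
    """Unnormalised log posterior of each class, assuming independent Gaussian features."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        scores[label] = score
    return scores


def classify(model, features):
    """Return the label with the highest posterior score."""
    scores = log_posterior(model, features)
    return max(scores, key=scores.get)


def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


# Hypothetical toy sessions: (requests, mean seconds between requests, image fraction).
train = [
    (120.0, 0.4, 0.02), (200.0, 0.2, 0.00), (150.0, 0.3, 0.05),  # assumed robots
    (12.0, 25.0, 0.60), (8.0, 40.0, 0.70), (20.0, 18.0, 0.55),   # assumed humans
]
labels = ["robot", "robot", "robot", "human", "human", "human"]

model = fit_gaussian_nb(train, labels)
print(classify(model, (180.0, 0.25, 0.01)))
print(classify(model, (10.0, 30.0, 0.65)))
```

In this sketch a session is assigned to whichever class has the larger posterior score; on real traffic the features would need to be extracted from server logs and the model evaluated against sessions with known labels.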

https://doi.org/10.1109/cybconf.2015.7175961