RESEARCH PRODUCT
On Utilizing Stochastic Non-linear Fractional Bin Packing to Resolve Distributed Web Crawling
Morten Goodwin, B. John Oommen, Ole-Christopher Granmo, Anis Yazid
subject
Theoretical computer science, Learning automata, Bin packing problem, Computer science, Web page, Continuous knapsack problem, Resource allocation, Distributed web crawling, Resource management, Resource management (computing), Web crawler
description
This paper deals with the extremely pertinent problem of Web crawling, which is far from trivial considering the magnitude and all-pervasive nature of the World-Wide Web. While numerous AI tools can be used to deal with this task, in this paper we map the problem onto the combinatorially hard stochastic non-linear fractional knapsack problem, which, in turn, is solved using Learning Automata (LA). Such LA-based solutions have recently been shown to outperform previous state-of-the-art approaches to resource allocation in Web monitoring. However, the ever-growing deployment of distributed systems raises the need for solutions that cope with a distributed setting. In this paper, we present a novel scheme for solving the non-linear fractional bin packing problem. Furthermore, we demonstrate that our scheme has applications to Web crawling, i.e., distributed resource allocation, and in particular, to distributed Web monitoring. Comprehensive experimental results demonstrate the superiority of our scheme when compared to other classical approaches.
year | journal | country | edition | language |
---|---|---|---|---|
2014-12-01 | 2014 IEEE 17th International Conference on Computational Science and Engineering | | | |