RESEARCH PRODUCT
Learning Automata Based Q-learning for Content Placement in Cooperative Caching
Yue Chen, Yuanwei Liu, Zhong Yang, Lei Jiao
subject
Signal Processing (eess.SP); Optimization problem; Learning automata; business.industry; Computer science; Mean opinion score; Q-learning; ComputingMilieux_LEGALASPECTSOFCOMPUTING; 020206 networking & telecommunications; 02 engineering and technology; computer.software_genre; Action selection; Intelligent agent; Recurrent neural network; FOS: Electrical engineering electronic engineering information engineering; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Quality of experience; Artificial intelligence; Electrical and Electronic Engineering; Electrical Engineering and Systems Science - Signal Processing; business; VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550; computer
description
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, a supervised feed-forward back-propagation connectionist model based neural network (SFBC-NN) is invoked for user mobility and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the mobility prediction. Then, a learning automata-based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked for Q-learning to obtain an optimal action selection in a random and stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q value. To characterize the performance of the proposed algorithms, the sum MOS of users is used to define the reward function. Extensive simulations reveal that: 1) the prediction error of SFBC-NN decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves a significant performance improvement over traditional Q-learning; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
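The idea of letting learning automata drive action selection in Q-learning can be illustrated with a minimal tabular sketch. The abstract does not give the LA update rule, so the code below assumes a pursuit-style linear update that moves each state's action probabilities toward the currently greedy action; the function names, the toy environment, and all parameter values are hypothetical and are not taken from the paper. In the paper the reward is defined through the sum MOS of users, whereas here a placeholder 0/1 reward stands in for it.

```python
import random


def laql(n_states, n_actions, step, episodes=500, horizon=20,
         alpha=0.1, gamma=0.9, la_rate=0.01, seed=0):
    """Tabular Q-learning whose actions are sampled from per-state
    learning-automata probability vectors (assumed pursuit-style update)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    # Each state owns one automaton, initialised to a uniform distribution.
    p = [[1.0 / n_actions] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            # Action selection: sample from the automaton of state s.
            a = rng.choices(range(n_actions), weights=p[s])[0]
            s_next, r = step(s, a, rng)
            # Standard Q-learning update.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            # Assumed LA update: shift probability mass toward the
            # greedy action; the vector keeps summing to one.
            best = max(range(n_actions), key=q[s].__getitem__)
            for i in range(n_actions):
                goal = 1.0 if i == best else 0.0
                p[s][i] += la_rate * (goal - p[s][i])
            s = s_next
    return q, p


def make_toy_step(target):
    """Hypothetical environment: reward 1.0 for the target action of the
    current state, 0.0 otherwise; the next state is drawn uniformly."""
    def step(s, a, rng):
        reward = 1.0 if a == target[s] else 0.0
        return rng.randrange(len(target)), reward
    return step
```

A call such as `laql(3, 3, make_toy_step([1, 2, 0]))` returns the Q-table and the per-state action probabilities. A small `la_rate` keeps every action explorable for longer before the automaton commits, which matters because the guarantee quoted above is conditional on Q-learning converging to the optimal Q value.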
year | journal | country | edition | language |
---|---|---|---|---|
2019-03-14 | | | | |