Search results for "Q-learning"

Showing 10 of 12 documents

Low Latency Ambient Backscatter Communications with Deep Q-Learning for Beyond 5G Applications

2020

Low latency is a critical requirement of beyond 5G services. Previously, the aspect of latency has been extensively analyzed in conventional and modern wireless networks. With the rapidly growing research interest in wireless-powered ambient backscatter communications, it has become ever more important to meet the delay constraints while maximizing the achievable data rate. Therefore, to address the issue of latency in backscatter networks, this paper provides a deep Q-learning based framework for delay-constrained ambient backscatter networks. To do so, a Q-learning model for the ambient backscatter scenario has been developed. In addition, an algorithm has been proposed that employs deep neur…
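The deep Q-learning loop the abstract refers to can be sketched with a linear approximator standing in for the deep network; the feature size, reward, and hyperparameters below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

# Hedged sketch of one deep-Q-style temporal-difference step, with a
# linear approximator standing in for the neural network. The feature
# size, reward, and learning rates are invented for illustration.
rng = np.random.default_rng(0)
n_features, n_actions = 4, 2
W = np.zeros((n_actions, n_features))  # one weight row per action

def q_values(x):
    """Estimated Q(s, a) for every action, given state features x."""
    return W @ x

def td_step(x, a, r, x_next, alpha=0.01, gamma=0.95):
    """Semi-gradient TD update on the weights of the chosen action."""
    target = r + gamma * np.max(q_values(x_next))
    td_error = target - q_values(x)[a]
    W[a] += alpha * td_error * x
    return td_error

x, x_next = rng.random(n_features), rng.random(n_features)
err = td_step(x, 0, 1.0, x_next)  # first update from zero weights
```

Since the weights start at zero, the first TD error is just the reward; a real deep Q-learning agent would replace `q_values` with a neural network and add experience replay and a target network.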

Keywords: Backscatter; Wireless network; Computer science; Real-time computing; Q-learning; Networking & telecommunications; Network performance; 5G
Published in: 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring)

Solving POMDP Problems Using Historical Elements, and Its Optimisation [POMDP problēmu risināšana, izmantojot vēsturiskus elementus, un tās optimizācija]

2017

Within Elvis Egle's Bachelor's thesis „POMDP problēmu risināšana, izmantojot vēsturiskus elementus, un tās optimizācija" (Solving POMDP problems using historical elements, and its optimisation), algorithms for partially observable Markov decision processes (POMDP) were studied. Based on the content of Jānis Zuters's research paper „Sequence Q-Learning: a Memory-based Method Towards Solving POMDP", which describes an idea for solving POMDPs, an equivalent machine learning algorithm was implemented according to the information given in the paper. The thesis examines the effectiveness of this algorithm, and reviews and proposes several possibilities for its further improvement. The implemented program was applied to a specific problem, for which better runtime performance was observed with n…

Keywords: Computer science; Reinforcement learning; Sequence Q-Learning; Machine learning; POMDP

Learning competitive pricing strategies by multi-agent reinforcement learning

2003

In electronic marketplaces automated and dynamic pricing is becoming increasingly popular. Agents that perform this task can improve themselves by learning from past observations, possibly using reinforcement learning techniques. Co-learning of several adaptive agents against each other may lead to unforeseen results and increasingly dynamic behavior of the market. In this article we shed some light on price developments arising from a simple price adaptation strategy. Furthermore, we examine several adaptive pricing strategies and their learning behavior in a co-learning scenario with different levels of competition. Q-learning manages to learn best-reply strategies well, but is e…
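As a toy illustration of the co-learning setting described above, a single stateless Q-learner choosing between two posted prices against a fixed competitor can be sketched as follows; the prices, demand rule, and learning parameters are invented for illustration and are much simpler than the article's market model:

```python
import random

random.seed(0)

# Toy sketch: a stateless Q-learner picks between two posted prices
# against a fixed competitor price. Prices, the demand rule, and the
# learning parameters are invented; the article's setting is richer.
prices = [8, 10]
competitor_price = 9
Q = {p: 0.0 for p in prices}

def profit(p):
    # Undercutting the rival captures most of the toy market.
    demand = 10 if p < competitor_price else 4
    return p * demand

# Try each price once so both Q-values start from real feedback.
for p in prices:
    Q[p] += 0.1 * (profit(p) - Q[p])

for _ in range(500):
    # Epsilon-greedy action selection over the candidate prices.
    if random.random() < 0.1:
        p = random.choice(prices)
    else:
        p = max(Q, key=Q.get)
    Q[p] += 0.1 * (profit(p) - Q[p])

best_reply = max(Q, key=Q.get)  # the learned best reply to the rival
```

In this single-learner toy the best reply is simply the more profitable price; the dynamics the article studies arise when the competitor also adapts.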

Keywords: Economics and Econometrics; Control and Optimization; Management science; Applied Mathematics; Q-learning; Agent-based computational economics; Task (project management); Competition (economics); Pricing strategies; Risk analysis (engineering); Dynamic pricing; Economics; Reinforcement learning; Adaptation (computer science)
Published in: Journal of Economic Dynamics and Control

Towards a Deep Reinforcement Learning Approach for Tower Line Wars

2017

There have been numerous breakthroughs in reinforcement learning in recent years, perhaps most notably Deep Reinforcement Learning successfully playing and winning relatively advanced computer games. There is an anticipation that Deep Reinforcement Learning will play a major role when the first AI masters the complicated game play needed to beat a professional Real-Time Strategy game player. For this to be possible, there needs to be a game environment that targets and fosters AI research, and specifically Deep Reinforcement Learning. Some game environments already exist; however, these are either overly simplistic, such as Atari 2600, or complex, such as Starcraft II fro…

Keywords: Entertainment; Cognitive science; Computer science; Deep learning; Q-learning; Reinforcement learning; Artificial intelligence; Game player

AIOC2: A deep Q-learning approach to autonomic I/O congestion control in Lustre

2021

In high performance computing systems, I/O congestion is a common problem in large-scale distributed file systems, yet current implementations mainly require administrators to manually design low-level implementations and optimizations. We propose an adaptive I/O congestion control framework, named AIOC2, which can not only adaptively tune the I/O congestion control parameters, but also exploit the deep Q-learning method to train those parameters and optimize the tuning for different types of workloads from the server and the client at the same time. AIOC2 combines feedback-based dynamic I/O congestion control and deep Q-learning parameter tuning technology to …

Keywords: Exploit; Computer Networks and Communications; Computer science; Q-learning; Interference (wave propagation); Supercomputer; Computer Graphics and Computer-Aided Design; Theoretical Computer Science; Network congestion; Artificial Intelligence; Hardware and Architecture; Embedded system; Lustre (file system); Latency (engineering); Throughput; Software
Published in: Parallel Computing

Deep Q-Learning With Q-Matrix Transfer Learning for Novel Fire Evacuation Environment

2021

We focus on the important problem of emergency evacuation, which could clearly benefit from reinforcement learning yet has been largely unaddressed by it. Emergency evacuation is a complex task that is difficult to solve with reinforcement learning, since an emergency situation is highly dynamic, with many changing variables and complex constraints that make it difficult to train on. In this paper, we propose the first fire evacuation environment for training reinforcement learning agents for evacuation planning. The environment is modelled as a graph capturing the building structure. It consists of realistic features like fire spread, uncertainty and bottlenecks. We have implemented the envir…

Keywords: FOS: Computer and information sciences; Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Systems and Control (eess.SY); Computer science; Q-learning; Overfitting; Reinforcement learning; Shortest path problem; Emergency evacuation; Transfer of learning; Electrical and Electronic Engineering; Control and Systems Engineering; Human-Computer Interaction; Computer Science Applications; Software
Published in: IEEE Transactions on Systems, Man, and Cybernetics: Systems

Interpretable Option Discovery Using Deep Q-Learning and Variational Autoencoders

2021

Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions [18] is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is challenging to automate initiation and termination conditions, which in practice are commonly hand-crafted.

Keywords: Generalization; Computer science; Autonomous agent; Q-learning; Sample (statistics); Machine learning; Local convergence; Variety (cybernetics); Reinforcement learning; Artificial intelligence; Cluster analysis

Sequence Q-learning: A memory-based method towards solving POMDP

2015

A partially observable Markov decision process (POMDP) models a control problem where states are only partially observable by the agent. The two main approaches to solving such tasks are value functions and direct search in policy space. This paper introduces the Sequence Q-learning method, which extends the well-known Q-learning algorithm towards solving POMDPs by adding a special sequence management framework, advancing from action values to “sequence” values and including the “sequence continuity principle”.
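The plain Q-learning backup that Sequence Q-learning builds on can be written as a one-line tabular update; the toy states, actions, and transition below are illustrative, not from the paper:

```python
# Minimal tabular Q-learning backup, the baseline that Sequence
# Q-learning extends with sequence values. The states, actions, and
# single transition below are invented for illustration.
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

actions = ["left", "right"]
Q = {}
q_update(Q, "s0", "right", 1.0, "s1", actions)  # first backup on an empty table
```

Sequence Q-learning replaces the `(state, action)` key with values over action sequences, which is what lets it cope with partial observability.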

Keywords: Sequence; Computer science; Q-learning; Partially observable Markov decision process; Markov process; Context (language use); Markov model; Bellman equation; Artificial intelligence; Markov decision process
Published in: 2015 20th International Conference on Methods and Models in Automation and Robotics (MMAR)

Learning Automata Based Q-learning for Content Placement in Cooperative Caching

2019

An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, a supervised feed-forward back-propagation connectionist model based neural network (SFBC-NN) is invoked for user mobility and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the mobility prediction. Then, a learning automata-based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked for Q-learning to obtain an optimal action selection in a random and stationary environment. It is p…
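The learning-automata side of the LAQL idea can be sketched with the classic linear reward-inaction update; the action count and learning rate below are invented, and the paper's actual LAQL rule may differ:

```python
import random

random.seed(1)

# Sketch of learning-automata action selection with a linear
# reward-inaction update. The number of actions and the learning
# rate are invented; the paper's LAQL rule may differ.
n_actions = 3
probs = [1.0 / n_actions] * n_actions

def select(probs):
    """Sample an action index from the automaton's probability vector."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def reward_update(probs, chosen, lr=0.1):
    """On a rewarded action, shift probability mass toward it."""
    return [p + lr * (1.0 - p) if i == chosen else p * (1.0 - lr)
            for i, p in enumerate(probs)]

a = select(probs)
probs = reward_update(probs, 0)  # reward action 0 once
```

Coupling this automaton to Q-learning means letting the rewarded action be the one whose Q-value estimate improved, so selection probabilities converge toward the optimal action in a stationary environment.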

Keywords: Signal Processing (eess.SP); Optimization problem; Learning automata; Computer science; Mean opinion score; Q-learning; Networking & telecommunications; Action selection; Intelligent agent; Recurrent neural network; Quality of experience; Artificial intelligence; Electrical and Electronic Engineering

A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm

2021

Over the past several years the popularity of robotic workers has surged tremendously. Several tasks which were previously considered insurmountable can now be performed by robots efficiently and with ease, mainly due to recent advances in control systems and artificial intelligence. Lately, Reinforcement Learning (RL) has captured the spotlight in the field of robotics. Instead of explicitly specifying the solution of a particular task, RL enables the robot (agent) to explore its environment and, through trial and error, choose the appropriate response. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax …

Keywords: Computer science; Obstacle avoidance; Softmax function; Q-learning; Robot; Reinforcement learning; Mobile robot; Artificial intelligence; Trial and error; Action selection
Published in: 2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)