Search results for "Markov decision process"

Showing 10 of 22 documents

Optimization of anemia treatment in hemodialysis patients via reinforcement learning

2013

Objective: Anemia is a frequent comorbidity in hemodialysis patients that can be successfully treated by administering erythropoiesis-stimulating agents (ESAs). ESA dosing is currently based on clinical protocols that often do not account for the high inter- and intra-individual variability in patient response. As a result, the hemoglobin level of some patients oscillates around the target range, which is associated with multiple risks and side effects. This work proposes a methodology based on reinforcement learning (RL) to optimize ESA therapy. Methods: RL is a data-driven approach for solving sequential decision-making problems that are formulated as Markov decision processes (MDP…
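As a rough illustration of the MDP formulation mentioned in the abstract, the sketch below runs tabular Q-learning on a toy four-state chain. Every name and number in it (states, actions, rewards) is invented for illustration and has nothing to do with the paper's clinical model:

```python
import random

# Toy deterministic MDP: states 0..3, actions 0 ("down") and 1 ("up").
# Entering state 3 yields reward 1 and ends the episode.
N_STATES, N_ACTIONS, GOAL = 4, 2, 3
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

def step(state, action):
    """Deterministic transition: action 1 moves up, action 0 moves down."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < EPS:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: q[s][x])
            s2, r, done = step(s, a)
            # Standard Q-learning update toward the bootstrapped target.
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += ALPHA * (target - q[s][a])
            s = s2
    return q

q = q_learning()
policy = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]
```

After training, the greedy policy prefers "up" in every non-terminal state, which is the optimal behavior for this toy chain.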

Male; FOS: Computer and information sciences; Mathematical optimization; Darbepoetin alfa; Computer science; Anemia; Computer Science - Artificial Intelligence; medicine.medical_treatment; Medicine (miscellaneous); Machine Learning (stat.ML); Outcome (game theory); Decision Support Techniques; Machine Learning (cs.LG); Renal Dialysis; Artificial Intelligence; Statistics - Machine Learning; medicine; Humans; Reinforcement learning; Dosing; Aged; Protocol (science); Patient Selection; Hemoglobin A; Middle Aged; medicine.disease; Markov Chains; Computer Science - Learning; Artificial Intelligence (cs.AI); Chronic Disease; Hematinics; Kidney Failure, Chronic; Female; Hemodialysis; Markov decision process; Reinforcement, Psychology; Algorithms; medicine.drug

Designing a multi-layer edge-computing platform for energy-efficient and delay-aware offloading in vehicular networks

2021

Abstract Vehicular networks are expected to support many time-critical services requiring huge amounts of computation resources with very low delay. However, such requirements may not be fully met by vehicle on-board devices due to their limited processing and storage capabilities. The solution provided by 5G is the application of the Multi-Access Edge Computing (MEC) paradigm, which represents a low-latency alternative to remote clouds. Accordingly, we envision a multi-layer job-offloading scheme based on three levels, i.e., the Vehicular Domain, the MEC Domain and Backhaul Network Domain. In such a view, jobs can be offloaded from the Vehicular Domain to the MEC Domain, and even further o…
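A minimal sketch of the three-layer offloading idea described above, assuming a greedy delay-aware placement (the layer names come from the abstract; the capacities and delay figures are invented placeholders):

```python
# Toy three-layer offloading: place each job on the lowest-delay layer
# that still has capacity. Capacities and delays are illustrative only.
LAYERS = [
    # (name, capacity in concurrent jobs, round-trip delay in ms)
    ("Vehicular Domain", 1, 2.0),
    ("MEC Domain", 4, 10.0),
    ("Backhaul Network Domain", 100, 50.0),
]

def offload(jobs):
    """Greedy delay-aware placement: fill the lowest-delay layer first."""
    placement = []
    free = [cap for _, cap, _ in LAYERS]
    for job in jobs:
        for i, (name, _, delay) in enumerate(LAYERS):
            if free[i] > 0:
                free[i] -= 1
                placement.append((job, name, delay))
                break
        else:
            placement.append((job, None, float("inf")))  # no capacity left
    return placement

plan = offload(["j1", "j2", "j3"])
```

With one vehicular slot, the first job stays in the Vehicular Domain and the overflow spills to the MEC Domain, mirroring the layered spill-over the abstract describes.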

Markov Models; Vehicular ad hoc network; Computer Networks and Communications; Computer science; Distributed computing; 5G; Edge Computing; Reinforcement Learning; Vehicular Networks; Load balancing (computing); Domain (software engineering); Server; Markov decision process; Efficient energy use; Computer Networks

Least-squares temporal difference learning based on an extreme learning machine

2014

Abstract Reinforcement learning (RL) is a general class of algorithms for solving decision-making problems, which are usually modeled using the Markov decision process (MDP) framework. RL can find exact solutions only when the MDP state space is discrete and small enough. Because many real-world problems are described by continuous variables, approximation is essential in practical applications of RL. This paper focuses on learning the value function of a fixed policy in continuous MDPs. This is an important subproblem of several RL algorithms. We propose a least-squares temporal difference (LSTD) algorithm based on the extreme learning machine. LSTD is typically combined wi…
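The LSTD computation the abstract refers to can be sketched in a few lines. The example below uses one-hot features on a toy three-state chain as a stand-in for the paper's extreme-learning-machine features, and solves the standard LSTD linear system for the value weights of a fixed policy:

```python
import numpy as np

# Fixed-policy toy chain: s -> s+1 deterministically, reward 1 on
# entering the terminal state. One-hot features stand in for the
# ELM-generated features used in the paper.
GAMMA = 0.9
n = 3  # non-terminal states 0, 1, 2

def phi(s):
    """One-hot feature vector; the terminal state maps to all zeros."""
    v = np.zeros(n)
    if s is not None:
        v[s] = 1.0
    return v

# Transitions (s, r, s') sampled under the fixed policy "always up";
# s' = None marks the terminal state.
samples = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]

# Accumulate the LSTD system  A w = b,  with
#   A = sum phi(s) (phi(s) - gamma * phi(s'))^T,   b = sum phi(s) r.
A = np.zeros((n, n))
b = np.zeros(n)
for s, r, s2 in samples:
    A += np.outer(phi(s), phi(s) - GAMMA * phi(s2))
    b += phi(s) * r

w = np.linalg.solve(A, b)  # value estimates: w[s] approximates V(s)
```

With one-hot features the weights recover the exact values V(2) = 1, V(1) = 0.9, V(0) = 0.81; with richer (e.g. random ELM) features the same system yields a least-squares approximation instead.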

Mathematical optimization; Artificial neural network; Artificial Intelligence; Cognitive Neuroscience; Bellman equation; Reinforcement learning; State space; Markov decision process; Temporal difference learning; Computer Science Applications; Mathematics; Extreme learning machine; Curse of dimensionality; Neurocomputing

The Dreaming Variational Autoencoder for Reinforcement Learning Environments

2018

Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide a flexible, reproducible, and easy-to-control environment. Regardless, few games feature a state-space where results in exploration, memory, and plannin…

Memory management; Artificial neural network; Computer science; business.industry; Benchmark (computing); Feature (machine learning); Reinforcement learning; Artificial intelligence; Markov decision process; business; Autoencoder; Generative grammar

Sequence Q-learning: A memory-based method towards solving POMDP

2015

Partially observable Markov decision processes (POMDPs) model control problems in which states are only partially observable by an agent. The two main approaches to solving such tasks are value-function methods and direct search in policy space. This paper introduces the Sequence Q-learning method, which extends the well-known Q-learning algorithm towards the ability to solve POMDPs by adding a special sequence-management framework: it advances from action values to “sequence” values and includes the “sequence continuity principle”.
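The memory-based idea can be illustrated by keying a tabular Q-function on the whole observation history rather than on the latest observation alone. The toy POMDP below (a hidden context revealed only by the first observation, with an aliased second observation) is an invented example, not the paper's method or benchmark:

```python
import random

# A hidden context, "a" or "b", is shown only as the first observation;
# the second observation "c" is identical across contexts. Keying the
# Q-table on the full observation history lets the agent disambiguate
# what a memoryless learner cannot.
ACTIONS = (0, 1)
ALPHA, EPS = 0.3, 0.2

def run_episode(q, rng):
    context = rng.choice("ab")      # hidden state, observable once
    history = (context, "c")        # observation history at decision time
    qs = q.setdefault(history, [0.0, 0.0])
    if rng.random() < EPS:
        a = rng.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: qs[x])
    # Reward 1 only when the action matches the hidden context.
    reward = 1.0 if (context == "a") == (a == 0) else 0.0
    qs[a] += ALPHA * (reward - qs[a])  # one-step episode, no bootstrap term

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        run_episode(q, rng)
    return q

q = train()
```

With the history as key, the learner ends up preferring action 0 after seeing "a" and action 1 after seeing "b"; collapsing the key to the last observation "c" would make the two cases indistinguishable.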

Sequence; Computer science; business.industry; Q-learning; Partially observable Markov decision process; Markov process; Context (language use); Markov model; symbols.namesake; Bellman equation; symbols; Artificial intelligence; Markov decision process; business; 2015 20th International Conference on Methods and Models in Automation and Robotics (MMAR)

The Rail Quality Index as an Indicator of the “Global Comfort” in Optimizing Safety, Quality and Efficiency in Railway Rails

2012

Abstract The proposed model uses stochastic dynamic programming, and in particular Markov decision processes, applied to the Rail Quality Index (RQI; Italian: Indice di Qualità del Binario, IQB). By performing an integrated analysis of the classes of variables that characterize overall service quality (in terms of comfort and safety), the proposed mathematical approach makes it possible to find solutions to the decision-making process as a function of the probability of deterioration of the infrastructure's state variables over time and of the flow of available resources.
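The kind of stochastic dynamic program the abstract describes can be sketched as a finite-horizon backward recursion over deterioration states. All states, probabilities, and costs below are invented placeholders, not the RQI/IQB model:

```python
# Finite-horizon stochastic dynamic programming sketch: quality states
# deteriorate stochastically, and each period the decision is whether
# to pay for maintenance. All numbers are illustrative.
STATES = [0, 1, 2]          # 0 = good, 2 = poor quality
HORIZON = 5
MAINTAIN_COST = 2.0
STATE_COST = {0: 0.0, 1: 1.0, 2: 4.0}
P_DETERIORATE = 0.4         # chance of dropping one quality level per step

def transition(s, action):
    """Return [(prob, next_state)] for 'maintain' or 'wait'."""
    if action == "maintain":
        return [(1.0, 0)]   # maintenance restores the best state
    worse = min(s + 1, 2)
    return [(1 - P_DETERIORATE, s), (P_DETERIORATE, worse)]

def solve():
    value = {s: 0.0 for s in STATES}          # terminal values
    policy = []                               # policy[k]: k+1 steps remaining
    for _ in range(HORIZON):
        new_value, step_policy = {}, {}
        for s in STATES:
            best = None
            for action in ("wait", "maintain"):
                cost = STATE_COST[s] + (MAINTAIN_COST if action == "maintain" else 0.0)
                cost += sum(p * value[s2] for p, s2 in transition(s, action))
                if best is None or cost < best[0]:
                    best = (cost, action)
            new_value[s], step_policy[s] = best
        value = new_value
        policy.append(step_policy)
    return value, policy

value, policy = solve()
```

The recursion captures the trade-off in the abstract: with only one step left, maintaining a poor-quality state does not pay off, but over the full horizon it does.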

Service quality; Engineering; Quality and Efficiency; Index (economics); Operations research; business.industry; Quality of service; media_common.quotation_subject; Railway; Global Comfort; Optimization of Safety; Rail Quality Index; IQB; Poison control; Stochastic programming; Transport engineering; Safety engineering; Settore ICAR/04 - Strade Ferrovie Ed Aeroporti; General Materials Science; Quality (business); Markov decision process; business; media_common; Procedia - Social and Behavioral Sciences

A Cognitive Dialogue Manager for Education Purposes

2011

A conversational agent is a software system that is able to interact with users in a natural way, often using natural-language capabilities. In this chapter, an evolution of a conversational agent is presented according to the definition of dialogue-management techniques for conversational agents. The presented conversational agent is intended to act as part of an educational system. The chapter outlines the state-of-the-art systems and techniques for dialogue management in cognitive educational systems, and the underlying psychological and social aspects. We present our framework for a dialogue manager aimed at reducing the uncertainty in users' sentences during the assessment of hi…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Knowledge management; Ontology; business.industry; Latent semantic analysis; Semantic space; Partially observable Markov decision process; Cognition; POMDP; Ontology (information science); computer.software_genre; Chatbot; World Wide Web; Semantic integration; POS Tagging; business; Psychology; Ontology Mapping; computer; OWL

A meta-cognitive architecture for planning in uncertain environments

2013

Abstract The behavior of an artificial agent performing in a natural environment is influenced by many different pressures and needs, coming from both the external world and internal factors, which sometimes drive the agent to pursue conflicting goals. At the same time, the interaction between an artificial agent and the environment is deeply affected by uncertainty, due both to imprecision in the description of the world and to the unpredictability of the effects of the agent’s actions. Such an agent needs meta-cognition, in terms of both self-awareness and control. Self-awareness is related to the internal conditions that may influence the completion of the task, while control is oriented t…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Process (engineering); business.industry; Computer science; Cognitive Neuroscience; Uncertainty; Internal model; Experimental and Cognitive Psychology; Cognitive architecture; computer.software_genre; Task (project management); Planning; Intelligent agent; Risk analysis (engineering); Cognitive and meta-cognitive artificial agents; Artificial Intelligence; Cognitive module; Markov Decision Processes; Artificial intelligence; Markov decision process; business; Goal setting; computer; Biologically Inspired Cognitive Architectures

Comprehensive Uncertainty Management in MDPs

2013

Multistage decision-making in robots involved in real-world tasks is a process affected by uncertainty. The effects of the agent’s actions in a physical environment cannot always be predicted deterministically and precisely. Moreover, observing the environment can be too onerous for a robot, and hence not continuous. Markov Decision Processes (MDPs) are a well-known solution inspired by the classic probabilistic approach to managing uncertainty. On the other hand, including fuzzy logic and possibility theory has widened uncertainty representation. Probability, possibility, fuzzy logic, and epistemic belief allow treating different and not always superimposable facets of unce…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; business.industry; Process (engineering); Probabilistic logic; Fuzzy logic; Possibility distribution; Uncertainty representation; Uncertainty; Markov Decision Process; Believability; Robot; Artificial intelligence; Markov decision process; business; Possibility theory; Mathematics

Resource allocation in fog computing: vehicular fog computing for the optimal use of electric vehicl…

2019

Abstract: Technological advancements have made it possible for electric vehicles (EVs) to have onboard computation, communication, storage, and sensing capabilities. Nevertheless, EVs spend most of their time in parking lots, which leaves these onboard devices severely underutilized. Thus, better management and pooling of these underutilized resources is strongly recommended. The aggregated resources would be useful for traffic-safety or comfort-related applications, or could be used as a distributed data center. Moreover, parked vehicles might also be used as a service-delivery platform to serve users. Therefore, the use of aggregated abundant resources for the …

[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Stochastic game; Resource allocation; Markov decision process; Electric vehicles; Vehicular fog computing