Search results for "Reinforcement learning"

Showing 10 of 95 documents

Development of a Simulator for Prototyping Reinforcement Learning-Based Autonomous Cars

2022

Autonomous driving is a research field that has received attention in recent years, with increasing applications of reinforcement learning (RL) algorithms. It is impractical to train an autonomous vehicle thoroughly in physical space, i.e., the so-called ’real world’; therefore, simulators are used in almost all training of autonomous driving algorithms. There are numerous autonomous driving simulators, but very few are specifically targeted at RL. Training RL-based cars is challenging due to the variety of possible reward functions. There is a lack of simulators addressing many central RL research tasks within autonomous driving, such as scene understanding, localization and mapping, pla…
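The "variety of reward functions" the abstract points to is easiest to see when the reward is a swappable component of the environment. Below is a minimal Gym-style sketch with the reward injected as a function parameter; all names (`GridCarEnv`, `progress_reward`) are illustrative inventions, not from the paper, and the "road" is a toy 1-D grid rather than a driving scene.

```python
from typing import Callable, Tuple

class GridCarEnv:
    """Toy 1-D 'road': the car advances along cells until it reaches the goal."""

    def __init__(self, length: int, reward_fn: Callable[[int, int], float]):
        self.length = length
        self.reward_fn = reward_fn   # injected, so researchers can swap it freely
        self.pos = 0

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int) -> Tuple[int, float, bool]:
        prev = self.pos
        self.pos = max(0, min(self.length - 1, self.pos + action))
        reward = self.reward_fn(prev, self.pos)   # reward is pluggable
        done = self.pos == self.length - 1
        return self.pos, reward, done

# Two interchangeable reward designs for the same dynamics:
progress_reward = lambda prev, cur: float(cur - prev)        # dense shaping
goal_reward = lambda prev, cur: 1.0 if cur == 9 else 0.0     # sparse (length 10)

env = GridCarEnv(10, progress_reward)
env.reset()
total, done = 0.0, False
while not done:
    _, r, done = env.step(+1)
    total += r
print(total)  # → 9.0 with the dense progress reward
```

Swapping `progress_reward` for `goal_reward` changes the learning problem (dense vs. sparse feedback) without touching the simulator itself, which is the design pressure the abstract describes.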

Human-Computer Interaction; VDP::Teknologi: 500; autonomous driving; simulators; reinforcement learning; Computer Networks and Communications; Communication; Informatics
researchProduct

Towards Model-Based Reinforcement Learning for Industry-Near Environments

2019

Deep reinforcement learning has over the past few years shown great potential in learning near-optimal control in complex simulated environments with little visible information. Rainbow (Q-Learning) and PPO (Policy Optimisation) have shown outstanding performance in a variety of tasks, including Atari 2600, MuJoCo, and Roboschool test suite. Although these algorithms are fundamentally different, both suffer from high variance, low sample efficiency, and hyperparameter sensitivity that, in practice, make these algorithms a no-go for critical operations in the industry.

Hyperparameter; Artificial neural network; Computer science; business.industry; Sample (statistics); Variance (accounting); Machine learning; computer.software_genre; Variety (cybernetics); Test suite; Reinforcement learning; Artificial intelligence; Markov decision process; business; computer
researchProduct

Generating Hyperspectral Skin Cancer Imagery using Generative Adversarial Neural Network

2020

In this study, we develop a proof of concept for using generative adversarial neural networks in hyperspectral skin cancer imagery production. A generative adversarial neural network is a model in which two neural networks compete: the generator tries to produce data similar to the measured data, and the discriminator tries to correctly classify the data as fake or real. Both models get reinforcement based on their performance. In training the discriminator, we use data measured from skin cancer patients. The aim of the study is to develop a generator for augmenting hyperspectral skin cancer imagery.
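The generator/discriminator competition described above can be sketched in a few dozen lines on a 1-D toy problem: real data drawn from N(4, 1), a linear generator G(z) = a·z + b, and a logistic discriminator, trained by simultaneous gradient steps with the non-saturating generator loss. This is a numpy-only illustration of the adversarial setup, not the paper's hyperspectral model; note that simultaneous updates oscillate around the data distribution rather than converging exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30, 30)))

# Real data: samples from N(4, 1). The generator maps z ~ N(0, 1) through
# G(z) = a*z + b; its offset b should drift toward the real mean of 4.
def sample_real(n):
    return rng.normal(4.0, 1.0, n)

w, c = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + c)
a, b = 1.0, 0.0   # generator parameters
lr = 0.02

for step in range(5000):
    x_real = sample_real(64)
    z = rng.normal(0.0, 1.0, 64)
    x_fake = a * z + b

    # Discriminator ascent on log D(real) + log(1 - D(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascent on the non-saturating objective log D(fake).
    d_fake = sigmoid(w * x_fake + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

print(round(b, 2))  # b should have moved from 0 toward the real mean
```

The same two-player structure scales up when G and D are deep networks over image patches, which is the regime the abstract targets.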

Imagery (Psychotherapy); Skin Neoplasms; Computer science; 0211 other engineering and technologies; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; generative adversarial neural networks; neuroverkot (neural networks); Machine learning; computer.software_genre; 030218 nuclear medicine & medical imaging; Machine Learning; ihosyöpä (skin cancer); 03 medical and health sciences; Adversarial system; 0302 clinical medicine; Humans; Learning; Reinforcement learning; 021101 geological & geomatics engineering; Artificial neural network; skin cancer; business.industry; spektrikuvaus (spectral imaging); Hyperspectral imaging; ComputingMethodologies_PATTERNRECOGNITION; kuvantaminen (imaging); Neural Networks, Computer; Artificial intelligence; business; computer; Generative grammar; Generator (mathematics)
researchProduct

Towards safe reinforcement-learning in industrial grid-warehousing

2020

Reinforcement learning has shown to be profoundly successful at learning optimal policies for simulated environments using distributed training with extensive compute capacity. Model-free reinforcement learning uses the notion of trial and error, where error is a vital part of teaching the agent to behave optimally. In mission-critical, real-world environments there is little tolerance for failure, and errors can cause damaging effects on humans and equipment. In these environments, current state-of-the-art reinforcement learning approaches are not sufficient to learn optimal control policies safely. On the other hand, model-based reinforcement learning tries to encode environment tra…

Information Systems and Management; Computer science; media_common.quotation_subject; Sample (statistics); 02 engineering and technology; Machine learning; computer.software_genre; Theoretical Computer Science; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550; media_common; business.industry; 05 social sciences; 050301 education; Grid; Optimal control; Autoencoder; Computer Science Applications; Action (philosophy); Control and Systems Engineering; Curiosity; 020201 artificial intelligence & image processing; Artificial intelligence; business; 0503 education; computer; Software; Information Sciences
researchProduct

RDF* Graph Database as Interlingua for the TextWorld Challenge

2019

This paper briefly describes the top-scoring submission to the First TextWorld Problems: A Reinforcement and Language Learning Challenge. To alleviate the partial observability problem, characteristic to the TextWorld games, we split the Agent into two independent components: Observer and Actor, communicating only via the Interlingua of the RDF* graph database. The RDF* graph database serves as the “world model” memory incrementally updated by the Observer via FrameNet informed Natural Language Understanding techniques and is used by the Actor for the efficient exploration and planning of the game Action sequences. We find that the deep-learning approach works best for the Observer componen…

Interlingua; Information retrieval; Graph database; Computer science; Backtracking; business.industry; Deep learning; Natural language understanding; computer.file_format; computer.software_genre; language.human_language; language; Reinforcement learning; Artificial intelligence; RDF; FrameNet; business; computer; 2019 IEEE Conference on Games (CoG)
researchProduct

AI for Resource Allocation and Resource Allocation for AI: a two-fold paradigm at the network edge

2022

5G-and-beyond and Internet of Things (IoT) technologies are pushing a shift from the classic cloud-centric view of the network to a new edge-centric vision. In such a perspective, the computation, communication and storage resources are moved closer to the user, to the benefit of network responsiveness/latency, and of an improved context-awareness, that is, the ability to tailor the network services to the live user's experience. However, these improvements do not come for free: edge networks are highly constrained, and do not match the resource abundance of their cloud counterparts. In such a perspective, the proper management of the few available resources is of crucial importance to impr…

Internet of Things; MINLP; IoT; Edge Network; Performance Evaluation; Low Power Wide Area Network; System Modeling; Settore ING-INF/03 - Telecomunicazioni; UAV; Software Defined Radio; Real Testbed; Vehicular Network; ML; LoRa; Reinforcement Learning; Resource Allocation; Machine Learning; Game Theory; Artificial Intelligence; AI; LPWAN; Colosseum Channel Emulator; Channel Emulation; Emulation; SDR
researchProduct

Explainable Reinforcement Learning with the Tsetlin Machine

2021

The Tsetlin Machine is a recent supervised machine learning algorithm that has obtained competitive results in several benchmarks, both in terms of accuracy and resource usage. It has been used for convolution, classification, and regression, producing interpretable rules. In this paper, we introduce the first framework for reinforcement learning based on the Tsetlin Machine. We combine the value iteration algorithm with the regression Tsetlin Machine, as the value function approximator, to investigate the feasibility of training the Tsetlin Machine through bootstrapping. Moreover, we document the robustness and accuracy of learning on several instances of the grid-world problem.
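The value-iteration backbone the abstract describes can be shown on a tiny grid-world. The sketch below uses a plain value table where the paper plugs in a regression Tsetlin Machine as the function approximator; the 1-D world (states 0–4, terminal goal at state 4) is an invented minimal example.

```python
import numpy as np

# States 0..4 on a line; actions are -1 (left) and +1 (right); reaching
# state 4 yields reward 1 and terminates. A plain table stands in for the
# paper's regression Tsetlin Machine to keep the sketch self-contained.
n_states, gamma, theta = 5, 0.9, 1e-8
V = np.zeros(n_states)

def step(s, a):
    s2 = min(max(s + a, 0), n_states - 1)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

# Value iteration: sweep until the largest update falls below theta.
while True:
    delta = 0.0
    for s in range(n_states - 1):             # skip the terminal state
        best = max(r + gamma * V[s2]
                   for s2, r in (step(s, a) for a in (-1, 1)))
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < theta:
        break

print(V)  # values 0.729, 0.81, 0.9, 1.0 rising toward the goal; 0 at terminal
```

Replacing the `V` table with a trained regressor updated on the bootstrapped targets `r + gamma * V[s2]` is exactly the substitution the paper investigates.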

Learning automata; Computer science; business.industry; Bootstrapping; Machine learning; computer.software_genre; Regression; Convolution; Robustness (computer science); Bellman equation; Reinforcement learning; Markov decision process; Artificial intelligence; Mathematics::Representation Theory; business; computer
researchProduct

A formal proof of the ε-optimality of discretized pursuit algorithms

2015

Learning Automata (LA) can be reckoned to be the founding algorithms on which the field of Reinforcement Learning has been built. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and of these, the family of discretized algorithms are proven to converge even faster than their continuous counterparts. However, it has recently been reported that the previous proofs of ε-optimality for all the reported algorithms over the past three decades have been flawed. We applaud the researchers who discovered this flaw, and who further proceeded to rectify the proof for the Continuous Pursuit Algorithm (CPA). The latter proof examines the monotonicity property of the proba…
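The discretized pursuit scheme discussed above can be sketched on a two-armed Bernoulli bandit: the action-probability vector moves in fixed steps of Δ = 1/(rN) toward the action with the currently highest maximum-likelihood reward estimate. The reward probabilities and parameter values below are invented for illustration; this is the algorithmic skeleton, not the paper's proof apparatus.

```python
import random

random.seed(1)

# Discretized Pursuit Algorithm (DPA) on a 2-armed Bernoulli bandit.
reward_prob = [0.8, 0.6]          # unknown to the automaton
r, N = 2, 100                     # r actions, resolution parameter N
delta = 1.0 / (r * N)             # fixed discretized step size

p = [0.5, 0.5]                    # action-selection probabilities
wins, pulls = [0, 0], [0, 0]

for t in range(5000):
    a = 0 if random.random() < p[0] else 1
    wins[a] += random.random() < reward_prob[a]
    pulls[a] += 1
    # Maximum-likelihood estimate of each action's reward probability.
    est = [wins[i] / pulls[i] if pulls[i] else 0.0 for i in (0, 1)]
    best = 0 if est[0] >= est[1] else 1
    other = 1 - best
    # Pursuit: shift a step of probability mass toward the estimated best.
    move = min(delta, p[other])   # clip so probabilities stay in [0, 1]
    p[other] -= move
    p[best] += move

print(p)  # the automaton absorbs into one action: the better one w.h.p.
```

The ε-optimality question the paper formalizes is precisely how the parameters `N` (and the sampling it allows) control the probability that this absorption happens on the truly best action.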

Learning automata; Discretization; Inequality; Basis (linear algebra); Computer science; media_common.quotation_subject; Field (mathematics); Monotonic function; 02 engineering and technology; Mathematical proof; Formal proof; 020202 computer hardware & architecture; Algebra; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; 020201 artificial intelligence & image processing; Algorithm; media_common
researchProduct

Evolution and Learning: Evolving Sensors in a Simple MDP Environment

2003

Natural intelligence and autonomous agents face difficulties when acting in information-dense environments. Assailed by a multitude of stimuli they have to make sense of the inflow of information, filtering and processing what is necessary, but discarding that which is unimportant. This paper aims at investigating the interactions between evolution of the sensorial channel extracting the information from the environment and the simultaneous individual adaptation of agent-control. Our particular goal is to study the influence of learning on the evolution of sensors, with learning duration being the tunable parameter. A genetic algorithm governs the evolution of sensors appropriate for the a…

Learning classifier system; business.industry; Computer science; 05 social sciences; Autonomous agent; Experimental and Cognitive Psychology; Grid; 050105 experimental psychology; Task (project management); 03 medical and health sciences; Behavioral Neuroscience; 0302 clinical medicine; Genetic algorithm; Reinforcement learning; 0501 psychology and cognitive sciences; Artificial intelligence; business; Adaptation (computer science); 030217 neurology & neurosurgery; Communication channel; Adaptive Behavior
researchProduct

Optimization of anemia treatment in hemodialysis patients via reinforcement learning

2013

Objective: Anemia is a frequent comorbidity in hemodialysis patients that can be successfully treated by administering erythropoiesis-stimulating agents (ESAs). ESA dosing is currently based on clinical protocols that often do not account for the high inter- and intra-individual variability in the patient's response. As a result, the hemoglobin level of some patients oscillates around the target range, which is associated with multiple risks and side-effects. This work proposes a methodology based on reinforcement learning (RL) to optimize ESA therapy. Methods: RL is a data-driven approach for solving sequential decision-making problems that are formulated as Markov decision processes (MDP…
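The MDP framing of dose titration can be illustrated with tabular Q-learning on a toy model: states are coarse hemoglobin bands, actions are "hold" or "increase dose", and the reward favors staying in the target band. The transition model below is invented purely for illustration and has no clinical validity; the paper's method learns from real patient data instead.

```python
import random

random.seed(0)

# Toy Q-learning sketch of dose titration framed as an MDP.
STATES = ["low", "target", "high"]
ACTIONS = [0, 1]                          # 0 = hold, 1 = increase dose

def simulate(state, action):
    idx = STATES.index(state)
    if action == 1:
        idx = min(idx + 1, 2)             # dosing pushes hemoglobin up a band
    elif random.random() < 0.5:
        idx = max(idx - 1, 0)             # otherwise it may drift down
    nxt = STATES[idx]
    return nxt, (1.0 if nxt == "target" else -1.0)

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.1

state = "low"
for t in range(20000):
    if random.random() < eps:             # epsilon-greedy exploration
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[(state, x)])
    nxt, r = simulate(state, a)
    # One-step Q-learning update toward r + gamma * max_a' Q(s', a').
    Q[(state, a)] += alpha * (r + gamma * max(Q[(nxt, b)] for b in ACTIONS)
                              - Q[(state, a)])
    state = nxt

policy = {s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in STATES}
print(policy)  # learned: dose when "low", hold when "high"
```

The appeal of the RL formulation is visible even here: the dosing policy is learned from interaction data alone, with no hand-written protocol for each band.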

Male; FOS: Computer and information sciences; Mathematical optimization; Darbepoetin alfa; Computer science; Anemia; Computer Science - Artificial Intelligence; medicine.medical_treatment; Medicine (miscellaneous); Machine Learning (stat.ML); Outcome (game theory); Decision Support Techniques; Machine Learning (cs.LG); Renal Dialysis; Artificial Intelligence; Statistics - Machine Learning; medicine; Humans; Reinforcement learning; Dosing; Aged; Protocol (science); Patient Selection; Anemia; Hemoglobin A; Middle Aged; medicine.disease; Markov Chains; Computer Science - Learning; Artificial Intelligence (cs.AI); Chronic Disease; Hematinics; Kidney Failure, Chronic; Female; Hemodialysis; Markov decision process; Reinforcement, Psychology; Algorithms; medicine.drug
researchProduct