Search results for "reinforcement"

Showing 10 of 230 documents

Development of a Simulator for Prototyping Reinforcement Learning-Based Autonomous Cars

2022

Autonomous driving is a research field that has received attention in recent years, with increasing applications of reinforcement learning (RL) algorithms. It is impractical to train an autonomous vehicle thoroughly in the physical space, i.e., the so-called ’real world’; therefore, simulators are used in almost all training of autonomous driving algorithms. There are numerous autonomous driving simulators, very few of which are specifically targeted at RL. RL-based cars are challenging due to the variety of reward functions available. There is a lack of simulators addressing many central RL research tasks within autonomous driving, such as scene understanding, localization and mapping, pla…

Keywords: Human-Computer Interaction; VDP::Teknologi: 500; autonomous driving; simulators; reinforcement learning; Computer Networks and Communications; Communication; Informatics

Towards Model-Based Reinforcement Learning for Industry-Near Environments

2019

Over the past few years, deep reinforcement learning has shown great potential for learning near-optimal control in complex simulated environments with little visible information. Rainbow (Q-learning) and PPO (policy optimisation) have shown outstanding performance on a variety of tasks, including the Atari 2600, MuJoCo, and Roboschool test suites. Although these algorithms are fundamentally different, both suffer from high variance, low sample efficiency, and hyperparameter sensitivity, which in practice make them a no-go for critical operations in industry.
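For reference, the stability mechanism behind PPO's results on these benchmarks is its clipped surrogate objective, which bounds how far a single update can move the policy (this is the standard formulation from the PPO literature, quoted here for context rather than taken from this paper):

    L^{CLIP}(\theta) = \mathbb{E}_t\!\left[ \min\!\left( r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right) \right],
    \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)},

where \hat{A}_t is an advantage estimate and \epsilon is the clipping parameter, whose choice is one example of the hyperparameter sensitivity noted above.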

Keywords: Hyperparameter; Artificial neural network; Computer science; Sample (statistics); Variance; Machine learning; Test suite; Reinforcement learning; Artificial intelligence; Markov decision process

Terms of abuse as expression and reinforcement of cultures

2008

In this study, terms of abuse are investigated in 11 different cultures. Spontaneous verbal aggression is, to a certain extent, reminiscent of the values of a culture. Almost 3000 subjects from Spain, Germany, France, Italy, Croatia, Poland, Great Britain, the USA, Norway, Greece, and the Netherlands were asked to write down the terms of abuse they would use in a given stimulus situation and, in addition, to rate the offensive character of those terms. A total set of 12,000 expressions was collected. The frequencies of the expressions were established, and the total list was reduced to 16 categories. Results point to some etic taboos, like sexuality and…

Keywords: Insult; Sociology and Political Science; Social Psychology; Language; Human sexuality; Nouns; Developmental psychology; Emic and etic; abuse terms; etics; emics; normative values; abuse across countries; Verbal aggression; Reinforcement; Psychology. Published in International Journal of Intercultural Relations.

Generating Hyperspectral Skin Cancer Imagery using Generative Adversarial Neural Network

2020

In this study we develop a proof of concept for using generative adversarial neural networks to produce hyperspectral skin cancer imagery. A generative adversarial neural network is a model in which two neural networks compete: the generator tries to produce data that is similar to the measured data, and the discriminator tries to correctly classify the data as fake or real. This is a reinforcement learning model, where both models get reinforcement based on their performance. In the training of the discriminator we use data measured from skin cancer patients. The aim of the study is to develop a generator for augmenting hyperspectral skin cancer imagery.
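The adversarial training loop described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the use of PyTorch, the network sizes, the optimizer settings, and the random stand-in for measured spectra are all assumptions.

import torch
import torch.nn as nn

# Minimal GAN sketch: the generator maps noise to fake "spectra",
# the discriminator scores samples as real (1) or fake (0).
N_BANDS, LATENT = 120, 32   # hypothetical number of spectral bands / noise size

gen = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, N_BANDS))
disc = nn.Sequential(nn.Linear(N_BANDS, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce = nn.BCELoss()

real_data = torch.rand(512, N_BANDS)   # placeholder for measured patient spectra

for step in range(1000):
    real = real_data[torch.randint(0, 512, (64,))]
    fake = gen(torch.randn(64, LATENT))

    # Discriminator update: classify real as 1, fake as 0.
    opt_d.zero_grad()
    loss_d = bce(disc(real), torch.ones(64, 1)) + bce(disc(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # Generator update: try to make the discriminator predict 1 for fakes.
    opt_g.zero_grad()
    loss_g = bce(disc(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()

After training, gen(torch.randn(k, LATENT)) would yield k synthetic spectra, mirroring the stated aim of using the generator for data augmentation.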

Keywords: Skin neoplasms; Computer science; Image processing and computer vision; Generative adversarial neural networks; Neural networks (neuroverkot); Machine learning; Skin cancer (ihosyöpä); Nuclear medicine and medical imaging; Adversarial system; Humans; Learning; Reinforcement learning; Artificial neural network; Spectral imaging (spektrikuvaus); Hyperspectral imaging; Pattern recognition; Imaging (kuvantaminen); Artificial intelligence

Towards safe reinforcement-learning in industrial grid-warehousing

2020

Reinforcement learning has proven profoundly successful at learning optimal policies for simulated environments using distributed training with extensive compute capacity. Model-free reinforcement learning relies on trial and error, where error is a vital part of teaching the agent to behave optimally. In mission-critical, real-world environments there is little tolerance for failure, and failures can damage humans and equipment. In these environments, current state-of-the-art reinforcement learning approaches are not sufficient to learn optimal control policies safely. On the other hand, model-based reinforcement learning tries to encode environment tra…
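The trial-and-error loop that the abstract attributes to model-free reinforcement learning can be illustrated with tabular Q-learning. This is a generic sketch rather than the paper's method, and the env.reset()/env.step() interface is an assumption.

import numpy as np

# Tabular Q-learning sketch: the agent improves only by acting and observing
# rewards, which is why exploratory mistakes are unavoidable.
def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration: sometimes try a possibly bad action.
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(q[state]))
            next_state, reward, done = env.step(action)
            # Temporal-difference update toward the observed outcome.
            target = reward + gamma * (0.0 if done else np.max(q[next_state]))
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
    return q

Every exploratory step in this loop risks a costly mistake, which is exactly the safety problem the paper targets in grid-warehousing environments.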

Keywords: Information Systems and Management; Computer science; Sample (statistics); Machine learning; Theoretical Computer Science; Artificial intelligence; Reinforcement learning; VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550; Grid; Optimal control; Autoencoder; Computer Science Applications; Curiosity; Control and Systems Engineering; Software. Published in Information Sciences.

Apparent interfacial shear strength of short-flax-fiber/starch acetate composites

2016

The paper deals with an indirect, industry-friendly method for identification of the interfacial shear strength (IFSS) in a fully bio-based composite. The IFSS of flax fiber/starch acetate is evaluated by a modified Bowyer and Bader method based on an analysis of the stress–strain curve of a short-fiber-reinforced composite in tension. A shear lag model is developed for the tensile stress–strain response of short-fiber-reinforced composites allowing for elastic-perfectly plastic stress transfer. Composites with different fiber volume fractions and a variable content of plasticizer have been analyzed. The apparent IFSS of flax/starch acetate is within the range of 5.5–20.5 MPa, de…
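For orientation, the classical Kelly–Tyson relation that underlies shear-lag treatments of short-fiber composites links the IFSS to the critical fiber length; this is a textbook relation given for context, not the paper's modified Bowyer–Bader model:

    \tau = \frac{\sigma_{fu}\, d}{2\, l_c},

where \tau is the interfacial shear strength, \sigma_{fu} the fiber strength, d the fiber diameter, and l_c the critical fiber length below which the fiber cannot be loaded to its failure stress.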

Keywords: Interfacial shear strength; Apparent interfacial shear strength; Materials science; Polymers and Plastics; General Chemical Engineering; Sheet molding compounds; Green composites; Biomaterials; Flax fiber; Flax; Yarn; Plasticizers; Ultimate tensile strength; Composite material; Thermoplastic starch; Fiber volume fractions; Elastic-perfectly plastic; Stress–strain curves; Polymer; Fiber-reinforced plastics; Reinforcement; Fibers; Short-fiber-reinforced composites; Adhesive; Linen. Published in International Journal of Adhesion and Adhesives.

RDF* Graph Database as Interlingua for the TextWorld Challenge

2019

This paper briefly describes the top-scoring submission to the First TextWorld Problems: A Reinforcement and Language Learning Challenge. To alleviate the partial observability problem characteristic of the TextWorld games, we split the Agent into two independent components, the Observer and the Actor, communicating only via the Interlingua of the RDF* graph database. The RDF* graph database serves as the “world model” memory, incrementally updated by the Observer via FrameNet-informed Natural Language Understanding techniques, and is used by the Actor for efficient exploration and planning of the game Action sequences. We find that the deep-learning approach works best for the Observer componen…
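The Observer/Actor split can be caricatured as below. This is a hypothetical Python sketch: the class names, the toy text extraction, and the plain set of triples standing in for the RDF* graph database are all invented for illustration.

# Hypothetical sketch: the Observer writes (subject, predicate, object) facts
# into a shared world model, and the Actor plans from that memory alone.
class WorldModel:
    def __init__(self):
        self.triples = set()
    def add(self, triple):
        self.triples.add(triple)
    def objects(self, subject, predicate):
        return [o for s, p, o in self.triples if s == subject and p == predicate]

class Observer:
    def update(self, world, text):
        # Toy stand-in for FrameNet-informed NLU: "you see a knife" -> a triple.
        words = text.lower().split()
        if "see" in words:
            world.add(("player", "sees", words[-1]))

class Actor:
    def act(self, world):
        # Plans only from the world model, never from the raw observation text.
        seen = world.objects("player", "sees")
        return f"take {seen[0]}" if seen else "look"

world = WorldModel()
Observer().update(world, "You see a knife")
print(Actor().act(world))   # -> "take knife"

Keeping the two components independent means the Actor's planning never touches raw game text, which is the decoupling the abstract describes.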

Keywords: Interlingua; Information retrieval; Graph database; Computer science; Backtracking; Deep learning; Natural language understanding; Reinforcement learning; Artificial intelligence; RDF; FrameNet. Published in 2019 IEEE Conference on Games (CoG).

AI for Resource Allocation and Resource Allocation for AI: a two-fold paradigm at the network edge

2022

5G-and-beyond and Internet of Things (IoT) technologies are pushing a shift from the classic cloud-centric view of the network to a new edge-centric vision. In this perspective, computation, communication and storage resources are moved closer to the user, to the benefit of network responsiveness/latency and of improved context-awareness, that is, the ability to tailor network services to the live user's experience. However, these improvements do not come for free: edge networks are highly constrained and do not match the resource abundance of their cloud counterparts. In such a perspective, the proper management of the scarce available resources is of crucial importance to impr…

Keywords: Internet of Things (IoT); MINLP; Edge Network; Performance Evaluation; Low Power Wide Area Network (LPWAN); System Modeling; Settore ING-INF/03 - Telecomunicazioni; UAV; Software Defined Radio (SDR); Real Testbed; Vehicular Network; Machine Learning (ML); LoRa; Reinforcement Learning; Resource Allocation; Game Theory; Artificial Intelligence (AI); Colosseum Channel Emulator; Channel Emulation; Emulation

Explainable Reinforcement Learning with the Tsetlin Machine

2021

The Tsetlin Machine is a recent supervised machine learning algorithm that has obtained competitive results in several benchmarks, both in terms of accuracy and resource usage. It has been used for convolution, classification, and regression, producing interpretable rules. In this paper, we introduce the first framework for reinforcement learning based on the Tsetlin Machine. We combine the value iteration algorithm with the regression Tsetlin Machine, as the value function approximator, to investigate the feasibility of training the Tsetlin Machine through bootstrapping. Moreover, we document the robustness and accuracy of learning on several instances of the grid-world problem.
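The combination described above, value iteration with a regression model as the value-function approximator trained by bootstrapping, can be sketched generically as below. Ordinary least squares stands in for the regression Tsetlin Machine, and the tabular transition model, feature matrix, and function names are assumptions for illustration.

import numpy as np

# Fitted value iteration sketch: a regression model (here plain least squares,
# standing in for the regression Tsetlin Machine) approximates V(s), and each
# sweep refits it to bootstrapped targets r + gamma * max_a E[V(s')].
def fitted_value_iteration(phi, P, R, gamma=0.95, sweeps=50):
    # phi: (S, d) state features, P: (A, S, S) transition probs, R: (A, S) rewards
    S, d = phi.shape
    w = np.zeros(d)                       # weights of the value approximator
    for _ in range(sweeps):
        v = phi @ w                       # current value estimates V(s)
        # Bootstrapped Bellman targets: best expected backup over actions.
        targets = np.max(R + gamma * (P @ v), axis=0)
        # Refit the regressor to the new targets.
        w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return phi @ w                        # final state-value estimates

# Tiny 2-state, 2-action example with identity features:
phi = np.eye(2)
P = np.array([[[1.0, 0.0], [0.0, 1.0]],    # action 0: stay in the current state
              [[0.0, 1.0], [1.0, 0.0]]])   # action 1: switch states
R = np.array([[0.0, 1.0],                  # action 0 reward in each state
              [0.0, 0.0]])                 # action 1 reward in each state
print(fitted_value_iteration(phi, P, R))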

Keywords: Learning automata; Computer science; Bootstrapping; Machine learning; Regression; Convolution; Robustness (computer science); Bellman equation; Reinforcement learning; Markov decision process; Artificial intelligence

A formal proof of the ε-optimality of discretized pursuit algorithms

2015

Learning Automata (LA) can be reckoned to be the founding algorithms on which the field of Reinforcement Learning has been built. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and of these, the family of discretized algorithms has been proven to converge even faster than their continuous counterparts. However, it has recently been reported that the proofs of ε-optimality reported for all of these algorithms over the past three decades are flawed. We applaud the researchers who discovered this flaw and who further proceeded to rectify the proof for the Continuous Pursuit Algorithm (CPA). The latter proof examines the monotonicity property of the proba…
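For context, a discretized pursuit automaton of the kind analysed here moves its action-probability vector in fixed steps of size 1/(rN) toward the action with the best running reward estimate. The sketch below is a generic reward-inaction variant under standard assumptions (Bernoulli environment, invented parameter names), not a reproduction of the paper's algorithms.

import numpy as np

# Discretized Pursuit Automaton sketch: probabilities move in fixed increments
# delta = 1/(r*N) toward the best-estimated action, which is what makes the
# scheme "discretized" rather than continuous.
def discretized_pursuit(reward_prob, n_steps=20000, resolution=1000,
                        init_pulls=10, rng=None):
    rng = rng or np.random.default_rng(0)
    r = len(reward_prob)                  # number of actions
    delta = 1.0 / (r * resolution)        # fixed probability increment
    p = np.full(r, 1.0 / r)               # action-selection probabilities
    wins, pulls = np.zeros(r), np.zeros(r)
    # Seed the reward estimates by trying every action a few times.
    for a in range(r):
        for _ in range(init_pulls):
            pulls[a] += 1
            wins[a] += rng.random() < reward_prob[a]
    for _ in range(n_steps):
        a = rng.choice(r, p=p)
        reward = rng.random() < reward_prob[a]   # Bernoulli environment
        pulls[a] += 1
        wins[a] += reward
        if reward:                               # reward-inaction: update only when rewarded
            best = int(np.argmax(wins / pulls))
            dec = np.minimum(p, delta)           # take at most delta from each action
            dec[best] = 0.0
            p = p - dec
            p[best] += dec.sum()                 # pursue the best-estimated action
    return p

print(discretized_pursuit([0.2, 0.5, 0.8]))   # probability mass should concentrate on the last action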

Keywords: Learning automata; Discretization; Inequality; Basis (linear algebra); Computer science; Field (mathematics); Monotonic function; Mathematical proof; Formal proof; Algebra; Artificial intelligence; Reinforcement learning; Algorithm