
RESEARCH PRODUCT

The Dreaming Variational Autoencoder for Reinforcement Learning Environments

Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo

subject

FOS: Computer and information sciences; Machine learning (Maskinlæring); Computer Science - Machine Learning; VDP::Computer technology: 551; Artificial Intelligence (cs.AI); VDP::Datateknologi: 551; Computer Science - Artificial Intelligence; Machine learning; Deep learning; Machine Learning (cs.LG)

description

Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms because they provide a flexible, reproducible, and easy-to-control environment. However, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents The Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges the DVAE in partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We present initial findings and encourage further work in reinforcement learning driven by generative exploration.
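
To make the general idea concrete, the PyTorch sketch below shows a variational autoencoder used as a generative model of environment transitions: it encodes a (state, action) pair into a latent code and decodes a predicted next state. This is a minimal, non-authoritative illustration of the approach the abstract describes; the class name TransitionVAE, the layer sizes, and the training objective are assumptions made here for illustration and do not reproduce the paper's DVAE architecture.

# Minimal sketch (PyTorch) of a VAE-based transition model.
# All names and dimensions below are illustrative assumptions,
# not the DVAE implementation from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransitionVAE(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=32):
        super().__init__()
        # Encoder maps (state, action) to the parameters of a Gaussian latent.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + action_dim, 128), nn.ReLU())
        self.mu_head = nn.Linear(128, latent_dim)
        self.logvar_head = nn.Linear(128, latent_dim)
        # Decoder maps a latent sample to a predicted next state.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, state_dim))

    def forward(self, state, action):
        h = self.encoder(torch.cat([state, action], dim=-1))
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

def vae_loss(pred_next, next_state, mu, logvar):
    # Reconstruction error on the predicted next state plus the
    # standard KL divergence to a unit Gaussian prior.
    recon = F.mse_loss(pred_next, next_state, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

Once trained, a model of this kind can decode latent samples without querying the real environment, which is the sense in which such a generative model can "dream" rollouts to support exploration in sparse-feedback settings.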

10.1007/978-3-030-04191-5_11
http://hdl.handle.net/11250/2596208