AUTHOR
Michael Wand
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
This paper studies a combination of generative Markov random field (MRF) models and discriminatively trained deep convolutional neural networks (dCNNs) for synthesizing 2D images. The generative MRF acts on higher levels of a dCNN feature pyramid, controlling the image layout at an abstract level. We apply the method to both photographic and non-photo-realistic (artwork) synthesis tasks. The MRF regularizer prevents over-excitation artifacts and reduces implausible feature mixtures common to previous dCNN inversion approaches, permitting the synthesis of photographic content with increased visual plausibility. Unlike standard MRF-based texture synthesis, the combined system can both match and adap…
Building Construction Sets by Tiling Grammar Simplification
This paper poses the problem of fabricating physical construction sets from example geometry: A construction set provides a small number of different types of building blocks from which the example model as well as many similar variants can be reassembled. This process is formalized by tiling grammars. Our core contribution is an approach for simplifying tiling grammars such that we obtain physically manufacturable building blocks of controllable granularity while retaining variability, i.e., the ability to construct many different, related shapes. Simplification is performed by sequences of two types of elementary operations: non-local joint edge collapses in the tile graphs reduce the gra…
LeSSS: Learned Shared Semantic Spaces for Relating Multi-Modal Representations of 3D Shapes
In this paper, we propose a new method for structuring multi-modal representations of shapes according to semantic relations. We learn a metric that links semantically similar objects represented in different modalities. First, 3D-shapes are associated with textual labels by learning how textual attributes are related to the observed geometry. Correlations between similar labels are captured by simultaneously embedding labels and shape descriptors into a common latent space in which an inner product corresponds to similarity. The mapping is learned robustly by optimizing a rank-based loss function under a sparseness prior for the spectrum of the matrix of all classifiers. Second, we extend …
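A minimal sketch of the inner-product idea mentioned in the abstract above, assuming a purely linear map into the shared space. The names `shape_map`, `label_embeddings`, and `rank_labels` are hypothetical, and the numbers are toy values; the abstract does not specify how the mapping is parameterized, and the rank-based loss and sparseness prior it mentions are not modeled here.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_labels(shape_descriptor, shape_map, label_embeddings):
    """Rank labels by inner-product similarity to an embedded shape.

    shape_map is an assumed linear map into the shared latent space;
    label_embeddings maps each label name to its assumed latent vector.
    """
    latent_shape = matvec(shape_map, shape_descriptor)
    scores = {name: dot(latent_shape, emb) for name, emb in label_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy usage with made-up values: an identity map and two labels.
identity = [[1.0, 0.0], [0.0, 1.0]]
labels = {"chair": [0.9, 0.1], "table": [0.1, 0.9]}
ranking = rank_labels([1.0, 0.0], identity, labels)
```

In this toy setup, a label whose embedding points in the same direction as the embedded shape descriptor receives the larger inner product and is ranked first.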
Fusion Architectures for Word-Based Audiovisual Speech Recognition
Improving Speaker-Independent Lipreading with Domain-Adversarial Training
We present a lipreading system, i.e., a speech recognition system using only visual features, which uses domain-adversarial training for speaker independence. Domain-adversarial training is integrated into the optimization of a lipreader based on a stack of feedforward and LSTM (Long Short-Term Memory) recurrent neural networks, yielding an end-to-end trainable system which only requires a very small number of frames of untranscribed target data to substantially improve the recognition accuracy on the target speaker. On pairs of different source and target speakers, we achieve a relative accuracy improvement of around 40% with only 15 to 20 seconds of untranscribed target speech data. On mul…
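Domain-adversarial training is commonly implemented with a gradient reversal step between the feature extractor and the domain classifier. The abstract above does not say which mechanism is used, so the following is only a sketch of that common variant on a hypothetical one-parameter model: a scalar feature extractor `w`, a scalar domain classifier `v`, and a squared-error domain loss, all chosen for illustration.

```python
def domain_adversarial_grads(w, v, x, y_domain, lam):
    """Gradients for a toy scalar model with a gradient reversal step.

    feature:           f = w * x
    domain classifier: d = v * f
    domain loss:       L = (d - y_domain) ** 2

    The classifier parameter v receives the ordinary gradient, while the
    gradient flowing back into the feature extractor is multiplied by
    -lam, so that a descent step on w pushes the features toward being
    less informative about the domain (speaker) label.
    """
    f = w * x
    d = v * f
    err = d - y_domain
    loss = err ** 2
    grad_v = 2.0 * err * f            # ordinary gradient for the classifier
    grad_f = 2.0 * err * v            # gradient arriving at the feature
    grad_w = -lam * grad_f * x        # reversed and scaled before reaching w
    return loss, grad_w, grad_v


loss, grad_w, grad_v = domain_adversarial_grads(w=1.0, v=1.0, x=2.0, y_domain=0.0, lam=0.5)
```

With these toy inputs the unreversed gradient with respect to `w` would be positive; the reversal flips its sign, which is the whole effect of the step.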
Adversarial reverse mapping of condensed-phase molecular structures: Chemical transferability
Switching between different levels of resolution is essential for multiscale modeling, but restoring details at higher resolution remains challenging. In our previous study, we introduced deepBackmap, a deep neural-network-based approach to reverse-map equilibrated molecular structures for condensed-phase systems. Our method combines data-driven and physics-based aspects, leading to high-quality reconstructed structures. In this work, we expand the scope of our model and examine its chemical transferability. To this end, we train deepBackmap solely on homogeneous molecular liquids of small molecules, and apply it to a more challenging polymer melt. We augment the generator's objective w…
Adversarial reverse mapping of equilibrated condensed-phase molecular structures
A tight and consistent link between resolutions is crucial to further expand the impact of multiscale modeling for complex materials. We herein tackle the generation of condensed molecular structures as a refinement -- backmapping -- of a coarse-grained structure. Traditional schemes start from a rough coarse-to-fine mapping and perform further energy minimization and molecular dynamics simulations to equilibrate the system. In this study, we introduce DeepBackmap, a deep neural-network-based approach to directly predict equilibrated molecular structures for condensed-phase systems. We use generative adversarial networks to learn the Boltzmann distribution from training data and realize reve…
Deep Learning Predicts Molecular Subtype of Muscle-invasive Bladder Cancer from Conventional Histopathological Slides
Background: Muscle-invasive bladder cancer (MIBC) is the second most common genitourinary malignancy, and is associated with high morbidity and mortality. Recently, molecular subtypes of MIBC have been identified, which have important clinical implications. Objective: In the current study, we tried to predict the molecular subtype of MIBC samples from conventional histomorphology alone using deep learning. Design, setting, and participants: Two cohorts of patients with MIBC were used: (1) The Cancer Genome Atlas Urothelial Bladder Carcinoma dataset including 407 patients and (2) our own cohort including 16 patients with treatment-naive, primary resected MIBC. This resulted in a total …
Inverse procedural modeling of 3D models for virtual worlds
This course presents a collection of state-of-the-art approaches for modeling and editing of 3D models for virtual worlds, simulations, and entertainment, in addition to real-world applications. The first contribution of this course is a coherent review of inverse procedural modeling (IPM) (i.e., proceduralization of provided 3D content). We describe different formulations of the problem as well as solutions based on those formulations. We show that although the IPM framework seems under-constrained, the state-of-the-art solutions actually use simple analogies to convert the problem into a set of fundamental computer science problems, which are then solved by corresponding algorithms or opt…
Benchmarking non-photorealistic rendering of portraits
We present a set of images for helping NPR practitioners evaluate their image-based portrait stylisation algorithms. Using a standard set both facilitates comparisons with other methods and helps ensure that presented results are representative. We give two levels of difficulty, each consisting of 20 images selected systematically so as to provide good coverage of several possible portrait characteristics. We applied three existing portrait-specific stylisation algorithms, two general-purpose stylisation algorithms, and one general learning based stylisation algorithm to the first level of the benchmark, corresponding to the type of constrained images that have often been used in portrait-s…
Deep Non-Line-of-Sight Reconstruction
Recent years have seen a surge of interest in methods for imaging beyond the direct line of sight. The most prominent techniques rely on time-resolved optical impulse responses, obtained by illuminating a diffuse wall with an ultrashort light pulse and observing multi-bounce indirect reflections with an ultrafast time-resolved imager. Reconstruction of geometry from such data, however, is a complex non-linear inverse problem that comes with substantial computational demands. In this paper, we employ convolutional feed-forward networks for solving the reconstruction problem efficiently while maintaining good reconstruction quality. Specifically, we devise a tailored autoencoder architect…
Progressive Stochastic Binarization of Deep Networks
A plethora of recent research has focused on improving the memory footprint and inference speed of deep networks by reducing the complexity of (i) numerical representations (for example, by deterministic or stochastic quantization) and (ii) arithmetic operations (for example, by binarization of weights). We propose a stochastic binarization scheme for deep networks that allows for efficient inference on hardware by restricting itself to additions of small integers and fixed shifts. Unlike previous approaches, the underlying randomized approximation is progressive, thus permitting an adaptive control of the accuracy of each operation at run-time. In a low-precision setting, we match the accu…
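To illustrate what a progressive randomized approximation can look like, here is a generic sketch of unbiased stochastic binarization: a weight in [-1, 1] is replaced by random ±1 samples whose running mean converges to the weight, so accuracy can be traded against the number of samples at run-time. This is a textbook-style illustration, not the scheme from the paper; the function names are hypothetical, and the division used for the running mean is a simplification that ignores the paper's restriction to small-integer additions and fixed shifts.

```python
import random


def stochastic_binarize(w, rng):
    """Return +1 or -1 such that the expected value equals w, for w in [-1, 1]."""
    p_plus = (1.0 + w) / 2.0
    return 1 if rng.random() < p_plus else -1


def progressive_estimates(w, n_samples, rng):
    """Running means of successive binary samples of w.

    Each additional sample costs one integer addition to the running
    total; stopping early yields a coarser estimate, while continuing
    refines it.
    """
    total = 0
    estimates = []
    for k in range(1, n_samples + 1):
        total += stochastic_binarize(w, rng)
        estimates.append(total / k)
    return estimates


rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = progressive_estimates(0.3, 20000, rng)
```

The first entry of `estimates` is always +1 or -1, and later entries drift toward the true weight as more samples are accumulated.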
Approximate 3D Partial Symmetry Detection Using Co-occurrence Analysis
This paper addresses approximate partial symmetry detection in 3D point clouds, a classical and foundational tool for analyzing geometry. We present a novel, fully unsupervised method that detects partial symmetry under significant geometric variability, and without constraints on the number and arrangement of instances. The core idea is a matching scheme that finds consistent co-occurrence patterns in a frame-invariant way. We obtain a canonical partition of the input shape into building blocks and can handle ambiguous data by aggregating co-occurrence information across both the full set of building block instances and the area they cover. We evaluate our method on several benchmark data sets and dem…
Automatic shape detection of ice crystals
Clouds have a crucial impact on the energy balance of the Earth-Atmosphere system. They can cool the system by partly reflecting or scattering the incoming solar radiation (albedo effect); moreover, thermal radiation emitted from the Earth's surface can be absorbed and partly re-emitted by clouds, leading to a warming of the atmosphere (greenhouse effect). The effectiveness of both effects crucially depends on the size and the shape of a cloud's particulate constituents, i.e. liquid water droplets or solid ice crystals. For studying cloud microphysics, in situ measurements on board aircraft are commonly used. An important class of measurement techniques comprises optical ar…
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
This paper proposes Markovian Generative Adversarial Networks (MGANs), a method for training generative networks for efficient texture synthesis. While deep neural network approaches have recently demonstrated remarkable results in terms of synthesis quality, they still come at considerable computational costs (minutes of run-time for low-res images). Our paper addresses this efficiency issue. Instead of the numerical deconvolution used in previous work, we precompute a feed-forward, strided convolutional network that captures the feature statistics of Markovian patches and is able to directly generate outputs of arbitrary dimensions. Such a network can directly decode brown noise to realistic textu…