Brief overview

The goal of exemplar-based texture synthesis is to generate texture images that are visually similar to a given exemplar. Recently, promising results have been reported by methods relying on convolutional neural networks (ConvNets) pretrained on large-scale image datasets. However, these methods have difficulty synthesizing textures with non-local structures and do not extend readily to dynamic or sound textures. In this paper, we present a conditional generative ConvNet (cgCNN) model that combines deep statistics with the probabilistic framework of the generative ConvNet (gCNN) model. Given a texture exemplar, cgCNN defines a conditional distribution using the deep statistics of a ConvNet, and synthesizes new textures by sampling from this distribution. In contrast to previous deep texture models, cgCNN does not rely on pretrained ConvNets but instead learns the weights of the ConvNet for each input exemplar. As a result, it can synthesize high-quality dynamic, sound and image textures in a unified manner. We also explore the theoretical connections between our model and other texture models. Further investigations show that cgCNN can be easily generalized to texture expansion and inpainting. Extensive experiments demonstrate that our model achieves results that are better than, or at least comparable to, those of state-of-the-art methods.
|
Introduction |
|
Exemplar-based texture synthesis (EBTS) has been a dynamic yet challenging topic in computer vision and graphics over the past decades; it aims to produce new texture samples that are visually similar to a given exemplar. Recently, deep ConvNets have been used in texture modelling. These models employ deep ConvNets pretrained on large-scale image datasets as feature extractors, and generate new samples by seeking images whose deep features match those of the exemplar under a chosen similarity measure. We propose a new texture model, named conditional generative ConvNet (cgCNN), that integrates deep texture statistics with the probabilistic framework of the generative ConvNet (gCNN). Unlike previous texture models that rely on pretrained ConvNets, cgCNN learns the weights of the ConvNet for each input exemplar. It therefore has two main advantages: it does not depend on ConvNets pretrained on large-scale datasets, and it can synthesize dynamic, sound and image textures in a unified manner.
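The two ingredients just described, deep statistics and synthesis by sampling, can be sketched with a toy numerical example. The sketch below computes Gram-matrix and spatial-mean statistics of a feature map, a statistics-matching loss, and one Langevin-style update. It is an illustrative stand-in, not the paper's implementation: the names `gram_stats`, `mean_stats`, `texture_loss` and `langevin_step` are our own, and a real cgCNN would compute these statistics on the feature maps of a learned ConvNet and obtain gradients by backpropagation.

```python
import numpy as np

def gram_stats(f):
    """Gram matrix of a feature map f with shape (C, H, W):
    channel-by-channel correlations, normalized by spatial size."""
    c, h, w = f.shape
    flat = f.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def mean_stats(f):
    """Spatially averaged channel activations, shape (C,)."""
    return f.mean(axis=(1, 2))

def texture_loss(feats_x, feats_ref, stats=gram_stats):
    """Sum over layers of squared differences between statistics of the
    sample's features and those of the exemplar's features."""
    return sum(np.sum((stats(fx) - stats(fr)) ** 2)
               for fx, fr in zip(feats_x, feats_ref))

def langevin_step(x, grad, step=0.01, rng=None):
    """One Langevin update: x <- x - (step/2) * dE/dx + sqrt(step) * noise,
    i.e. noisy gradient descent on the energy E defined by the loss."""
    rng = rng or np.random.default_rng(0)
    return x - 0.5 * step * grad + np.sqrt(step) * rng.standard_normal(x.shape)
```

Normalizing the Gram matrix by the spatial size `h * w` makes the statistics comparable across images of different sizes, which is what allows synthesis (and expansion) at resolutions other than the exemplar's.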
|
Bounded constraint

We compare the results using different activation functions, i.e. hard sigmoid, tanh and sigmoid. The results of hard sigmoid are best.

[Figure: rows — Exemplar, hard tanh, tanh, sigmoid.]
Image texture synthesis

Image texture synthesis using c-cgCNN. All exemplars and synthesized images are of size 256 × 256.

[Figure: columns — Exemplar, Gatys' model, gCNN, CoopNet, self-tuning, c-cgCNN-Gram, c-cgCNN-mean.]
Sound texture synthesis

Sound texture synthesis using c-cgCNN. All exemplars and synthesized textures have 50000 data points (~2 seconds).

[Figure: columns — Air conditioner, Pneumatic drill, Applause, Frog; rows — Exemplar, McDermott's model, Antognini's model, c-cgCNN-mean, c-cgCNN-Gram.]
Sound texture expansion

Expanding sound textures using f-cgCNN. Exemplars have 16384 data points (less than 1 second), and synthesized textures have 122880 data points (~5 seconds).

[Figure: columns — bees, shaking paper, wind; rows — Exemplar, f-cgCNN.]
Sound texture inpainting

Sound texture inpainting using our method. The corrupted exemplars have 50000 data points, and the masks cover the intervals from the 20000-th to the 30000-th data point. No post-processing is used.

[Figure: columns — bees, helicopter, steam, insects; rows — Corrupted exemplar, Inpainted.]
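The masked-optimization idea behind inpainting can be sketched with a toy 1-D example: only the data points under the mask are updated by gradient descent on a statistics-matching energy, while the known samples stay fixed. This is an illustrative stand-in, not the paper's method: simple mean/variance statistics replace the deep ConvNet statistics, and `inpaint_1d` is a hypothetical name.

```python
import numpy as np

def inpaint_1d(signal, mask, n_iters=500, step=0.5, rng=None):
    """Fill the masked interval of a 1-D signal by gradient descent on
    E(x) = (mean(x) - m)^2 + (var(x) - v)^2, where m, v are the mean and
    variance of the known samples. Only masked entries are updated."""
    rng = rng or np.random.default_rng(0)
    known = ~mask
    m, v = signal[known].mean(), signal[known].var()
    x = signal.astype(float).copy()
    # initialize the missing interval with noise matching the known statistics
    x[mask] = m + np.sqrt(v) * rng.standard_normal(mask.sum())
    n = x.size
    for _ in range(n_iters):
        xb, xv = x.mean(), x.var()
        # dE/dx_i = 2(mean - m)/n + 2(var - v) * 2(x_i - mean)/n
        grad = 2.0 * (xb - m) / n + 2.0 * (xv - v) * 2.0 * (x - xb) / n
        x[mask] -= step * grad[mask]   # known samples are never touched
    return x
```

Because the energy is evaluated on the full signal while only the masked entries move, the filled-in region is driven to be statistically consistent with its surroundings, which is the same principle the cgCNN inpainting objective follows with deep statistics.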