Conditional Image Generation
with PixelCNN Decoders
Yohei Sugawara
BrainPad Inc.
NIPS 2016 Reading Group (@Preferred Networks)
January 19, 2017
Aäron van den Oord
Nal Kalchbrenner
Oriol Vinyals
Lasse Espeholt
Alex Graves
Koray Kavukcuoglu
Google DeepMind
Preview
【Background: Autoregressive Image Modeling】
- Pixel dependencies = raster scan order
- Autoregressive models sequentially predict pixels rather than predicting the whole image at once (as GAN and VAE do).
【Previous work: Pixel Recurrent Neural Networks】
- A CNN based model and RNN based models are proposed.
  PixelCNN (masked convolution): Pros: easier to parallelize / Cons: bounded dependency
  PixelRNN (Diagonal BiLSTM): Pros: full dependency field / Cons: sequential training
【Proposed approach: Gated PixelCNN】
1. Removal of blind spots in the receptive field by combining the horizontal stack and the vertical stack (vertical maps / horizontal maps).
2. Gated PixelCNN architecture.
3. Conditional PixelCNN: conditioning on a one-hot encoding of the class-label, or on an embedding from a trained model.
4. PixelCNN AutoEncoder: Encoder = convolutional layers, Decoder = conditional PixelCNN layers.
【Experimental Result】
- Data: CIFAR-10 dataset (32x32 color images)
- Performance (Unconditional)
Image Generation Models
- Three image generation approaches are dominating the field:
Variational AutoEncoders (VAE), Generative Adversarial Networks (GAN), and Autoregressive Models
(cf. https://openai.com/blog/generative-models/)
[Figure: VAE: Encoder q_φ(z|x) maps x to z; Decoder samples z ~ p_θ(z) and x ~ p_θ(x|z). GAN: Generator G generates fake samples from z; Discriminator D decides Real/Fake.]
VAE
Pros: efficient inference with approximate latent variables.
Cons: generated samples tend to be blurry.
GAN
Pros: generates sharp images; no need for any Markov chain or approximate networks during sampling.
Cons: difficult to optimize due to unstable training dynamics.
Autoregressive Models
Pros: very simple and stable training process; currently gives the best log likelihood; tractable likelihood.
Cons: relatively inefficient during sampling.
Autoregressive Image Modeling
- Autoregressive models train a network that models the conditional distribution of every individual pixel given the previous pixels (raster scan order dependencies).
⇒ they sequentially predict pixels rather than predicting the whole image at once (as GAN and VAE do).
- For color images, the 3 channels are generated by successive conditioning: blue given red and green, green given red, and red given only the pixels above and to the left of all channels.
[Figure: R, G, B channel dependencies]
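For reference, the factorization used by these models (as written in the PixelRNN paper) is, for an n×n image x:

  p(x) = ∏_{i=1}^{n²} p(x_i | x_1, …, x_{i-1})

and, for each pixel, over the color channels:

  p(x_i | x_{<i}) = p(x_{i,R} | x_{<i}) · p(x_{i,G} | x_{<i}, x_{i,R}) · p(x_{i,B} | x_{<i}, x_{i,R}, x_{i,G})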
Previous work: Pixel Recurrent Neural Networks.
 “Pixel Recurrent Neural Networks” got best paper award at ICML2016.
 They proposed two types of models, PixelRNN and PixelCNN
(two types of LSTM layers are proposed for PixelRNN.)
PixelRNN (Row LSTM, Diagonal BiLSTM)
• LSTM based models are a natural choice for dealing with the autoregressive dependencies.
Pros: effectively handles long-range dependencies ⇒ good performance
Cons: each state needs to be computed sequentially ⇒ computationally expensive
PixelCNN (masked convolution)
• The CNN based model uses masked convolution to ensure the model is causal.
Pros: convolutions are easier to parallelize ⇒ much faster to train
Cons: bounded receptive field ⇒ inferior performance; the blind spot problem (due to the masked convolution) needs to be eliminated.
[Figure: masked 3x3 convolution filter with weights w_11 … w_33, where the weights on not-yet-generated pixels are set to zero]
Details of “Masked Convolution” & “Blind Spot”
 To generate the next pixel, the model can only condition on the previously generated pixels.
 Then, to make sure the CNN can only use information about pixels above and to the left of the current pixel, the filters of the convolution need to be masked.
Case 1D (ex. text, audio, etc.)
 The right figure shows 5x1 convolutional filters after multiplying them by the mask.
 The filters connecting the input layer to the first hidden layer are in this case multiplied by m = (1,1,0,0,0), to ensure the model is causal.
(cf. “Generating Interpretable Images with Controllable Structure”, S. Reed et al., 2016)
[Figure: 5x1 filter, m = (1,1,0,0,0)]
Case 2D (ex. image)
 In the 2D case, PixelCNNs have a blind spot in the receptive field that cannot be used to make predictions.
 The rightmost figure shows the growth of the masked receptive field (3 layered network with 3x3 conv filters).
[Figures: 5x5 filter; 3x3 filter, 3 layers]
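To make the 2D masking concrete, here is a minimal PyTorch-style sketch (my own illustration, not the authors' code; the class name and sizes are arbitrary, and the R/G/B channel-wise part of the mask is omitted):

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')
        kH, kW = self.kernel_size
        mask = torch.ones(kH, kW)
        mask[kH // 2 + 1:, :] = 0                         # rows below the centre
        mask[kH // 2, kW // 2 + (mask_type == 'B'):] = 0  # centre row: 'A' blocks the centre too
        self.register_buffer('mask', mask[None, None])    # broadcast over channels

    def forward(self, x):
        self.weight.data *= self.mask                     # keep only the causal taps
        return super().forward(x)

# ex.: a first-layer convolution uses mask 'A' (the centre pixel itself is blocked)
conv = MaskedConv2d('A', in_channels=3, out_channels=16, kernel_size=3, padding=1)
out = conv(torch.randn(1, 3, 32, 32))                     # -> (1, 16, 32, 32)
```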
Proposed approach: Gated PixelCNN
 In this paper, they propose an improved version of PixelCNN.
 The major improvements are as follows.
1. Removal of blind spots in the receptive field by combining the horizontal stack and the vertical stack.
2. Replacement of the ReLU activations between the masked convolutions in the original PixelCNN with the gated activation unit.
3. Given a latent vector, they model the conditional distribution of images (conditional PixelCNN):
a. conditioning on a class-label
b. conditioning on an embedding from a trained model
4. Starting from a convolutional auto-encoder, they replaced the deconvolutional decoder with a conditional PixelCNN, named the PixelCNN Auto-Encoder.
First improvement: horizontal stack and vertical stack
 The removal of blind spots in the receptive field is important for PixelCNN's performance, because the blind spot can cover as much as a quarter of the potential receptive field.
 The vertical stack conditions on all rows above the current row.
 The horizontal stack conditions on the current row.
- Details about the implementation techniques are described below.
Second improvement: Gated activation and architecture
 Gated activation unit:
 Single layer block of a Gated PixelCNN
(σ: sigmoid, k: layer index, ⦿: element-wise product, *: convolution operator)
- Masked convolutions are shown in green.
- Element-wise operations are shown in red.
- Convolutions with Wf, Wg are combined into a single operation shown in blue.
[Figure: single-layer block with vertical input v and output v′, horizontal input h and output h′, and intermediate maps v_int and h_int. v = vertical activation maps, h = horizontal activation maps.]
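In the paper's notation (with ⦿ the element-wise product as above), the gated activation unit is:

  y = tanh(W_{k,f} * x) ⦿ σ(W_{k,g} * x)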
Details of Gated PixelCNN architecture
 Break down the operations into four steps.
① Calculate vertical feature maps
… n×n convolutions are calculated with gated activation.
Input: v ( = input image if 1st layer)
Output: v′
Two types of equivalent implementation (ex. n=3):
- 3x3 masked filters applied to the feature map, with (1,1,1,1) zero padding
- 2x3 filters applied to the feature map, with (1,0,1,1) zero padding
Next problem: in this case, the (i, j)-th pixel depends on the (i, j+k)-th (future) pixels.
Solution: shift down the vertical feature maps when feeding them into the horizontal stack.
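As an illustration of the second ('2x3 filter + asymmetric zero padding') variant for n = 3, a PyTorch-style sketch (my own; note that F.pad takes padding as (left, right, top, bottom), so the tuple below is not written in the same order as on the slide, and the gated activation is omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n = 3
# 2x3 filter: each output location sees the current row and the row above it
vconv = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(n // 2 + 1, n))

def vertical_conv(v):
    v = F.pad(v, (n // 2, n // 2, n // 2, 0))  # pad left/right by 1, top by 1, bottom by 0
    return vconv(v)                            # spatial size is preserved

print(vertical_conv(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```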
Details of Gated PixelCNN architecture
② Feed vertical maps into horizontal stack
1. n x n masked convolution
2. shift-down operation (as below)
3. 1 x 1 convolution
Input: v ( = input image if 1st layer)
Output: v_int
Shift down the vertical feature maps when feeding them into the horizontal stack:
1. Add zero padding on the top
2. Crop the bottom
 These operations can be interpreted as follows: without the shift the receptive field would violate causality; after the shift, causality is ensured.
[Figure: feature map with a row of zeros (0 0 0 … 0 0) added at the top and the bottom row cropped]
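A short PyTorch-style sketch of this shift-down operation (my own illustration):

```python
import torch.nn.functional as F

def shift_down(v):
    # 1. add one row of zeros on top, 2. crop the bottom row; the map at row i
    # then only carries information from rows above i, which ensures causality
    return F.pad(v, (0, 0, 1, 0))[:, :, :-1, :]
```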
Details of Gated PixelCNN architecture
③ Calculate horizontal feature maps
… 1×n convolutions are calculated with gated activation.
(vertical maps are added before activation.)
Input: h, v_int ( = input image if 1st layer)
Output: h_int
Two types of equivalent implementation (ex. n=3):
- 1x3 masked filters applied to the feature map, with (0,0,1,1) zero padding
- 1x2 filters applied to the feature map, with (0,0,1,0) zero padding
Next problem: ➢ Mask ‘A’ vs Mask ‘B’
- Mask ‘A’ (restricts the connection from the current pixel itself) is applied only to the first convolution.
- Mask ‘B’ (allows the connection from itself) is applied to all the subsequent convolutions.
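A PyTorch-style sketch of the second ('1x2 filter') variant with mask 'B' for n = 3 (my own illustration; the addition of the vertical maps and the gated activation are omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

hconv = nn.Conv2d(16, 16, kernel_size=(1, 2))  # left neighbour + current pixel

def horizontal_conv_mask_B(h):
    h = F.pad(h, (1, 0, 0, 0))  # one zero column on the left, none on the right
    return hconv(h)             # width is preserved; nothing right of the pixel is seen

print(horizontal_conv_mask_B(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```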
Details of Gated PixelCNN architecture
④ Calculate residual connection in horizontal stack
… 1×1 convolutions are calculated without gated activation;
then the maps are added to the horizontal maps of the layer's input.
Input: h_int, h (input image if 1st layer)
Output: h′
③ Calculate horizontal feature maps (cont.)
- Mask ‘A’ can be implemented as below (ex. n=3): 1x1 filters applied to the feature map with (0,0,1,0) zero padding, i.e.
[Convolution] then [Crop the right].
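A PyTorch-style sketch of these two pieces (my own illustration with hypothetical names): the pad-convolve-crop trick that realizes mask 'A' (the current pixel itself is excluded), and the 1x1 residual connection of step ④:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv_a = nn.Conv2d(16, 16, kernel_size=1)    # the 1x1 filter from the figure above
res_1x1 = nn.Conv2d(16, 16, kernel_size=1)

def mask_A_shift(h):
    # pad one zero column on the left, convolve, then crop the rightmost column:
    # the output at column j now depends only on columns strictly left of j
    return conv_a(F.pad(h, (1, 0, 0, 0)))[:, :, :, :-1]

def horizontal_residual(h_in, h_gated):
    # step 4: a plain 1x1 convolution (no gating) added back to the layer's input h
    return h_in + res_1x1(h_gated)
```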
Output layer and whole architecture
 Output layer
 Using a softmax on discrete pixel values ([0-255] = 256-way) instead of a mixture density approach (same approach as PixelRNN).
 Even without prior information about the meaning or relations of the 256 color categories, the distributions predicted by the model are meaningful.
 Whole architecture
[Figure: Input of size (width) x (height) x (channels) → stacked Gated PixelCNN layers with (width) x (height) x p feature maps each → additional 1x1 conv layers → output]
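A PyTorch-style sketch of this 256-way softmax head (my own illustration; a real sampler re-runs the network pixel by pixel in raster order, and the per-channel R/G/B handling is omitted):

```python
import torch
import torch.nn as nn

p = 128                                   # feature maps from the last gated layer
to_logits = nn.Conv2d(p, 256, kernel_size=1)

features = torch.randn(1, p, 32, 32)      # output of the stacked Gated PixelCNN layers
logits = to_logits(features)              # (1, 256, 32, 32): one categorical per pixel
log_probs = torch.log_softmax(logits, dim=1)          # used for the bits/dim likelihood
pixels = torch.distributions.Categorical(
    logits=logits.permute(0, 2, 3, 1)).sample()       # (1, 32, 32), values in 0..255
```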
Third and fourth improvements: conditional PixelCNN & PixelCNN AutoEncoder
 Conditional PixelCNN
 They model the conditional distribution by adding terms that depend on h to the activations before the nonlinearities.
[Table: gated activation unit of the original model vs. the conditional model]
 PixelCNN AutoEncoder
 Starting from a convolutional auto-encoder, they replaced the deconvolutional decoder with a conditional PixelCNN.
Encoder: convolution layers. Decoder: deconvolution layers ⇒ conditional PixelCNN layers.
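For reference, the conditional gated activation from the paper adds an h-dependent term inside both nonlinearities (V_{k,f}, V_{k,g} are learned weights; h is e.g. the one-hot class label or the portrait embedding):

  y = tanh(W_{k,f} * x + V_{k,f}^T h) ⦿ σ(W_{k,g} * x + V_{k,g}^T h)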
Experimental Results (Unconditional)
 Data: CIFAR-10 dataset
 Score: negative log-likelihood (bits/dim)
 Gated PixelCNN outperforms the PixelCNN by 0.11 bits/dim, which has a very significant effect on visual quality, and is close to the performance of PixelRNN.
 Data: ImageNet dataset
 Gated PixelCNN outperforms PixelRNN.
 It achieves similar performance to the PixelRNN in less than half the training time.
Experimental Results (Conditional)
 Conditioning on ImageNet classes
 Given a one-hot encoding h_i for the i-th class, model p(x | h_i).
 Conditioning on Portrait Embeddings (part of the results)
 Embeddings are taken from the top layer of a conv network trained on a large database of portraits from Flickr images.
 After the supervised net was trained, {x: image, h: embedding} tuples were taken and a conditional PixelCNN was trained to model p(x | h).
 Given a new image of a person that was not in the training set, they computed h and generated new portraits of the same person.
 They also experimented with reconstructions conditioned on linear interpolations between embeddings of pairs of images.
Experimental Results (PixelCNN Auto-Encoder)
 Data: 32x32 ImageNet patches
(Left to right: original image, reconstruction by auto-encoder, conditional samples from PixelCNN auto-encoder)
(m: dimensionality of the bottleneck)
Summary & Reference
 Summary
 Improved PixelCNN:
 Same performance as PixelRNN, but faster (easier to parallelize)
 Fixed “blind spots” problem
 Gated activation units
 Conditional Generation:
 Conditioned on class-label
 Conditioned on portrait embedding
 PixelCNN AutoEncoders
 References
[1] Aäron van den Oord et al., “Conditional Image Generation with PixelCNN Decoders”,
NIPS 2016
[2] Aäron van den Oord et al., “Pixel Recurrent Neural Networks”, ICML 2016 (Best Paper Award)
[3] S. Reed, A. van den Oord et al., “Generating Interpretable Images with Controllable Structure”,
Under review as a conference paper at ICLR 2017
Appendix – Progress of research related to this paper
 Applied to other domains
➢ “WaveNet: A Generative Model for Raw Audio”, A. van den Oord et al. (DeepMind)
 The conditional probability distribution is modelled by a stack of dilated causal convolutional layers with gated activation units.
➢ “Video Pixel Networks”, Nal Kalchbrenner, A. van den Oord et al. (DeepMind)
 The architecture of the generative video model consists of two parts: 1. resolution-preserving CNN encoders, 2. PixelCNN decoders.
➢ “Language Modeling with Gated Convolutional Networks”, Yann N. Dauphin et al. (Facebook AI Research)
 A new language model that replaces the recurrent connections typically used in RNNs with gated temporal convolutions.
➢ “Generating Interpretable Images with Controllable Structure”, S. Reed, A. van den Oord et al. (Google DeepMind), Under review as a conference paper at ICLR 2017
 Text-to-image synthesis (generating images from captions and other structure) using the gated conditional PixelCNN model.
Appendix – Progress of research related to this paper
 Modifications of the Gated PixelCNN model
➢ “PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications”, Tim Salimans, Andrej Karpathy, et al. (OpenAI), Under review as a conference paper at ICLR 2017
 A number of modifications to the original gated PixelCNN model:
1. Use a discretized logistic mixture likelihood, rather than a 256-way softmax.
2. Condition on whole pixels, rather than R/G/B sub-pixels.
3. Use downsampling.
4. Introduce additional short-cut connections (as in U-Net).
5. Regularize the model using dropout.
➢ “PixelVAE: A Latent Variable Model for Natural Images”, Ishaan Gulrajani, Kundan Kumar et al., Under review as a conference paper at ICLR 2017
 A VAE model with an autoregressive decoder based on PixelCNN.