Introduction to Diffusion Models
2022.01.03.
KAIST ALIN-LAB
Sangwoo Mo
1
• Diffusion model is SOTA on image generation
• Beat BigGAN and StyleGAN on high-resolution images
Diffusion Model Boom!
2
Dhariwal & Nichol. Diffusion Models Beat GANs on Image Synthesis. NeurIPS’21
• Diffusion model is SOTA on density estimation
• Beat autoregressive models on likelihood score
Diffusion Model Boom!
3
Song et al. Maximum Likelihood Training of Score-Based Diffusion Models. NeurIPS’21
Kingma et al. Variational Diffusion Models. NeurIPS’21
• Diffusion model is useful for image editing
• Editing = Rough scribble + diffusion (i.e., naturalization)
• Scribbled images are unseen for GANs, but diffusion models still can denoise them
Diffusion Model Boom!
4
Meng et al. SDEdit: Image Synthesis and Editing with Stochastic Differential Equations. arXiv’21
• Diffusion model is useful for image editing
• It can also be combined with vision-and-language models
Diffusion Model Boom!
5
Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. arXiv’21
• Diffusion model is also effective for non-visual domains
• Continuous domains like speech, and even for discrete domains like text
Diffusion Model Boom!
6
Kong et al. DiffWave: A Versatile Diffusion Model for Audio Synthesis. ICLR’21
Austin et al. Structured Denoising Diffusion Models in Discrete State-Spaces. NeurIPS’21
• Trilemma of generative models: Quality vs. Diversity vs. Speed
• Diffusion model produces diverse and high-quality samples, but generation is slow
Diffusion Model is All We Need?
7
Xiao et al. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs. arXiv’21
• Today’s content
• Diffusion Probabilistic Model – ICML’15
• Denoising Diffusion Probabilistic Model (DDPM) – NeurIPS’20
• Improve quality & diversity of diffusion model
• Denoising Diffusion Implicit Model (DDIM) – ICLR’21
• Improve generation speed of diffusion model
• Not covering
• Relation of diffusion model and score matching
• Extension to stochastic differential equation
• There are lots of new interesting works (see NeurIPS’21, ICLR’22)
Outline
8
Score SDE: Song et al. Score-Based Generative Modeling through Stochastic Differential Equations. ICLR’21
→ See Score SDE (ICLR’21)
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ The sample x_0 converges to pure noise x_T (e.g., ∼ N(0, I))
Diffusion Probabilistic Model
9
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
Forward (diffusion) process
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ The sample x_0 converges to pure noise x_T (e.g., ∼ N(0, I))
• Reverse step: Recover the original sample from the noise
→ Note that it is the “generation” procedure
Diffusion Probabilistic Model
10
Reverse process
Forward (diffusion) process
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ Technically, it is a product of conditional noise distributions q(x_t | x_{t-1})
• Usually, the parameters β_t are fixed (one can learn them jointly, but it is not beneficial)
• Noise annealing (i.e., an increasing forward schedule β_{t-1} < β_t, so the noise scale shrinks along the reverse direction) is crucial to the performance
Diffusion Probabilistic Model
11
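The forward process above can be sketched in a few lines of numpy. This is a toy illustration, not code from the paper: the function names are mine, and the schedule values follow DDPM's common linear choice.

```python
import numpy as np

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    # Fixed, increasing noise scales beta_1 < ... < beta_T (linear schedule).
    return np.linspace(beta_start, beta_end, T)

def forward_step(x_prev, beta_t, rng):
    # One step of q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) x_{t-1}, beta_t I):
    # slightly shrink the previous sample, then add fresh Gaussian noise.
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * rng.standard_normal(x_prev.shape)

rng = np.random.default_rng(0)
betas = linear_beta_schedule(T=1000)
x = rng.standard_normal(16)  # stand-in for a flattened image x_0
for beta_t in betas:
    x = forward_step(x, beta_t, rng)
# After T steps the sample is (approximately) pure noise, x_T ~ N(0, I).
```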
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ Technically, it is a product of conditional noise distributions q(x_t | x_{t-1})
• Reverse step: Recover the original sample from the noise
→ It is also a product of conditional denoising distributions p_θ(x_{t-1} | x_t)
• Use the learned parameters: denoiser mean μ_θ (main part) and covariance Σ_θ
Diffusion Probabilistic Model
12
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
• Reverse step: Recover the original sample from the noise
• Training: Minimize the variational bound on the negative log-likelihood of the model p_θ(x_0)
Diffusion Probabilistic Model
13
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
• Reverse step: Recover the original sample from the noise
• Training: Minimize the variational bound on the negative log-likelihood of the model p_θ(x_0)
→ It can be decomposed into step-wise losses (one for each step t)
Diffusion Probabilistic Model
14
• Diffusion model aims to learn the reverse of noise generation procedure
• Training: Minimize the variational bound on the negative log-likelihood of the model p_θ(x_0)
→ It can be decomposed into step-wise losses (one for each step t)
• Here, the true reverse step q(x_{t-1} | x_t, x_0) can be computed in closed form from β_t
• Note that we only define the true forward step q(x_t | x_{t-1})
• Since all distributions above are Gaussian, the KL divergences are tractable
Diffusion Probabilistic Model
15
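The closed-form true reverse step can be written out explicitly. A sketch of the standard Gaussian-posterior identities (variable names are mine), with α_t = 1 − β_t and ᾱ_t the running product:

```python
import numpy as np

# Schedule constants: alpha_t = 1 - beta_t, alpha_bar_t = prod_{s<=t} alpha_s.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def true_reverse_params(t, x_t, x_0):
    # q(x_{t-1} | x_t, x_0) = N(mu_tilde, beta_tilde * I): a Gaussian whose
    # parameters depend only on the fixed beta schedule (no learning needed).
    a_bar_t, a_bar_prev = alpha_bars[t], alpha_bars[t - 1]
    coef_x0 = np.sqrt(a_bar_prev) * betas[t] / (1.0 - a_bar_t)
    coef_xt = np.sqrt(alphas[t]) * (1.0 - a_bar_prev) / (1.0 - a_bar_t)
    mu_tilde = coef_x0 * x_0 + coef_xt * x_t
    beta_tilde = (1.0 - a_bar_prev) * betas[t] / (1.0 - a_bar_t)
    return mu_tilde, beta_tilde
```

Since both this posterior and the learned p_θ(x_{t-1} | x_t) are Gaussian, each KL term of the bound reduces to a simple closed-form expression.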
• Diffusion model aims to learn the reverse of noise generation procedure
• Network: Use the image-to-image translation (e.g., U-Net) architectures
• Recall that the input is x_t and the output is x_{t-1}; both are images
• It is expensive since both the input and output are high-dimensional
• Note that the denoiser μ_θ(x_t, t) shares weights across steps, but is conditioned on the step t
Diffusion Probabilistic Model
16
* Image from the pix2pix-HD paper
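One common way to condition the shared-weight denoiser on the step t is a Transformer-style sinusoidal embedding fed into the U-Net blocks. A minimal sketch (the function name is mine):

```python
import numpy as np

def timestep_embedding(t, dim):
    # Sinusoidal embedding of the integer step t; the resulting vector is
    # typically passed through an MLP and added inside each U-Net block,
    # letting a single network share weights across all steps t.
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

emb = timestep_embedding(10, 64)  # a 64-dim code for step t = 10
```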
• Diffusion model aims to learn the reverse of noise generation procedure
• Sampling: Draw a random noise x_T, then iteratively apply the reverse step p_θ(x_{t-1} | x_t)
• It often requires hundreds of reverse steps (very slow)
• Early and late steps change the high- and low-level attributes, respectively
Diffusion Probabilistic Model
17
* Image from the DDPM paper
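The sampling procedure is a plain loop over the learned reverse steps. A sketch with a toy stand-in for the trained denoiser μ_θ:

```python
import numpy as np

def sample(mu_theta, sigmas, T, shape, rng):
    # Ancestral sampling: draw x_T ~ N(0, I), then repeatedly apply the
    # learned reverse step p_theta(x_{t-1} | x_t) = N(mu_theta(x_t, t), sigma_t^2 I).
    x = rng.standard_normal(shape)
    for t in range(T, 0, -1):
        noise = rng.standard_normal(shape) if t > 1 else 0.0  # no noise at the final step
        x = mu_theta(x, t) + sigmas[t - 1] * noise
    return x

# Toy denoiser that just shrinks its input (a trained U-Net would go here).
rng = np.random.default_rng(0)
x0 = sample(lambda x, t: 0.9 * x, sigmas=np.full(10, 0.1), T=10, shape=(16,), rng=rng)
```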
• DDPM reparametrizes the reverse distributions of diffusion models
• Key idea: The original reverse step creates the denoised mean μ_θ(x_t, t) entirely from x_t
• However, x_{t-1} and x_t share most information, and thus this is redundant
→ Instead, predict the residual noise ε_θ(x_t, t) and combine it with the original x_t
Denoising Diffusion Probabilistic Model (DDPM)
18
Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
• DDPM reparametrizes the reverse distributions of diffusion models
• Key idea: The original reverse step creates the denoised mean μ_θ(x_t, t) entirely from x_t
• However, x_{t-1} and x_t share most information, and thus this is redundant
→ Instead, predict the residual noise ε_θ(x_t, t) and combine it with the original x_t
• Formally, DDPM reparametrizes the learned reverse distribution as1
and the step-wise objective L_{t-1} can be reformulated as2
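The two equations referenced here appeared as images on the original slide; in standard DDPM notation (with α_t = 1 − β_t and ᾱ_t = ∏_{s≤t} α_s), they are:

```latex
% Reparametrized reverse mean: predict the noise, then subtract it
\mu_\theta(\mathbf{x}_t, t)
  = \frac{1}{\sqrt{\alpha_t}}
    \left( \mathbf{x}_t
    - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(\mathbf{x}_t, t) \right)

% Simplified step-wise objective: regress the forward noise directly
L_{t-1}^{\text{simple}}
  = \mathbb{E}_{\mathbf{x}_0,\,\epsilon}
    \left[ \left\| \epsilon
    - \epsilon_\theta\!\left( \sqrt{\bar\alpha_t}\,\mathbf{x}_0
      + \sqrt{1-\bar\alpha_t}\,\epsilon,\; t \right) \right\|^2 \right]
```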
Denoising Diffusion Probabilistic Model (DDPM)
19
1. α_t and ᾱ_t are constants determined by β_t
2. Note that we need no “intermediate” samples; we only compare the forward noise ε and the reverse noise ε_θ conditioned on x_0
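Training then reduces to a single regression per sampled step: form x_t directly from x_0 in one shot via q(x_t | x_0) = N(√ᾱ_t x_0, (1−ᾱ_t)I), and regress the noise. A toy sketch, where the lambda stands in for a trained ε_θ network:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def ddpm_loss(eps_theta, x_0, rng):
    # Sample a random step t and a noise eps, jump straight from x_0 to x_t
    # (no intermediate samples needed), and compare forward vs. predicted noise.
    t = int(rng.integers(T))
    eps = rng.standard_normal(x_0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x_0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return np.mean((eps - eps_theta(x_t, t)) ** 2)

rng = np.random.default_rng(0)
loss = ddpm_loss(lambda x_t, t: np.zeros_like(x_t), rng.standard_normal(16), rng)
```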
• DDPM initiated the diffusion model boom
• Achieved SOTA on CIFAR-10, with high-resolution scalability
• It produces more diverse samples than GANs (no mode collapse)
Denoising Diffusion Probabilistic Model (DDPM)
20
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Motivation:
• Diffusion model is slow due to the iterative procedure
• GAN/VAE creates the sample with a one-shot forward pass
• ⇒ Can we combine the advantages for fast sampling of diffusion models?
• Technical spoiler:
• Instead of naïvely applying diffusion model upon GAN/VAE,
DDIM proposes a principled approach of rough sketch + refinement
Denoising Diffusion Implicit Model (DDIM)
21
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea:
• Given x_t, generate the rough sketch x_0, then refine with p_θ(x_{t-1} | x_t, x_0)1
• Unlike the original diffusion model, it is not a Markovian structure
Denoising Diffusion Implicit Model (DDIM)
22
1. Recall that the original diffusion model uses p_θ(x_{t-1} | x_t)
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0, then refine with q(x_{t-1} | x_t, x_0)
• Formulation: Define the posterior distribution q(x_{t-1} | x_t, x_0) as
then, the forward process q(x_t | x_{t-1}, x_0) is derived from Bayes’ rule
Denoising Diffusion Implicit Model (DDIM)
23
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0, then refine with q(x_{t-1} | x_t, x_0)
• Formulation: Forward process is
and reverse process is
Denoising Diffusion Implicit Model (DDIM)
24
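The forward/reverse equations on these slides appeared as images. In standard notation (the DDIM paper writes α_t for what DDPM calls ᾱ_t; ᾱ_t is used here for consistency with the earlier slides), a reconstruction:

```latex
% DDIM's non-Markovian posterior family (sigma_t controls stochasticity;
% sigma_t = 0 gives the deterministic sampler):
q_\sigma(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)
  = \mathcal{N}\!\left( \sqrt{\bar\alpha_{t-1}}\,\mathbf{x}_0
  + \sqrt{1-\bar\alpha_{t-1}-\sigma_t^2}\,
    \frac{\mathbf{x}_t - \sqrt{\bar\alpha_t}\,\mathbf{x}_0}{\sqrt{1-\bar\alpha_t}},\;
    \sigma_t^2 \mathbf{I} \right)

% Reverse step: first predict the rough sketch x_0 from x_t, then plug it in:
\hat{\mathbf{x}}_0
  = \frac{\mathbf{x}_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(\mathbf{x}_t, t)}
         {\sqrt{\bar\alpha_t}}
```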
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0, then refine with q(x_{t-1} | x_t, x_0)
• Formulation: Forward process is
and reverse process is
• Training: The variational lower bound of DDIM is identical to that of DDPM1
• This is surprising since the forward/reverse formulations are totally different
Denoising Diffusion Implicit Model (DDIM)
25
1. Precisely, the bounds differ, but the optimal solution is identical under some assumptions (which are violated in practice)
• DDIM significantly reduces the sampling steps of diffusion model
• Creates the outline of the sample after only 10 steps (DDPM needs hundreds)
Denoising Diffusion Implicit Model (DDIM)
26
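A sketch of the deterministic (σ_t = 0) DDIM sampler on a subsampled set of steps, with a toy stand-in for the trained ε_θ network:

```python
import numpy as np

T = 1000
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))

def ddim_sample(eps_theta, num_steps, shape, rng):
    # Deterministic DDIM: visit only num_steps of the T noise levels. Each
    # update predicts the rough sketch x0_hat, then "re-noises" it to the
    # next (smaller) noise level, with no added randomness (sigma = 0).
    x = rng.standard_normal(shape)
    taus = np.linspace(T - 1, 0, num_steps).astype(int)
    for t, t_prev in zip(taus[:-1], taus[1:]):
        eps = eps_theta(x, t)
        x0_hat = (x - np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])
        x = np.sqrt(alpha_bars[t_prev]) * x0_hat + np.sqrt(1.0 - alpha_bars[t_prev]) * eps
    return x

rng = np.random.default_rng(0)
out = ddim_sample(lambda x, t: np.tanh(x), num_steps=10, shape=(16,), rng=rng)
```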
• New golden era of generative models
• Competition among various approaches: GANs, VAEs, flows, and diffusion models1
• Also, lots of hybrid approaches (e.g., score SDE = diffusion + continuous flow)
• Which model to use?
• Diffusion models seem to be a nice option for high-quality generation
• However, GANs are (currently) still the more practical solution when fast sampling is needed (e.g., real-time applications)
Take-home Message
27
1. VAE also shows promising generation performance (see NVAE, very deep VAE)
28
Thank you for listening! 😀

 
Emergence of Invariance and Disentangling in Deep Representations
Emergence of Invariance and Disentangling in Deep RepresentationsEmergence of Invariance and Disentangling in Deep Representations
Emergence of Invariance and Disentangling in Deep Representations
 
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable...
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable...REBAR: Low-variance, unbiased gradient estimates for discrete latent variable...
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable...
 
Topology for Computing: Homology
Topology for Computing: HomologyTopology for Computing: Homology
Topology for Computing: Homology
 
Reinforcement Learning with Deep Energy-Based Policies
Reinforcement Learning with Deep Energy-Based PoliciesReinforcement Learning with Deep Energy-Based Policies
Reinforcement Learning with Deep Energy-Based Policies
 

Recently uploaded

Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
BookNet Canada
 
Cookies program to display the information though cookie creation
Cookies program to display the information though cookie creationCookies program to display the information though cookie creation
Cookies program to display the information though cookie creation
shanthidl1
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
SeasiaInfotech2
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1
FellyciaHikmahwarani
 
Data Protection in a Connected World: Sovereignty and Cyber Security
Data Protection in a Connected World: Sovereignty and Cyber SecurityData Protection in a Connected World: Sovereignty and Cyber Security
Data Protection in a Connected World: Sovereignty and Cyber Security
anupriti
 
Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
anupriti
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
Dr. Jimmy Schwarzkopf
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
The Digital Insurer
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
Yevgen Sysoyev
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
Emerging Tech
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Erasmo Purificato
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Mark Billinghurst
 
Verti - EMEA Insurer Innovation Award 2024
Verti - EMEA Insurer Innovation Award 2024Verti - EMEA Insurer Innovation Award 2024
Verti - EMEA Insurer Innovation Award 2024
The Digital Insurer
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
ScyllaDB
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
SynapseIndia
 
What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)
Margaret Fero
 

Recently uploaded (20)

Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
Cookies program to display the information though cookie creation
Cookies program to display the information though cookie creationCookies program to display the information though cookie creation
Cookies program to display the information though cookie creation
 
What's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdfWhat's Next Web Development Trends to Watch.pdf
What's Next Web Development Trends to Watch.pdf
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1Why do You Have to Redesign?_Redesign Challenge Day 1
Why do You Have to Redesign?_Redesign Challenge Day 1
 
Data Protection in a Connected World: Sovereignty and Cyber Security
Data Protection in a Connected World: Sovereignty and Cyber SecurityData Protection in a Connected World: Sovereignty and Cyber Security
Data Protection in a Connected World: Sovereignty and Cyber Security
 
Blockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre timesBlockchain and Cyber Defense Strategies in new genre times
Blockchain and Cyber Defense Strategies in new genre times
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
 
K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024K2G - Insurtech Innovation EMEA Award 2024
K2G - Insurtech Innovation EMEA Award 2024
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
 
Verti - EMEA Insurer Innovation Award 2024
Verti - EMEA Insurer Innovation Award 2024Verti - EMEA Insurer Innovation Award 2024
Verti - EMEA Insurer Innovation Award 2024
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)What Not to Document and Why_ (North Bay Python 2024)
What Not to Document and Why_ (North Bay Python 2024)
 

Introduction to Diffusion Models

  • 8. Outline
    • Today's content
      • Diffusion Probabilistic Model – ICML'15
      • Denoising Diffusion Probabilistic Model (DDPM) – NeurIPS'20: improve quality & diversity of diffusion models
      • Denoising Diffusion Implicit Model (DDIM) – ICLR'21: improve generation speed of diffusion models
    • Not covering
      • Relation of diffusion models to score matching → see Score SDE (ICLR'21)
      • Extension to stochastic differential equations → see Score SDE (ICLR'21)
      • There are lots of new interesting works (see NeurIPS'21, ICLR'22)
    Score SDE: Song et al. Score-Based Generative Modeling through Stochastic Differential Equations. ICLR'21
  • 9. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
      → The sample x_0 converges to pure noise x_T (e.g., x_T ~ N(0, I))
    (Figure: the forward (diffusion) process)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
  • 10. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
      → The sample x_0 converges to pure noise x_T (e.g., x_T ~ N(0, I))
    • Reverse step: recover the original sample from the noise
      → Note that this is the "generation" procedure
    (Figure: the forward (diffusion) process and the reverse process)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
  • 11. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
      → Technically, it is a product of conditional noise distributions q(x_t | x_{t-1})
      • Usually, the parameters β_t are fixed (one can learn them jointly, but it is not beneficial)
      • Noise annealing (i.e., reducing the noise scale, β_t < β_{t-1}) is crucial to the performance
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
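The forward process admits a closed-form marginal, which makes it cheap to jump directly from x_0 to any x_t. A minimal NumPy sketch of this (names are illustrative; the linear schedule endpoints follow the common DDPM choice, not anything stated on the slide):

```python
import numpy as np

def linear_beta_schedule(T, beta_1=1e-4, beta_T=0.02):
    """Fixed noise scales beta_1, ..., beta_T (one per forward step)."""
    return np.linspace(beta_1, beta_T, T)

def forward_sample(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form.

    Iterating q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) x_{t-1}, beta_t I)
    collapses to q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I),
    where abar_t is the cumulative product of alpha_s = 1 - beta_s.
    """
    abar = np.cumprod(1.0 - betas)  # abar[t-1] = prod_{s<=t} alpha_s
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar[t - 1]) * x0 + np.sqrt(1.0 - abar[t - 1]) * eps

rng = np.random.default_rng(0)
betas = linear_beta_schedule(T=1000)
x0 = rng.standard_normal((4, 4))
xT = forward_sample(x0, 1000, betas, rng)  # at t = T: nearly pure N(0, I) noise
```

With this schedule, abar_T is tiny, so x_T retains almost no information about x_0, matching the slide's claim that the sample converges to complete noise.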
  • 12. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
      → Technically, it is a product of conditional noise distributions q(x_t | x_{t-1})
    • Reverse step: recover the original sample from the noise
      → It is also a product of conditional (de)noising distributions p_θ(x_{t-1} | x_t)
      • Use the learned parameters: the denoiser μ_θ (main part) and the randomness Σ_θ
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
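In the notation above, the two processes can be written out explicitly (the slide equations did not survive extraction; this is the standard DDPM-style formulation):

```latex
% Forward (diffusion) process: fixed Gaussian noising with schedule beta_t
q(\mathbf{x}_{1:T} \mid \mathbf{x}_0) = \prod_{t=1}^{T} q(\mathbf{x}_t \mid \mathbf{x}_{t-1}),
\qquad
q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\!\big(\mathbf{x}_t;\, \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\, \beta_t \mathbf{I}\big)

% Reverse (generative) process: learned mean mu_theta and covariance Sigma_theta
p_\theta(\mathbf{x}_{0:T}) = p(\mathbf{x}_T) \prod_{t=1}^{T} p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t),
\qquad
p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}\!\big(\mathbf{x}_{t-1};\, \boldsymbol{\mu}_\theta(\mathbf{x}_t, t),\, \boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t)\big)
```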
  • 13. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
    • Reverse step: recover the original sample from the noise
    • Training: minimize the variational lower bound of the model p_θ(x_0)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
  • 14. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Forward step: (iteratively) add noise to the original sample
    • Reverse step: recover the original sample from the noise
    • Training: minimize the variational lower bound of the model p_θ(x_0)
      → It can be decomposed into step-wise losses (one for each step t)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
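The step-wise decomposition of the variational bound has the standard form (reconstructed here since the slide equation did not survive extraction):

```latex
\mathcal{L} = \mathbb{E}_q\!\Big[
  \underbrace{D_{\mathrm{KL}}\big(q(\mathbf{x}_T \mid \mathbf{x}_0)\,\|\,p(\mathbf{x}_T)\big)}_{L_T}
  + \sum_{t=2}^{T} \underbrace{D_{\mathrm{KL}}\big(q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)\,\|\,p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)\big)}_{L_{t-1}}
  - \underbrace{\log p_\theta(\mathbf{x}_0 \mid \mathbf{x}_1)}_{L_0}
\Big]
```

Each L_{t-1} term compares the learned reverse step against the true reverse posterior at step t, which is why the bound splits cleanly per step.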
  • 15. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Training: minimize the variational lower bound of the model p_θ(x_0)
      → It can be decomposed into step-wise losses (one for each step t)
    • Here, the true reverse step q(x_{t-1} | x_t, x_0) can be computed in closed form from β_t
      • Note that we only define the true forward step q(x_t | x_{t-1})
      • Since all distributions above are Gaussian, the KL divergences are tractable
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
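The closed form of the true reverse posterior mentioned above is (standard result, with α_t = 1 − β_t and ᾱ_t = ∏_{s≤t} α_s; the slide's own equation did not survive extraction):

```latex
q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)
  = \mathcal{N}\big(\mathbf{x}_{t-1};\, \tilde{\boldsymbol{\mu}}_t(\mathbf{x}_t, \mathbf{x}_0),\, \tilde{\beta}_t \mathbf{I}\big),
\qquad
\tilde{\boldsymbol{\mu}}_t
  = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1 - \bar{\alpha}_t}\,\mathbf{x}_0
  + \frac{\sqrt{\alpha_t}\,(1 - \bar{\alpha}_{t-1})}{1 - \bar{\alpha}_t}\,\mathbf{x}_t,
\qquad
\tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,\beta_t
```

Since both this posterior and the learned reverse step are Gaussian, each KL term in the bound reduces to a squared distance between means (plus variance terms).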
  • 16. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Network: use an image-to-image translation architecture (e.g., U-Net)
      • Recall that the input is x_t and the output is x_{t-1}; both are images
      • It is expensive, since both input and output are high-dimensional
      • Note that the denoiser μ_θ(x_t, t) shares weights across steps, but is conditioned on the step t
    (Image from the pix2pix-HD paper)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
  • 17. Diffusion Probabilistic Model
    • Diffusion model aims to learn the reverse of the noise generation procedure
    • Sampling: draw a random noise x_T, then iteratively apply the reverse step p_θ(x_{t-1} | x_t)
      • It often requires hundreds of reverse steps (very slow)
      • Early and late steps change the high- and low-level attributes, respectively
    (Image from the DDPM paper)
    Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML'15
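The sampling loop above can be sketched in NumPy, written in the ε-parameterization that the DDPM slides introduce next. `eps_model` is a hypothetical placeholder for the trained denoiser (in practice a U-Net conditioned on t):

```python
import numpy as np

def ddpm_sample(eps_model, shape, betas, rng):
    """Ancestral sampling: start from x_T ~ N(0, I), apply T reverse steps.

    Each step uses the DDPM mean
      mu(x_t, t) = (x_t - beta_t / sqrt(1 - abar_t) * eps_theta) / sqrt(alpha_t)
    and adds fresh Gaussian noise with variance beta_t (except at t = 1).
    """
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # x_T: pure noise
    for t in range(len(betas), 0, -1):
        eps = eps_model(x, t)  # predicted noise (the learned part)
        mean = (x - betas[t - 1] / np.sqrt(1.0 - abar[t - 1]) * eps) / np.sqrt(alphas[t - 1])
        z = rng.standard_normal(shape) if t > 1 else 0.0
        x = mean + np.sqrt(betas[t - 1]) * z
    return x  # x_0: the generated sample

# Toy usage with a dummy "network" that predicts zero noise.
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)
sample = ddpm_sample(lambda x, t: np.zeros_like(x), (2, 2), betas, rng)
```

The loop makes the slide's cost complaint concrete: one full network evaluation per step, hundreds of steps per sample.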
  • 18. Denoising Diffusion Probabilistic Model (DDPM)
    • DDPM reparametrizes the reverse distributions of diffusion models
    • Key idea: the original reverse step creates the denoiser output μ_θ(x_t, t) from scratch given x_t
      • However, x_{t-1} and x_t share most information, and thus this is redundant
      → Instead, predict the residual ε_θ(x_t, t) and add it to the original x_t
    Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
  • 19. Denoising Diffusion Probabilistic Model (DDPM)
    • DDPM reparametrizes the reverse distributions of diffusion models
    • Key idea: the original reverse step creates the denoiser output μ_θ(x_t, t) from scratch given x_t
      • However, x_{t-1} and x_t share most information, and thus this is redundant
      → Instead, predict the residual ε_θ(x_t, t) and add it to the original x_t
    • Formally, DDPM reparametrizes the learned reverse distribution in terms of ε_θ(x_t, t),¹ and the step-wise objective L_{t-1} can be reformulated as a noise-matching loss²
    1. α_t are constants determined by β_t
    2. Note that we need no "intermediate" samples; we only compare the forward noise ε and the reverse noise ε_θ conditioned on x_0
    Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
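The reformulated objective reduces (up to a per-step weighting) to a simple noise-regression loss. A minimal NumPy sketch, where `eps_model` is again a hypothetical stand-in for the learned denoiser:

```python
import numpy as np

def ddpm_loss(eps_model, x0, betas, rng):
    """Simplified DDPM objective: regress predicted noise onto forward noise.

    Draw a random step t, noise x_0 to x_t in closed form, and minimize
    || eps - eps_theta(x_t, t) ||^2. No intermediate samples x_1..x_{t-1}
    are needed, only x_0 and one Gaussian draw.
    """
    T = len(betas)
    abar = np.cumprod(1.0 - betas)
    t = rng.integers(1, T + 1)               # uniform step t in {1, ..., T}
    eps = rng.standard_normal(x0.shape)      # forward noise
    xt = np.sqrt(abar[t - 1]) * x0 + np.sqrt(1.0 - abar[t - 1]) * eps
    return np.mean((eps - eps_model(xt, t)) ** 2)

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 100)
x0 = rng.standard_normal((4, 4))
loss = ddpm_loss(lambda x, t: np.zeros_like(x), x0, betas, rng)
```

A zero predictor gives a loss near 1 (the variance of ε), which is the baseline a trained denoiser must beat.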
  • 20. Denoising Diffusion Probabilistic Model (DDPM)
    • DDPM initiated the diffusion model boom
      • Achieved SOTA on CIFAR-10, with high-resolution scalability
      • It produces more diverse samples than GANs (no mode collapse)
    Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
  • 21. Denoising Diffusion Implicit Model (DDIM)
    • DDIM roughly sketches the final sample, then refines it with the reverse process
    • Motivation:
      • Diffusion models are slow due to the iterative procedure
      • GANs/VAEs create a sample with a single one-shot forward operation
      ⇒ Can we combine the advantages for fast sampling of diffusion models?
    • Technical spoiler:
      • Instead of naïvely stacking a diffusion model on top of a GAN/VAE, DDIM proposes a principled approach of rough sketch + refinement
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
  • 22. Denoising Diffusion Implicit Model (DDIM)
    • DDIM roughly sketches the final sample, then refines it with the reverse process
    • Key idea:
      • Given x_t, generate the rough sketch x_0 and refine with p_θ(x_{t-1} | x_t, x_0)¹
      • Unlike the original diffusion model, it does not have a Markovian structure
    1. Recall that the original diffusion model uses p_θ(x_{t-1} | x_t)
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
  • 23. Denoising Diffusion Implicit Model (DDIM)
    • DDIM roughly sketches the final sample, then refines it with the reverse process
    • Key idea: given x_t, generate the rough sketch x_0 and refine with q(x_{t-1} | x_t, x_0)
    • Formulation: define the distribution q(x_{t-1} | x_t, x_0) directly; then the forward process q(x_t | x_{t-1}, x_0) is derived from Bayes' rule
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
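The family of distributions defined on this slide has the following standard form (reconstructed, since the slide equation did not survive extraction); σ_t controls the stochasticity, and the marginals q(x_t | x_0) match DDPM's for any σ_t:

```latex
q_\sigma(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)
  = \mathcal{N}\!\left(
      \sqrt{\bar{\alpha}_{t-1}}\,\mathbf{x}_0
      + \sqrt{1 - \bar{\alpha}_{t-1} - \sigma_t^2}\cdot
        \frac{\mathbf{x}_t - \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0}{\sqrt{1 - \bar{\alpha}_t}},\;
      \sigma_t^2 \mathbf{I}
    \right)
```

Setting σ_t = 0 makes the reverse process deterministic given the sketch x_0, which is the "implicit" case that gives DDIM its name.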
  • 24. Denoising Diffusion Implicit Model (DDIM)
    • DDIM roughly sketches the final sample, then refines it with the reverse process
    • Key idea: given x_t, generate the rough sketch x_0 and refine with q(x_{t-1} | x_t, x_0)
    • Formulation: the forward process q(x_t | x_{t-1}, x_0) and the reverse process p_θ(x_{t-1} | x_t) (equations shown on the slide)
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
  • 25. Denoising Diffusion Implicit Model (DDIM)
    • DDIM roughly sketches the final sample, then refines it with the reverse process
    • Key idea: given x_t, generate the rough sketch x_0 and refine with q(x_{t-1} | x_t, x_0)
    • Training: the variational lower bound of DDIM is identical to that of DDPM¹
      • This is surprising, since the forward/reverse formulation is totally different
    1. Precisely, the bound is different, but the solution is identical under some assumption (though violated in practice)
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
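Because DDIM trains with the same objective as DDPM, a trained denoiser can be reused for few-step sampling. A NumPy sketch of the deterministic (σ = 0) DDIM update on a short strided subsequence of steps; `eps_model` is a hypothetical placeholder for the trained denoiser:

```python
import numpy as np

def ddim_sample(eps_model, shape, betas, rng, n_steps=10):
    """Deterministic DDIM sampling over a strided subsequence of steps.

    At each visited step t: predict the "rough sketch" x0_hat from x_t,
    then jump to the previous step along the predicted noise direction:
      x_{t'} = sqrt(abar_{t'}) * x0_hat + sqrt(1 - abar_{t'}) * eps_theta.
    """
    abar = np.cumprod(1.0 - betas)
    # Visit only n_steps of the T training steps, e.g. 10 instead of 1000.
    steps = np.linspace(len(betas), 1, n_steps).astype(int)
    x = rng.standard_normal(shape)  # start from pure noise x_T
    for i, t in enumerate(steps):
        eps = eps_model(x, t)
        x0_hat = (x - np.sqrt(1.0 - abar[t - 1]) * eps) / np.sqrt(abar[t - 1])
        abar_prev = abar[steps[i + 1] - 1] if i + 1 < len(steps) else 1.0
        x = np.sqrt(abar_prev) * x0_hat + np.sqrt(1.0 - abar_prev) * eps
    return x

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)
sketch = ddim_sample(lambda x, t: np.zeros_like(x), (2, 2), betas, rng, n_steps=10)
```

The stride is what delivers the speedup on the next slide: 10 network evaluations instead of 1000, at the cost of a rougher sample.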
  • 26. Denoising Diffusion Implicit Model (DDIM)
    • DDIM significantly reduces the number of sampling steps of diffusion models
      • It creates the outline of the sample after only 10 steps (DDPM needs hundreds)
    Song et al. Denoising Diffusion Implicit Models. ICLR'21
  • 27. Take-home Message
    • New golden era of generative models
      • Competition among various approaches: GANs, VAEs, flows, and diffusion models¹
      • Also, lots of hybrid approaches (e.g., score SDE = diffusion + continuous flow)
    • Which model to use?
      • Diffusion models seem to be a nice option for high-quality generation
      • However, GANs are (currently) still the more practical solution when fast sampling is needed (e.g., real-time applications)
    1. VAEs also show promising generation performance (see NVAE, very deep VAE)
  • 28. Thank you for listening! 😀