CS180 Proj5

Part 5A The Power of Diffusion Models

In this part, I experimented with DeepFloyd IF, a pretrained diffusion model. I first used the model to denoise images, then used that denoising procedure to implement diffusion sampling, and finally implemented classifier-free guidance.

Part 0. Sampling using a diffusion model

Approach

For this part, I used PyTorch with the DeepFloyd IF diffusion model to generate images from text prompts. This was done using 20 and 100 steps, with seed 1234. The quality improves with more steps, and the outputs do reflect the text prompts.

Results steps = 20

An oil painting of a snowy mountain

man wearing a hat

rocket ship

Results steps = 30

An oil painting of a snowy mountain

man wearing a hat

rocket ship

1.1 Forward Process

Approach

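Here I implemented the forward process, the equation that adds noise to a clean image x_0. The noisy image x_t is Gaussian with mean sqrt(alpha_bar_t) * x_0 and variance (1 - alpha_bar_t), i.e. standard deviation sqrt(1 - alpha_bar_t).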
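The forward process can be sketched in a few lines. This is a minimal NumPy illustration rather than the project code (which used PyTorch and the model's own noise schedule); the linear `betas` schedule and the all-zero test image are assumptions made only so the example runs on its own.

```python
import numpy as np

def forward(x0, t, alphas_cumprod, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, I)."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps

# Hypothetical linear beta schedule, for illustration only.
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

rng = np.random.default_rng(1234)
x0 = np.zeros((3, 64, 64))  # stand-in for a clean image
x_250, _ = forward(x0, 250, alphas_cumprod, rng)
x_750, _ = forward(x0, 750, alphas_cumprod, rng)
```

Larger t means a smaller alpha_bar_t, so less of the original image survives and more noise is mixed in.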

Results

Original Image

Image with noise added t=250

Image with noise added t=500

Image with noise added t=750

1.2 Classical Denoising

Approach

Here I denoised the noisy images by applying a Gaussian blur with kernel size 13 and sigma 2.
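The report does not say which blur routine was used, so the following is only a sketch of a separable Gaussian blur in NumPy with the same kernel size and sigma. The reflect padding at the borders is an assumption.

```python
import numpy as np

def gaussian_kernel_1d(kernel_size=13, sigma=2.0):
    """Normalized 1D Gaussian kernel centered on the middle tap."""
    ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    return k / k.sum()

def gaussian_blur(img, kernel_size=13, sigma=2.0):
    """Blur a 2D array by convolving rows, then columns (reflect padding)."""
    k = gaussian_kernel_1d(kernel_size, sigma)
    pad = kernel_size // 2
    padded = np.pad(img, pad, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
noisy = rng.standard_normal((32, 32))
blurred = gaussian_blur(noisy)
```

Blurring averages neighboring pixels, which suppresses noise but also removes detail, which is why it struggles at higher noise levels.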

Results

Gaussian Blur Denoised t=250

Gaussian Blur Denoised t=500

Gaussian Blur Denoised t=750

1.3 One Step Denoising

Approach

Here I use the pretrained diffusion model and the given equations to recover the image from its noisy version. The model predicts the noise, and the equation is then used to recover an estimate of the clean image.
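The recovery step is the forward equation solved for the clean image. A minimal NumPy sketch follows; `eps_hat` stands for the model's noise prediction, and the names are illustrative rather than taken from the project code.

```python
import numpy as np

def estimate_x0(x_t, eps_hat, abar_t):
    """Invert x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for x0."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

# Sanity check with a perfect noise prediction (the real model is only approximate).
rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 8, 8))
eps = rng.standard_normal(x0.shape)
abar_t = 0.3
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
x0_hat = estimate_x0(x_t, eps, abar_t)
```

With an exact noise prediction the original is recovered exactly; in practice the estimate gets worse as t grows because the model has less signal to work with.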

Results

Original Image

Image with noise added t=250

One step denoised t=250

Image with noise added t=500

One step denoised t=500

Image with noise added t=750

One step denoised t=750

One Step Denoised Result

Gaussian Blur Denoised t=750

1.4 Iterative Denoising

Approach

We can get a much better result if we denoise iteratively over several steps. We first create a list of strided timesteps. For each timestep, we use the update equation to compute the next, less noisy image.
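The update equation itself did not survive in the text, so the sketch below uses the standard DDPM posterior-mean update between two strided timesteps, which is an assumption about what the project used. The stride of 30 starting at 990 is also assumed (it is consistent with the t values in the result captions), and the added-variance term is left as an optional argument.

```python
import numpy as np

# Assumed schedule: 990, 960, ..., 30, 0.
strided_timesteps = list(range(990, -1, -30))

def denoise_step(x_t, eps_hat, abar_t, abar_prev, noise=None):
    """Move from timestep t to the less noisy timestep t' (abar_prev > abar_t)."""
    alpha = abar_t / abar_prev          # effective alpha for this stride
    beta = 1.0 - alpha
    # Current estimate of the clean image from the predicted noise.
    x0_hat = (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)
    x_prev = (np.sqrt(abar_prev) * beta / (1.0 - abar_t)) * x0_hat \
           + (np.sqrt(alpha) * (1.0 - abar_prev) / (1.0 - abar_t)) * x_t
    if noise is not None:               # optional stochastic term
        x_prev = x_prev + noise
    return x_prev

# With a perfect noise prediction and abar_prev = 1 (fully clean), the step returns x0.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 8, 8))
eps = rng.standard_normal(x0.shape)
abar_t = 0.4
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
x_clean = denoise_step(x_t, eps, abar_t, abar_prev=1.0)
```

Each step blends the current clean-image estimate with the current noisy image, so errors in early predictions can be corrected by later steps.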

Results

Iterative Denoised t=90

Iterative Denoised t=240

Iterative Denoised t=390

Iterative Denoised t=540

Iterative Denoised t=690

Iterative Denoised Result

One Step Denoised Result

Gaussian Blur Denoised Result

1.5 Diffusion Model Sampling

Approach

Instead of starting from a noisy image, we can start from pure noise to sample from the diffusion model.

Results

Sample 1

Sample 2

Sample 3

Sample 4

Sample 5

1.6 Classifier Free Guidance

Approach

We can get much higher quality images if we apply classifier-free guidance (CFG). We combine the noise estimates from the conditional and the unconditional prompts.
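The usual way to combine the two estimates is to extrapolate from the unconditional estimate toward the conditional one. A minimal sketch, where the guidance scale of 7 is an assumed value and not taken from the report:

```python
import numpy as np

def cfg_noise(eps_cond, eps_uncond, scale=7.0):
    """Classifier-free guidance: eps = eps_u + scale * (eps_c - eps_u)."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
eps_c = rng.standard_normal((3, 8, 8))
eps_u = rng.standard_normal((3, 8, 8))
eps = cfg_noise(eps_c, eps_u)
```

A scale of 1 reduces to the conditional estimate and 0 to the unconditional one; scales above 1 push the sample harder toward the prompt.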

Results

Sample 1

Sample 2

Sample 3

Sample 4

Sample 5

1.7 Image to Image Translation

Approach

Here we take an image, add noise to it, and then denoise it. This takes the noised original image and forces it back onto the image manifold without conditioning.

Results

Original Image Example 1

Ex1 SDEdit i_start = 1

Ex1 SDEdit i_start = 3

Ex1 SDEdit i_start = 5

Ex1 SDEdit i_start = 7

Ex1 SDEdit i_start = 10

Ex1 SDEdit i_start = 20

Original Image Example 2

Ex2 SDEdit i_start = 1

Ex2 SDEdit i_start = 3

Ex2 SDEdit i_start = 5

Ex2 SDEdit i_start = 7

Ex2 SDEdit i_start = 10

Ex2 SDEdit i_start = 20

Original Image Example 3

Ex3 SDEdit i_start = 1

Ex3 SDEdit i_start = 3

Ex3 SDEdit i_start = 5

Ex3 SDEdit i_start = 7

Ex3 SDEdit i_start = 10

Ex3 SDEdit i_start = 20

Hand Drawn and Web Images

We apply the same process to hand-drawn and web images.

Results

Avocado Ex 1

Avocado SDEdit i_start = 1

Avocado SDEdit i_start = 3

Avocado SDEdit i_start = 5

Avocado SDEdit i_start = 7

Avocado SDEdit i_start = 10

Avocado SDEdit i_start = 20

Earth Ex 1

Earth SDEdit i_start = 1

Earth SDEdit i_start = 3

Earth SDEdit i_start = 5

Earth SDEdit i_start = 7

Earth SDEdit i_start = 10

Earth SDEdit i_start = 20

Tomato Ex 1

Tomato SDEdit i_start = 1

Tomato SDEdit i_start = 3

Tomato SDEdit i_start = 5

Tomato SDEdit i_start = 7

Tomato SDEdit i_start = 10

Tomato SDEdit i_start = 20

1.7.2 InPainting

Approach

We apply the process so that we keep the original content wherever the mask is 0 and generate diffused content wherever the mask is 1, using the masking equation.
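The masking equation is not reproduced in the text. A common form, assumed here, replaces the unmasked region of the current sample with the original image noised to the same timestep:

```python
import numpy as np

def inpaint_project(x_t, x_orig_noised, mask):
    """Keep generated content where mask == 1, original (noised to t) where mask == 0."""
    return mask * x_t + (1.0 - mask) * x_orig_noised

rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 8, 8))            # current diffusion sample
x_orig_noised = rng.standard_normal((3, 8, 8))  # stand-in for forward(x_orig, t)
mask = np.zeros((1, 8, 8))
mask[:, 2:6, 2:6] = 1.0                         # hole to fill
x_t_new = inpaint_project(x_t, x_orig_noised, mask)
```

Applied after each denoising step, this keeps the region outside the hole anchored to the original image while the hole is filled in by the model.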

Results

Original

mask

Hole to fill

Result

Original

mask

Hole to fill

Result

Original

mask

Hole to fill

Result

1.7.3 Text-Conditional Image-to-Image Translation

Approach

Instead of conditioning on the prompt "a high quality photo", we can use a different text prompt to guide the translation.

Results

Rocket Noise Level 1

Rocket Noise Level 3

Rocket Noise Level 5

Rocket Noise Level 7

Rocket Noise Level 10

Rocket Noise Level 20

Original

Oski Original

Skull Noise Level 1

Skull Noise Level 3

Skull Noise Level 5

Skull Noise Level 7

Skull Noise Level 10

Skull Noise Level 20

Puff Original

Skull Noise Level 1

Skull Noise Level 3

Skull Noise Level 5

Skull Noise Level 7

Skull Noise Level 10

Skull Noise Level 20

1.8 Visual Anagrams

Approach

We can make visual anagrams, where the image looks like one thing right side up and another thing when flipped. At step t we use the first prompt to get noise e1, then we flip the image, use the second prompt to get noise e2, and flip e2 back. We then average the two and use that as our noise estimate.
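The averaging step can be sketched as follows. `predict_noise` is a hypothetical stand-in for the diffusion model's noise prediction, and the vertical flip over the height axis of a (C, H, W) array is an assumption about the layout.

```python
import numpy as np

def anagram_noise(x_t, predict_noise, prompt_1, prompt_2):
    """Average the upright estimate with the flipped-back estimate of the flipped image."""
    eps_1 = predict_noise(x_t, prompt_1)
    flipped = np.flip(x_t, axis=-2)                           # turn the image upside down
    eps_2 = np.flip(predict_noise(flipped, prompt_2), axis=-2)  # flip the estimate back
    return (eps_1 + eps_2) / 2.0

# Toy stand-in model: scales its input differently per prompt.
def toy_model(x, prompt):
    return x * (1.0 if prompt == "a" else 3.0)

rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 8, 8))
eps = anagram_noise(x_t, toy_model, "a", "b")
```

Flipping the second estimate back is what keeps both estimates in the same orientation before they are averaged.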

Results

An Oil Painting of People Around a Fire

An Old Man

Snowy Village

Barista

Rocket Ship

Pencil

1.9 Hybrid Images

Approach

We can do something similar to Project 2, where we created hybrid images. We take the low-pass-filtered noise estimate of one prompt and the high-pass-filtered noise estimate of the other prompt, and combine them into the noise estimate used for diffusion.
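A minimal NumPy sketch of the combination, assuming a Gaussian low-pass filter and taking the high-pass as the residual after blurring. The kernel size and sigma are illustrative values, not the ones used for the results.

```python
import numpy as np

def lowpass(x, kernel_size=13, sigma=2.0):
    """Separable Gaussian blur over the last two axes (reflect padding)."""
    ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    k /= k.sum()
    pad = kernel_size // 2
    pad_width = [(0, 0)] * (x.ndim - 2) + [(pad, pad), (pad, pad)]
    out = np.pad(x, pad_width, mode="reflect")
    for axis in (-1, -2):
        out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), axis, out)
    return out

def hybrid_noise(eps_1, eps_2):
    """Low frequencies from eps_1 plus high frequencies from eps_2."""
    return lowpass(eps_1) + (eps_2 - lowpass(eps_2))

rng = np.random.default_rng(0)
eps_1 = rng.standard_normal((3, 32, 32))
eps_2 = rng.standard_normal((3, 32, 32))
eps = hybrid_noise(eps_1, eps_2)
```

The low-frequency prompt dominates when the image is viewed from far away, and the high-frequency prompt dominates up close.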

Results

Dog Skull

Rocket Waterfall

Waterfall Skull

Part 5B Diffusion Models From Scratch

1.2 Using the UNet to Train a Denoiser

Approach

We first implement a UNet model to train a denoiser. Let's first visualize the effect of adding noise to the image.

To implement the UNet, I followed this architecture.

I trained the denoiser by regressing to the original image after adding noise to it.
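The training objective can be sketched as an L2 regression from the noisy image back to the clean one. This NumPy version only illustrates the noising and the loss; the `denoiser` argument is a hypothetical callable standing in for the UNet, and the image shape is assumed.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """z = x + sigma * eps, with eps ~ N(0, I)."""
    return x + sigma * rng.standard_normal(x.shape)

def denoising_loss(denoiser, x, sigma, rng):
    """Mean squared error between the denoised noisy image and the clean image."""
    z = add_noise(x, sigma, rng)
    return np.mean((denoiser(z) - x) ** 2)

rng = np.random.default_rng(0)
x = np.zeros((64, 1, 28, 28))  # stand-in batch of clean images
# An identity "denoiser" leaves all the noise in, so its loss is about sigma^2.
baseline = denoising_loss(lambda z: z, x, sigma=0.5, rng=rng)
```

A trained denoiser should score well below this identity baseline at the sigma it was trained on.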

Results

We can also check the denoiser's effectiveness on other values of sigma. It was trained with sigma = 0.5.

Training a DDPM Denoising UNET

Approach

We define the FCBlock and add it to the UNet, allowing the network to accept a time parameter.

We can then sample with a process similar to the one in 5A to implement the DDPM.
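For context, the usual DDPM training objective for a time-conditioned network is noise prediction. The sketch below is an assumption about the setup rather than the project code: `model` is a hypothetical callable standing in for the time-conditioned UNet, the beta schedule is illustrative, and passing the normalized time t / T is assumed.

```python
import numpy as np

def ddpm_training_loss(model, x0, alphas_cumprod, rng):
    """Sample a random timestep per image, noise it, and score the noise prediction."""
    T = len(alphas_cumprod)
    batch = x0.shape[0]
    t = rng.integers(0, T, size=batch)
    abar = alphas_cumprod[t].reshape(batch, 1, 1, 1)  # broadcast over (C, H, W)
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps
    eps_hat = model(x_t, t / T)                        # normalized time (assumed)
    return np.mean((eps_hat - eps) ** 2)

betas = np.linspace(1e-4, 0.02, 300)                   # illustrative schedule
alphas_cumprod = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
x0 = np.zeros((64, 1, 28, 28))
# A model that always predicts zero noise gives a loss near 1 (the noise variance).
zero_loss = ddpm_training_loss(lambda x_t, t: np.zeros_like(x_t), x0, alphas_cumprod, rng)
```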

Results

5 Epochs Result

20 Epochs Result

Class Conditioned

Approach

Class conditioning was implemented so that specific digits can be generated. It was implemented by adding additional FCBlocks that inject class information into the UNet.
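One common way to feed class information to such blocks is a one-hot vector, optionally zeroed out at random so the network also learns an unconditional mode. The report does not state the encoding or any dropout rate, so everything in this sketch, including the function name and `drop_prob`, is an assumption.

```python
import numpy as np

def one_hot_condition(labels, num_classes=10, drop_prob=0.0, rng=None):
    """One-hot class vectors; each row is zeroed with probability drop_prob."""
    c = np.eye(num_classes)[labels]
    if drop_prob > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(len(labels)) >= drop_prob
        c = c * keep[:, None]
    return c

labels = np.array([3, 7])
c = one_hot_condition(labels)
c_dropped = one_hot_condition(labels, drop_prob=1.0, rng=np.random.default_rng(0))
```

The resulting vector would be passed through the class FCBlocks, and an all-zero vector acts as the unconditional input.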

Results

5 Epochs Result

20 Epochs Result

Stable Diffusion
