DELE CA2 PART A: GAN. Done by: Oh Li Ting (2019334) Class: DAAA/2B/05.
CONTENT. Background research Data exploration & preprocessing Application of GAN Evaluation of GAN Conclusion.
BACKGROUND RESEARCH. Aim: generate 1000 small color images. CIFAR-10 contains 10 classes. Class labels are additional information correlated with the input images that can be used to improve the GAN; the improvement may come in the form of more stable training, faster training, and/or generated images of better quality. Instead of generating random images, we want to generate images that belong to a specific class. Hence, a CGAN is applied.
BACKGROUND RESEARCH. CGAN: the same GAN, but provided with additional information. The conditioning is performed by feeding the class label into the input layers of both the generator and the discriminator. This involves an embedding layer followed by a fully connected layer with a linear activation that scales the embedding to the size of the image, before concatenating it in the model as an additional channel or feature map. The conditional data (class labels) must first be converted into a trainable representation via the embedding layer.
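The embedding-and-concatenate conditioning above can be sketched in plain NumPy. This is a minimal illustration, not the deck's Keras model: the embedding table, weight matrix, and sizes (embedding dimension 50) are assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_classes, embed_dim = 10, 50   # assumed sizes for illustration
img_h, img_w = 32, 32           # CIFAR-10 image size

# Embedding layer: a lookup table mapping each class label to a dense vector.
embedding = rng.normal(size=(n_classes, embed_dim))
# Fully connected layer with linear activation that scales the embedding
# to the size of the image (32*32 values).
W = rng.normal(size=(embed_dim, img_h * img_w))

label = 3  # e.g. class "cat"
label_map = (embedding[label] @ W).reshape(img_h, img_w, 1)

image = rng.uniform(-1, 1, size=(img_h, img_w, 3))
# Concatenate the label map as an extra channel: (32, 32, 3) -> (32, 32, 4)
conditioned = np.concatenate([image, label_map], axis=-1)
print(conditioned.shape)  # (32, 32, 4)
```

In the actual CGAN, both the embedding and the dense layer are learned jointly with the rest of the network; here they are random weights purely to show the shapes.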
Data Exploration & Preprocessing. In our case, preprocessing is simple: convert pixel values to floats, scale from [0, 255] to [-1, 1], and perform data augmentation to help the model generate samples from a more robust distribution.
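The preprocessing steps above can be sketched as follows. The random array is a hypothetical stand-in for the CIFAR-10 training batch, and random horizontal flips are one assumed example of the augmentation mentioned:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical batch standing in for CIFAR-10 images: uint8 pixels in [0, 255].
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Convert to floats and scale from [0, 255] to [-1, 1]
# (matches a generator with a tanh output activation).
X = images.astype(np.float32)
X = (X - 127.5) / 127.5
assert X.min() >= -1.0 and X.max() <= 1.0

# Simple augmentation: random horizontal flips (reverse the width axis).
flip = rng.random(len(X)) < 0.5
X[flip] = X[flip, :, ::-1]
print(X.shape, X.dtype)
```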
Discriminator & Generator. Given an input image, the discriminator outputs the likelihood that the image is real. A second input is defined that takes an integer class label for the image. This has the effect of making the input image conditional on the provided class label.
GAN Model. The generator is trained via the combined GAN model. The discriminator is trained separately, so here only the generator is trained while the discriminator's weights are kept constant. The generated image output from the generator, together with the class label input, are both connected as inputs to the discriminator model.
Generate real & fake samples & latent points. The functions that load the dataset and select a batch of samples must be updated to make use of the real class labels from the training dataset.
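A minimal sketch of those two helpers, now carrying class labels alongside images and latent points. The in-memory `dataset` tuple is a hypothetical stand-in for the loaded CIFAR-10 training set, and the function names are assumed (they mirror common CGAN tutorials, not necessarily this project's code):

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim, n_classes = 100, 10

# Hypothetical stand-in for the CIFAR-10 training set: (images, labels).
dataset = (rng.uniform(-1, 1, size=(50, 32, 32, 3)),
           rng.integers(0, n_classes, size=(50, 1)))

def generate_real_samples(dataset, n_samples):
    """Select a random batch of real images together with their class labels."""
    images, labels = dataset
    ix = rng.integers(0, images.shape[0], n_samples)
    X, lab = images[ix], labels[ix]
    y = np.ones((n_samples, 1))  # target 1 = "real"
    return [X, lab], y

def generate_latent_points(latent_dim, n_samples, n_classes=10):
    """Sample latent vectors plus random class labels to condition the generator."""
    z = rng.normal(size=(n_samples, latent_dim))
    labels = rng.integers(0, n_classes, size=(n_samples, 1))
    return [z, labels]

[X, lab], y = generate_real_samples(dataset, 8)
z, labels = generate_latent_points(latent_dim, 8)
print(X.shape, lab.shape, y.shape, z.shape, labels.shape)
```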
Train GAN. Retrieve and use the class labels in the calls that update the discriminator and generator models. Separate training is more effective: train the discriminator on real and fake images separately (half a batch each) and update the discriminator's weights. Generate 'fake' examples and create inverted labels for the fake samples. Train the generator with latent points as x and 1s as y, updating the generator via the discriminator's error. Print the losses, then start the next batch.
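The per-batch schedule above can be sketched as a skeleton. The `d_update`, `g_update`, and `fake_images` functions are hypothetical stubs standing in for the actual Keras `train_on_batch` calls and generator forward pass; only the ordering of updates and the label targets are the point here:

```python
import numpy as np

rng = np.random.default_rng(2)
latent_dim, half_batch = 100, 32

# Hypothetical stubs standing in for model.train_on_batch / generator.predict.
def d_update(images, labels, y):
    return float(rng.random())  # stand-in for the discriminator loss

def g_update(z, labels, y):
    return float(rng.random())  # stand-in for the combined-model (generator) loss

def fake_images(z, labels):
    return rng.uniform(-1, 1, size=(len(z), 32, 32, 3))

for batch in range(2):  # two batches, for illustration
    # 1) Train the discriminator on real images (half batch, target y=1).
    X_real = rng.uniform(-1, 1, size=(half_batch, 32, 32, 3))
    lab_real = rng.integers(0, 10, size=(half_batch, 1))
    d_loss1 = d_update(X_real, lab_real, np.ones((half_batch, 1)))

    # 2) Train the discriminator on generated images (half batch, target y=0).
    z = rng.normal(size=(half_batch, latent_dim))
    lab_fake = rng.integers(0, 10, size=(half_batch, 1))
    X_fake = fake_images(z, lab_fake)
    d_loss2 = d_update(X_fake, lab_fake, np.zeros((half_batch, 1)))

    # 3) Train the generator via the combined model with inverted labels (y=1),
    #    so the generator is updated through the discriminator's error.
    z = rng.normal(size=(2 * half_batch, latent_dim))
    lab = rng.integers(0, 10, size=(2 * half_batch, 1))
    g_loss = g_update(z, lab, np.ones((2 * half_batch, 1)))

    print(f"batch {batch}: d1={d_loss1:.3f} d2={d_loss2:.3f} g={g_loss:.3f}")
```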
OUTPUT. 1st column: airplane; 2nd: automobile; 3rd: bird; 4th: cat; 5th: deer; 6th: dog; 7th: frog; 8th: horse; 9th: ship; 10th: truck.
MODEL EVALUATION. The scale of the gradients at the lower layers of the generator was too small, while the discriminator's gradients stayed consistent throughout training.
MODEL IMPROVEMENT. Through multiple rounds of trial and error, I made several changes to improve model performance and the quality of the generated images: 1. Downsample using strided convolutions - instead of the default two-dimensional stride of (1,1), use (2,2). 2. Use larger kernels and more filters - larger kernels in the top convolutional layers cover more pixels of the previous layer's image and hence take in more information. 3. Add batch normalization - this resulted in distinctly sharper generated images. 4. Use label smoothing - soft labels (positive labels: 1 becomes a value in [0.7, 1.2]; negative labels: 0 becomes a value in [0.0, 0.3]). 5. Tune hyperparameters: learning rate (different rates for the discriminator and generator), LeakyReLU alpha value (best value was 0.02), batch size, optimizer for the discriminator and generator (Adam), and the number of units in the dense layer.
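The label-smoothing change (item 4) can be sketched directly: instead of hard 0/1 targets for the discriminator, draw soft targets from the ranges stated above. The batch size here is an arbitrary example value:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 64  # example half-batch size

# Soft labels as described above:
# positive labels: 1 -> uniform in [0.7, 1.2]
# negative labels: 0 -> uniform in [0.0, 0.3]
y_real = 0.7 + 0.5 * rng.random((n, 1))
y_fake = 0.3 * rng.random((n, 1))

print(y_real.min(), y_real.max(), y_fake.min(), y_fake.max())
```

These soft targets replace `np.ones(...)` / `np.zeros(...)` in the discriminator's training calls; the slight noise discourages the discriminator from becoming overconfident.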
OUTPUT. Airplane. Automobile. Bird. Cat. Deer. Dog.
MODEL EVALUATION. The scale of the gradients reaching the bottom layer of the generator is clearly higher than in the previous version. Gradients flow as expected as training progresses: the generator receives large gradients early on, and the discriminator receives consistently high gradients at its top layer once the generator has been trained sufficiently.
CONCLUSION. CGANs make it possible to generate pictures of different classes. Prior knowledge and research are important, as implementing good training practices and suitable hyperparameter values has a huge impact on the results. Results could improve further if more epochs were run. Future work may experiment with mitigating mode collapse.
Thank You.