Charles Deledalle - Teaching - Summer 2019

UCSD KPU's Smart IoT Workshop

Lectures

Chapter 1 - Introduction (42 MB)
Overview of the course. Introduction to image sciences, image processing and computer vision. Basics of machine learning, terminology, paradigms. No-free-lunch theorem. Supervised versus unsupervised learning. Clustering and K-Means. Classification and regression. Linear least squares and polynomial curve fitting. Model complexity and overfitting. Curse of dimensionality. Dimensionality reduction and principal component analysis. Image representation, semantic gap, image features, and classical computer vision pipelines.
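
As an illustration of the clustering topic above, here is a minimal K-Means sketch (Lloyd's algorithm) on 2-D points. It is not taken from the lecture material: the function names `nearest` and `kmeans`, the random initialization from the data, and the fixed iteration cap are all assumptions made for this example, and it uses only the Python standard library.

```python
import random


def nearest(point, centers):
    """Return the index of the center closest to `point` (squared Euclidean distance)."""
    dists = [(point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2 for c in centers]
    return dists.index(min(dists))


def kmeans(points, k, n_iters=20, seed=0):
    """Hypothetical helper: cluster 2-D points (tuples) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initialize centers from the data
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for idx, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == idx]
            if members:
                mean_x = sum(p[0] for p in members) / len(members)
                mean_y = sum(p[1] for p in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

On this toy data the two well-separated groups should end up in different clusters; which group receives label 0 depends on the random initialization.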

Chapter 2 - Preliminaries to deep learning (13 MB)
Binary classification and linear separators. Perceptron, ADALINE, artificial neurons. Artificial neural networks (ANNs), activation functions, and the universal approximation theorem. Linear versus non-linear classification problems. Typical tasks, architectures and loss functions. Gradient descent and back-propagation. Support Vector Machines (SVMs), soft margins and the kernel trick. Connections between ANNs and SVMs.
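
To make the perceptron topic concrete, the following is a small sketch of the classical perceptron learning rule on a linearly separable toy problem (logical AND). The names `train_perceptron` and `predict`, the labels in {-1, +1}, and the learning rate of 1 are assumptions for this example and are not taken from the course material.

```python
def predict(w, b, x):
    """Linear separator: sign of the affine score w.x + b, as +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


def train_perceptron(samples, labels, n_epochs=50, lr=1.0):
    """Hypothetical helper: perceptron rule with labels in {-1, +1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(n_epochs):
        n_errors = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary)
                # Move the separator towards the misclassified sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                n_errors += 1
        if n_errors == 0:
            break  # every sample is on the correct side
    return w, b


# Logical AND is linearly separable, so the rule converges.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])
```

The same loop does not converge on a non-linearly separable problem such as XOR; it simply stops when the epoch cap is reached.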

Chapter 3 - Introduction to deep learning (16 MB)
Feature hierarchy. Feature engineering versus end-to-end learning. Deep architectures and the vanishing gradient problem. Greedy layer-wise unsupervised pretraining and stacked auto-encoders. Modern deep learning: rectifiers, gradient clipping, stochastic and mini-batch gradient descent, learning rate schedules, adaptive learning rates, momentum and Nesterov acceleration, weight decay, gradient noise, early stopping, normalized initializations, dropout, batch normalization, residual networks, curriculum learning, multi-task learning, best practices. Weaknesses and adversarial perturbations.
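
As a rough illustration of the momentum update listed above, here is a sketch of gradient descent with classical (heavy-ball) momentum on a small quadratic. It uses the exact gradient rather than a stochastic mini-batch estimate, and the function name `gd_momentum`, the step size, and the momentum coefficient are assumptions chosen for this example only.

```python
def gd_momentum(grad, w0, lr=0.1, momentum=0.9, n_steps=300):
    """Hypothetical helper: gradient descent with a velocity term."""
    w = list(w0)
    v = [0.0] * len(w)  # velocity accumulates past gradients
    for _ in range(n_steps):
        g = grad(w)
        # Velocity update: decay the old velocity, then step against the gradient.
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        # Parameter update: move along the velocity.
        w = [wi + vi for wi, vi in zip(w, v)]
    return w


def quad_grad(w):
    """Gradient of f(w) = 0.5 * (w[0]**2 + 10 * w[1]**2), minimized at the origin."""
    return [1.0 * w[0], 10.0 * w[1]]


w_final = gd_momentum(quad_grad, [5.0, 5.0])
print(w_final)
```

With these settings both coordinates should approach the minimizer at the origin; a step size that is too large for the steepest direction would make the iterates diverge instead.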

Chapter 4 - Image classification and CNNs (29 MB)
Linear and shift-invariant feature maps, cross-correlation, convolution, boundary conditions and padding, averaging and derivative filters, filter banks, Gabor filters, and wavelets. Convolutional Neural Networks (CNNs). Hubel and Wiesel brain activity model. Simple and complex cells. Convolutional layers, pooling layers, overcomplete representation, receptive fields. Flat tensors and fully connected classification heads. Backpropagation for CNNs. Typical architectures: Neocognitron, LeNet-5, AlexNet, ZFNet, VGG, GoogLeNet, ResNet, DenseNet, MobileNet. Evaluation metrics.
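
The difference between cross-correlation and convolution, and the effect of zero padding, can be sketched in a few lines. The code below is a dependency-free illustration on nested lists, not the implementation used in the course; the function names `cross_correlate2d` and `convolve2d` and the choice of zero padding as the boundary condition are assumptions for this example.

```python
def cross_correlate2d(image, kernel, pad=0):
    """Slide `kernel` over a zero-padded `image` without flipping it."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Zero padding: surround the image with `pad` rows and columns of zeros.
    padded = [[0.0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for i in range(h):
        for j in range(w):
            padded[i + pad][j + pad] = image[i][j]
    out_h = h + 2 * pad - kh + 1
    out_w = w + 2 * pad - kw + 1
    return [
        [
            sum(padded[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def convolve2d(image, kernel, pad=0):
    """Convolution is cross-correlation with the kernel flipped along both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate2d(image, flipped, pad)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
derivative = [[-1, 0, 1]]  # horizontal derivative filter
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # unnormalized averaging filter

print(cross_correlate2d(image, derivative))  # positive response: values increase to the right
print(convolve2d(image, derivative))         # same magnitude, opposite sign
print(cross_correlate2d(image, box, pad=1))  # padding of 1 preserves the 3x3 size
```

For a symmetric kernel such as the box filter the two operations coincide, which is why the distinction only shows up with the antisymmetric derivative filter.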

Chapter 5 - Detection, Segmentation, Captioning (35 MB)
Localization and classification. OverFeat: class-agnostic versus class-specific localization, fully convolutional neural networks, greedy merge strategy. Multi-object detection. Region proposal and selective search. R-CNN, Fast R-CNN, Faster R-CNN, R-FCN, YOLO and SSD. Image segmentation. Semantic segmentation and transposed convolutions. Instance segmentation and Mask R-CNN. Image captioning. Recurrent Neural Networks (RNNs). Language generation. Long Short-Term Memory networks (LSTMs). DeepImageSent, Show and Tell, and Show, Attend and Tell algorithms.

Chapter 6 - Generation, Super-Resolution, Style transfer (49 MB)
Image generation. Gaussian models for human faces, limits and relations with linear neural networks. Generative adversarial networks (GANs), generators, discriminators, adversarial loss and two-player games. Convolutional GAN and image arithmetic. Super-resolution. Nearest-neighbor, bilinear and bicubic interpolation. Image sharpening. Linear inverse problems, Tikhonov and Total-Variation regularization. Super-Resolution CNN, VDSR, Fast SRCNN, SRGAN, perceptual, adversarial and content losses. Style transfer: Gatys model, content loss and style loss.
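
As a small illustration of the interpolation baselines mentioned above, the sketch below upsamples a tiny grayscale image with nearest-neighbor and bilinear interpolation. It is a simplified example, not the course code: the function names, the coordinate convention (output pixel `i` maps to input coordinate `i / factor`), and the clamping at the image border are all assumptions.

```python
def sample_nearest(img, y, x):
    """Value of the pixel closest to the real-valued location (y, x)."""
    h, w = len(img), len(img[0])
    i = min(max(int(y + 0.5), 0), h - 1)  # round half up, then clamp to the image
    j = min(max(int(x + 0.5), 0), w - 1)
    return img[i][j]


def sample_bilinear(img, y, x):
    """Weighted average of the four pixels surrounding (y, x)."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1)  # clamp coordinates to the image domain
    x = min(max(x, 0.0), w - 1)
    i0, j0 = int(y), int(x)  # top-left neighbor (floor, since y, x >= 0)
    i1, j1 = min(i0 + 1, h - 1), min(j0 + 1, w - 1)
    dy, dx = y - i0, x - j0
    top = (1 - dx) * img[i0][j0] + dx * img[i0][j1]
    bottom = (1 - dx) * img[i1][j0] + dx * img[i1][j1]
    return (1 - dy) * top + dy * bottom


def upsample(img, factor, sampler):
    """Hypothetical helper: resize by an integer factor using the given sampler."""
    h, w = len(img), len(img[0])
    return [
        [sampler(img, i / factor, j / factor) for j in range(w * factor)]
        for i in range(h * factor)
    ]


img = [[0, 10], [20, 30]]
print(upsample(img, 2, sample_nearest))
print(upsample(img, 2, sample_bilinear))
```

Nearest-neighbor only repeats existing values and produces blocky results, whereas bilinear interpolation creates intermediate values; neither can recover detail lost at low resolution, which is the gap the learned super-resolution methods aim to close.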

Assignments (individual)

Assignment 0 - Python, NumPy and Matplotlib (prerequisite, optional)
Assignment 1 - Backpropagation
Assignment 2 - CNNs and PyTorch
Assignment 3 - Transfer Learning
Assignment 4 - Image Denoising

Connection script: launch-pod.sh
MNIST: dataset, MNISTtools.py
Caltech-UCSD-Birds: dataset, train.csv, val.csv
BSDS: dataset
nntools package: nntools.py

Last modified: Fri Aug 23 13:52:53 UTC 2019