[Audio] Introduction to Deep Learning Pabitra Mitra Indian Institute of Technology Kharagpur [email protected] NSM Workshop on Accelerated Data Science.
[Audio] Deep Learning Based on neural networks Uses deep architectures Very successful in many applications.
[Audio] Perceptron Input values x1, ..., xm with weights w1, ..., wm and a bias b are combined by a summing function into the induced field v; an activation function maps v to the output y.
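As a concrete illustration, here is a minimal NumPy sketch of the perceptron just described; the step threshold at zero and the sample input values are assumptions for illustration only.

```python
import numpy as np

# Minimal sketch of the perceptron slide: weighted sum plus bias, passed
# through a step activation. The threshold at zero and the values below
# are illustrative assumptions.
def perceptron(x, w, b):
    v = np.dot(w, x) + b          # induced field v = sum_i w_i * x_i + b
    y = 1.0 if v >= 0 else 0.0    # step activation on the induced field
    return y

x = np.array([0.5, -1.0, 2.0])    # inputs x1..xm
w = np.array([0.4, 0.3, -0.1])    # weights w1..wm
b = 0.1                           # bias
print(perceptron(x, w, b))
```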
[Audio] Neuron Models The choice of activation function φ determines the neuron model. Examples: step function: φ(v) = a if v < c, and b if v ≥ c; ramp function: φ(v) = a if v < c, b if v > d, and a + (v − c)(b − a)/(d − c) otherwise; sigmoid function with x, y, z parameters: φ(v) = z + 1 / (1 + exp(−x·v + y)); Gaussian function: φ(v) = (1 / (√(2π)·σ)) · exp(−(1/2)·((v − μ)/σ)²).
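The activation functions above can be written out directly; this is a sketch, and the exact parameterisations (a, b, c, d for step/ramp; x, y, z for the sigmoid; μ, σ for the Gaussian) are assumptions where the slide text was garbled.

```python
import numpy as np

# Sketch of the neuron models listed above. Parameter names follow the
# reconstructed formulas and are assumptions where the slide was garbled.
def step(v, a, b, c):
    return np.where(v < c, a, b)

def ramp(v, a, b, c, d):
    return np.where(v < c, a,
                    np.where(v > d, b, a + (v - c) * (b - a) / (d - c)))

def sigmoid(v, x=1.0, y=0.0, z=0.0):
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

v = np.linspace(-3, 3, 7)
print(step(v, 0, 1, 0), ramp(v, 0, 1, -1, 1), sigmoid(v), gaussian(v), sep="\n")
```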
[Audio] Sigmoid unit Inputs x1, ..., xn (with x0 = 1) and weights w0, w1, ..., wn are summed to net = Σ_{i=0}^{n} w_i x_i, and the output is o = f(net), where f is the sigmoid function f(x) = 1 / (1 + e^(−x)). Its derivative is easily computed from the logistic equation: df(x)/dx = f(x)(1 − f(x)). The sigmoid is used in many applications; other activation functions are possible (e.g. tanh). A single unit is trained with the gradient descent rule; multilayer networks use backpropagation.
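A quick numerical check of the logistic-derivative identity quoted above; the test point x = 0.7 and the finite-difference step size are arbitrary choices for illustration.

```python
import numpy as np

# The logistic derivative identity df/dx = f(x)(1 - f(x)),
# checked against a central finite difference at an arbitrary point.
f = lambda x: 1.0 / (1.0 + np.exp(-x))
x = 0.7
analytic = f(x) * (1.0 - f(x))
numeric = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6
print(analytic, numeric)   # the two values agree closely
```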
[Audio] Multi-layer feed-forward NN (FFNN) The FFNN is a more general network architecture in which there are hidden layers between the input and output layers. Hidden nodes do not directly receive inputs from, nor send outputs to, the external environment. FFNNs overcome the limitation of single-layer NNs: they can handle non-linearly separable learning tasks. The figure shows a 3-4-2 network with an input layer, one hidden layer, and an output layer.
[Audio] Backpropagation Initialize all weights to small random numbers. Repeat: for each training example, 1. Input the training example to the network and compute the network outputs. 2. For each output unit k: δ_k ← o_k (1 − o_k)(t_k − o_k). 3. For each hidden unit h: δ_h ← o_h (1 − o_h) Σ_{k ∈ outputs} w_{k,h} δ_k. 4. Update each network weight w_{j,i}: w_{j,i} ← w_{j,i} + Δw_{j,i}, where Δw_{j,i} = η δ_j x_{j,i}.
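A minimal NumPy sketch of this update rule for the 3-4-2 FFNN of the previous slide; the sigmoid units follow the earlier slide, while the learning rate η = 0.5 and the random training example are illustrative assumptions, not part of the algorithm statement.

```python
import numpy as np

# Backpropagation sketch for a 3-4-2 feed-forward network with sigmoid units.
# One stochastic update per example, following the slide's delta rules.
rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_in, n_hid, n_out, eta = 3, 4, 2, 0.5
W1 = rng.normal(0, 0.1, (n_hid, n_in + 1))   # hidden weights (last column = bias, x0 = 1)
W2 = rng.normal(0, 0.1, (n_out, n_hid + 1))  # output weights (last column = bias)

def train_step(x, t):
    x1 = np.append(x, 1.0)                   # input plus bias term
    o_h = sigmoid(W1 @ x1)                   # hidden activations
    h1 = np.append(o_h, 1.0)
    o_k = sigmoid(W2 @ h1)                   # network outputs
    delta_k = o_k * (1 - o_k) * (t - o_k)                       # output-unit errors
    delta_h = o_h * (1 - o_h) * (W2[:, :n_hid].T @ delta_k)     # hidden-unit errors
    W2 += eta * np.outer(delta_k, h1)        # Delta w_{j,i} = eta * delta_j * x_{j,i}
    W1 += eta * np.outer(delta_h, x1)
    return o_k

x, t = rng.random(n_in), np.array([1.0, 0.0])
for _ in range(100):
    out = train_step(x, t)
print(out)   # outputs move toward the target [1, 0]
```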
[Audio] NN DESIGN ISSUES Data representation Network Topology Network Parameters Training Validation.
[Audio] Expressiveness Every bounded continuous function can be approximated with arbitrarily small error by a network with one hidden layer (Cybenko et al. '89): a hidden layer of sigmoid units and an output layer of linear units. Any function can be approximated to arbitrary accuracy by a network with two hidden layers (Cybenko '88): sigmoid units in both hidden layers and an output layer of linear units.
[Audio] Choice of Architecture For neural networks, the choice of architecture is a trade-off between training-set error and generalization error.
[Audio] Motivation for Depth.
[Audio] Motivation: Mimic the Brain Structure The figure compares the brain, where sensory input passes through coupled layers of neurons performing mid/low-level feature extraction before higher-level learning and decision making, with an end-to-end neural architecture that maps the input signal through stacked layers of neurons to a decision.
[Audio] Motivation Practical success in computer vision, signal processing, and text mining; increase in the volume and complexity of data; availability of GPUs.
[Audio] Convolutional Neural Network: Motivation.
CNN.
Convolutional Layer, illustrated on a VGG-style stack: image, conv-64, conv-64, maxpool, conv-128, conv-128, maxpool, conv-256, conv-256, maxpool, conv-512, conv-512, maxpool, conv-512, conv-512, maxpool, FC-4096, FC-4096, FC-1000, softmax. For a 224x224x3 input and 64 filters, every output neuron is connected to a 3x3x3 array of inputs, and the layer can be implemented efficiently with convolutions.
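A short PyTorch sketch of such a convolutional layer (PyTorch is named later in these slides); the 64 filters of size 3x3 with padding 1 match the 224x224x3 to 224x224x64 mapping described, and the random input tensor is illustrative.

```python
import torch
import torch.nn as nn

# Convolutional layer: 64 filters of size 3x3 over a 3-channel 224x224 image,
# so every output neuron is connected to a 3x3x3 patch of the input.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)
x = torch.randn(1, 3, 224, 224)      # a batch of one RGB image
y = conv(x)
print(y.shape)                       # torch.Size([1, 64, 224, 224])
```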
Pooling Layer Downsampling of each activation map, e.g. from 224x224 to 112x112, applied at the maxpool stages of the same VGG-style stack.
Max Pooling Layer Single depth slice x, 2x2 filter, stride 2:
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4
max pool
6 8
3 4
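The same pooling operation, reproduced in PyTorch on the 4x4 depth slice above; the 2x2 window with stride 2 is implied by the 4x4 to 2x2 result.

```python
import torch
import torch.nn.functional as F

# Max pooling the single 4x4 depth slice with a 2x2 window and stride 2.
x = torch.tensor([[1., 1., 2., 4.],
                  [5., 6., 7., 8.],
                  [3., 2., 1., 0.],
                  [1., 2., 3., 4.]]).reshape(1, 1, 4, 4)
print(F.max_pool2d(x, kernel_size=2, stride=2).squeeze())
# tensor([[6., 8.],
#         [3., 4.]])
```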
Fully Connected Layer, at the end of the same stack (FC-4096, FC-4096, FC-1000): it maps the [7x7x512] volume to [1x1x4096] "neurons". Every "neuron" in the output: 1. computes a dot product between the input and its weights, plus a bias; 2. thresholds it at zero.
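A PyTorch sketch of this fully connected layer; the flatten-then-linear formulation and the ReLU threshold follow the description above, while the random input is illustrative.

```python
import torch
import torch.nn as nn

# Fully connected layer: flatten the [7x7x512] volume, then each of the
# 4096 output "neurons" computes a dot product plus bias, thresholded at zero.
x = torch.randn(1, 512, 7, 7)
fc = nn.Sequential(nn.Flatten(), nn.Linear(7 * 7 * 512, 4096), nn.ReLU())
print(fc(x).shape)    # torch.Size([1, 4096])
```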
Every layer of a ConvNet has the same API: - Takes a 3D volume of numbers (width x height x depth) - Outputs a 3D volume of numbers - Constraint: the function must be differentiable. The full pipeline maps the input image to output probabilities [1x1x1000].
Example activation maps from a stack of CONV, ReLU, and POOL layers followed by a fully-connected layer; output classes include truck, car, airplane, ship, and horse.
The whole stack of conv, maxpool, FC, and softmax layers is a single differentiable function from the [224x224x3] image to probabilities over 1000 classes, e.g. cat 0.2, dog 0.4, chair 0.09, bagel 0.01, banana 0.3.
Training Loop until tired: 1. Sample a batch of data 2. Forward it through the network to get predictions 3. Backprop the errors 4. Update the weights.
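A minimal PyTorch sketch of this loop; the model, synthetic batch, optimiser, and iteration count are assumptions for illustration, and only the four numbered steps come from the slide.

```python
import torch
import torch.nn as nn

# Minimal training loop: sample, forward, backprop, update.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                       # "loop until tired"
    images = torch.randn(64, 3, 32, 32)       # 1. sample a batch of data
    labels = torch.randint(0, 10, (64,))
    logits = model(images)                    # 2. forward pass -> predictions
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()                           # 3. backprop the errors
    optimizer.step()                          # 4. update the weights
```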
[Audio] ResNet: CNN + skip connections; analogous to pyramidal cells in the cortex.
[Audio] Full ResNet architecture: stack residual blocks. Every residual block has two 3x3 conv layers. Periodically, double the number of filters and downsample spatially using stride 2 (in each dimension). There is an additional conv layer at the beginning, and no FC layers at the end (only an FC-1000 layer to output the classes).
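A PyTorch sketch of a single residual block for the equal-dimension case (no filter doubling or stride-2 downsampling); the batch-norm placement follows the common ResNet recipe and is an assumption here.

```python
import torch
import torch.nn as nn

# One residual block: two 3x3 conv layers plus a skip connection that adds
# the block's input to its output before the final ReLU.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)            # skip connection

x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64)(x).shape)             # torch.Size([1, 64, 56, 56])
```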
DenseNet.
[Audio] Challenges of Depth Overfitting – dropout Vanishing gradient – ReLU activation Accelerating training – batch normalization Hyperparameter tuning.
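A sketch of how these remedies typically appear in a PyTorch model definition; the layer sizes and dropout rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

# ReLU against vanishing gradients, batch normalization to accelerate
# training, dropout against overfitting.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # batch normalization
    nn.ReLU(),             # ReLU activation
    nn.Dropout(p=0.5),     # dropout
    nn.Linear(256, 10),
)
model.train()              # dropout and batch norm behave differently at eval time
print(model(torch.randn(32, 784)).shape)   # torch.Size([32, 10])
```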
[Audio] Computational Complexity.
[Audio] Types of Deep Architectures RNN, LSTM (sequence learning); stacked autoencoders (representation learning); GAN (classification, distribution learning). Combining architectures – unified backprop works if all layers are differentiable. Frameworks: TensorFlow, PyTorch.
[Audio] References Deep Learning – Ian Goodfellow, Yoshua Bengio, and Aaron Courville; Stanford Deep Learning course.