So, what is the gradient descent algorithm?
- Given a feed-forward network, we apply gradient descent as the fundamental training operation (a minimal sketch in Python follows this list):
1. Randomly initialize b, w1, w2, ..., wm
2. Repeat until convergence:
3. Predict y(i) for each data point in the training set
4. Calculate the loss J(b, w)
5. Calculate the gradient of J(b, w) and form the new parameters, where a is the learning rate:
   b(new)  = b(old)  - a · ∂J/∂b
   w1(new) = w1(old) - a · ∂J/∂w1
   w2(new) = w2(old) - a · ∂J/∂w2
   ...
   wm(new) = wm(old) - a · ∂J/∂wm
6. Update b, w1, w2, ..., wm simultaneously
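Below is a minimal sketch of that loop, assuming the simplest possible case of a single linear unit (y_hat = w·x + b) trained with a mean-squared-error loss; the variable names (X, y, alpha, epochs) and the synthetic data are illustrative choices, not something specified in the post.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, epochs=1000):
    n, m = X.shape
    rng = np.random.default_rng(0)
    w = rng.normal(size=m)              # step 1: randomly initialize w1..wm
    b = rng.normal()                    #         ...and b

    for _ in range(epochs):             # step 2: repeat (fixed epochs stand in for "until convergence")
        y_hat = X @ w + b               # step 3: predict y(i) for every training point
        error = y_hat - y
        J = np.mean(error ** 2)         # step 4: loss J(b, w)

        grad_w = 2 * X.T @ error / n    # step 5: gradient of J with respect to each w
        grad_b = 2 * np.mean(error)     #         ...and with respect to b

        w = w - alpha * grad_w          # step 6: update all parameters simultaneously
        b = b - alpha * grad_b
    return b, w, J

# Tiny usage example on synthetic data generated from y = 2x + 1
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
b, w, J = gradient_descent(X, y, alpha=0.05, epochs=2000)
print(b, w, J)                          # b ≈ 1, w ≈ [2], J ≈ 0
```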
- When "Louis Augustin Cauchy" needed a function to find local minima he used idea of slope to iteratively move in direction guided by slope to reach local minima.
- Using the same idea in feed forward networks leads to convergence of minimum error.
Why the gradient anyway?
If the gradient were simply set to 1, or in other words if we did not use gradient descent at all, every update would subtract the same fixed amount regardless of how steep or flat the curve is at that point. We would march straight down to the x-axis and land at whatever point that happens to be, which is generally not the required local minimum. The gradient is what scales the step: large where the slope is steep and shrinking to zero as we approach the minimum, so the updates settle where the error is lowest.
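The difference is easy to see on a toy example (my own illustration, not from the post): minimizing J(w) = w², whose minimum is at w = 0.

```python
# Toy illustration: minimize J(w) = w**2.
# With the true gradient, the step shrinks as the slope flattens and w settles at 0.
# With the "gradient" forced to 1, every step has the same size and w walks right past 0.

def grad(w):                # dJ/dw for J(w) = w**2
    return 2 * w

alpha = 0.1                 # learning rate

w = 3.0                     # proper gradient descent
for _ in range(50):
    w = w - alpha * grad(w)
print("with gradient:", round(w, 4))        # ~0.0, the minimum

w = 3.0                     # pretend the gradient is always 1
for _ in range(50):
    w = w - alpha * 1.0
print("gradient forced to 1:", round(w, 4)) # -2.0, overshot the minimum
```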