DELTA BAR DELTA (JACOBS) - MULTI LAYER NETWORK HEURISTIC ALGORITHM

Since the cost surface for multi-layer networks can be complex, choosing a learning rate can be difficult. What works in one location of the cost surface may not work well in another location. Delta-Bar-Delta is a heuristic algorithm for modifying the learning rate as training progresses:
  • Each weight has its own learning rate.
  • For each weight, the gradient at the current timestep is compared with an average of the gradients at previous timesteps.
  • If the current gradient has the same sign as the averaged previous gradient, the learning rate for that weight is increased.
  • If it has the opposite sign, the learning rate for that weight is decreased.
  • Should be used with batch learning only.
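Let $g_{ij}(t)$ be the gradient of $E$ with respect to $w_{ij}$ at time $t$, and define the exponentially averaged gradient
\[ \bar{g}_{ij}(t) = (1-\beta)\, g_{ij}(t) + \beta\, \bar{g}_{ij}(t-1). \]
Then, in the form given by Jacobs, the learning rate $\mu_{ij}$ for weight $w_{ij}$ at time $t+1$ is given by
\[ \mu_{ij}(t+1) = \begin{cases} \mu_{ij}(t) + \kappa & \text{if } \bar{g}_{ij}(t-1)\, g_{ij}(t) > 0 \\ (1-\gamma)\, \mu_{ij}(t) & \text{if } \bar{g}_{ij}(t-1)\, g_{ij}(t) < 0 \\ \mu_{ij}(t) & \text{otherwise} \end{cases} \]
where $\beta$, $\gamma$, and $\kappa$ are chosen by hand.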
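The rule can be sketched in Python with NumPy as follows. This is a minimal sketch, not a reference implementation. It assumes the usual form of Jacobs' rule: the learning rate grows additively when the current gradient agrees in sign with the averaged previous gradient, shrinks multiplicatively when it disagrees, and the average is an exponential one. The function name `delta_bar_delta_step`, the argument names, and the default values of the three hand-chosen parameters (`kappa` for the additive increase, `gamma` for the decrease fraction, `beta` for the averaging weight) are illustrative assumptions.

```python
import numpy as np

def delta_bar_delta_step(w, grad, g_bar, lr, kappa=0.01, gamma=0.1, beta=0.7):
    """One batch update with per-weight learning rates (assumed Jacobs form).

    w, grad, g_bar and lr are arrays of the same shape: the weights, the
    current batch gradient, the averaged previous gradient, and the
    per-weight learning rates. Defaults are illustrative, not recommended.
    """
    # The sign of this product says whether the gradient kept its direction.
    agreement = grad * g_bar

    lr = np.where(
        agreement > 0, lr + kappa,                      # same direction: add kappa
        np.where(agreement < 0, lr * (1.0 - gamma),     # opposite: shrink by gamma
                 lr),                                   # zero product: unchanged
    )

    # Gradient descent step, each weight using its own learning rate.
    w = w - lr * grad

    # Exponential average of gradients, used at the next timestep.
    g_bar = (1.0 - beta) * grad + beta * g_bar
    return w, g_bar, lr


# Example call on three weights: the first gradient agrees with its average,
# the second disagrees, and the third has no gradient history yet.
w = np.array([1.0, -2.0, 0.5])
grad = np.array([0.5, -1.0, 0.2])
g_bar = np.array([0.3, 0.4, 0.0])
lr = np.full(3, 0.1)
w, g_bar, lr = delta_bar_delta_step(w, grad, g_bar, lr)
# lr is now approximately [0.11, 0.09, 0.10]
```

In a training loop, `grad` would be the gradient accumulated over the whole batch, and `g_bar` and `lr` would be carried from one call to the next, with `g_bar` starting at zero.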
Downsides:
  • Choosing good values for the three hand-set parameters is not easy.
  • Doesn't work for online learning; as noted above, it should be used with batch learning only.