Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent
N. N. Schraudolph. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent. Neural Computation, 14(7):1723–1738, 2002.
Earlier version
Abstract
We propose a generic method for iteratively approximating various second-order gradient steps (Newton, Gauss-Newton, Levenberg-Marquardt, and natural gradient) in linear time per iteration, using special curvature matrix-vector products that can be computed in O(n). Two recent acceleration techniques for online learning, matrix momentum and stochastic meta-descent (SMD), in fact implement this approach. Since both were originally derived by very different routes, this offers fresh insight into their operation, resulting in further improvements to SMD.
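The curvature matrix-vector products the abstract refers to follow the general pattern of Pearlmutter's R-operator, which the paper builds on: differentiate the gradient along a direction v rather than forming the n-by-n matrix, so each product costs about as much as one gradient evaluation. A minimal sketch in JAX of the Hessian-vector case, using forward-over-reverse autodiff and a toy squared-error loss of my own choosing (not code from the paper):

    # Sketch of a fast Hessian-vector product: H v without ever
    # building H, via forward-over-reverse automatic differentiation.
    import jax
    import jax.numpy as jnp

    def loss(w, x, y):
        # Toy squared-error loss for a linear model; stands in for
        # any twice-differentiable objective.
        return 0.5 * jnp.sum((x @ w - y) ** 2)

    def hvp(w, v, x, y):
        # Differentiate the gradient along direction v: the tangent
        # output of jvp(grad) is exactly H v, at O(n) cost.
        grad_fn = lambda w: jax.grad(loss)(w, x, y)
        return jax.jvp(grad_fn, (w,), (v,))[1]

    # Usage: n = 3 parameters, 5 data points.
    key = jax.random.PRNGKey(0)
    w = jnp.array([0.5, -1.0, 2.0])
    v = jnp.array([1.0, 0.0, 0.0])
    x = jax.random.normal(key, (5, 3))
    y = jnp.ones(5)
    print(hvp(w, v, x, y))  # for this loss H = x.T @ x, so this is its first column

The same differentiate-along-a-direction pattern yields products with the Gauss-Newton and Fisher (natural-gradient) matrices, which is what lets iterative solvers approximate the corresponding second-order steps in linear time per iteration.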
BibTeX Entry
@article{Schraudolph02,
  author    = {Nicol N. Schraudolph},
  title     = {\href{http://nic.schraudolph.org/pubs/Schraudolph02.pdf}{Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent}},
  journal   = {\href{http://neco.mitpress.org/}{Neural Computation}},
  volume    = 14,
  number    = 7,
  pages     = {1723--1738},
  year      = 2002,
  b2h_type  = {Journal Papers},
  b2h_topic = {>Stochastic Meta-Descent},
  b2h_note  = {<a href="b2hd-Schraudolph01.html">Earlier version</a>},
  abstract  = {We propose a generic method for iteratively approximating various second-order gradient steps\,---\,Newton, Gauss-Newton, Levenberg-Marquardt, and natural gradient\,---\,in {\em linear}\/ time per iteration, using special curvature matrix-vector products that can be computed in O(n). Two recent acceleration techniques for online learning, {\em matrix momentum}\/ and {\em stochastic meta-descent}\/ (SMD), in fact implement this approach. Since both were originally derived by very different routes, this offers fresh insight into their operation, resulting in further improvements to SMD.}
}