Step Size Adaptation in Reproducing Kernel Hilbert Space
S. V. N. Vishwanathan, N. N. Schraudolph, and A. J. Smola. Step Size Adaptation in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research, 7:1107–1133, 2006.
Abstract
This paper presents an online Support Vector Machine (SVM) that uses the Stochastic Meta-Descent (SMD) algorithm to adapt its step size automatically. We formulate the online learning problem as a stochastic gradient descent in Reproducing Kernel Hilbert Space (RKHS) and translate SMD to the nonparametric setting, where its gradient trace parameter is no longer a coefficient vector but an element of the RKHS. We derive efficient updates that allow us to perform the step size adaptation in linear time. We apply the online SVM framework to a variety of loss functions, and in particular show how to handle structured output spaces and achieve efficient online multiclass classification. Experiments show that our algorithm outperforms more primitive methods for setting the gradient step size.
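As a rough illustration of the setting the abstract describes, the following minimal sketch runs online stochastic gradient descent in an RKHS with the hinge loss. It is not the paper's algorithm: the step size `eta` is fixed rather than adapted by SMD, and the class name `OnlineKernelSGD`, the RBF kernel, the XOR toy data, and all parameter values are assumptions chosen only for illustration.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel; the kernel choice is an assumption of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class OnlineKernelSGD:
    """Online SGD in an RKHS with hinge loss and a FIXED step size.

    The hypothesis is stored as a kernel expansion
    f(.) = sum_i coeffs[i] * k(points[i], .).
    The step size adaptation from the paper (SMD) is NOT implemented here.
    """

    def __init__(self, eta=0.5, lam=0.01, kernel=rbf_kernel):
        self.eta = eta        # fixed step size (hypothetical value)
        self.lam = lam        # regularization strength (hypothetical value)
        self.kernel = kernel
        self.points = []      # expansion points seen so far
        self.coeffs = []      # one coefficient per expansion point

    def predict(self, x):
        return sum(a * self.kernel(p, x)
                   for a, p in zip(self.coeffs, self.points))

    def update(self, x, y):
        """One stochastic gradient step on the regularized hinge loss."""
        margin = y * self.predict(x)
        # The regularizer shrinks every existing coefficient.
        decay = 1.0 - self.eta * self.lam
        self.coeffs = [decay * a for a in self.coeffs]
        # The hinge loss contributes a new kernel term only on a margin error.
        if margin < 1.0:
            self.points.append(x)
            self.coeffs.append(self.eta * y)
        return margin


def train_xor(epochs=20):
    """Repeatedly stream the four XOR points through the learner."""
    data = [((0.0, 0.0), -1), ((1.0, 1.0), -1),
            ((0.0, 1.0), 1), ((1.0, 0.0), 1)]
    model = OnlineKernelSGD(eta=0.5, lam=0.01)
    for _ in range(epochs):
        for x, y in data:
            model.update(x, y)
    return model, data
```

Calling `train_xor()` returns the trained model and the toy data. Every prediction sums over all stored expansion points, so the cost of a step grows with the number of margin errors seen so far. With a fixed `eta`, the step size has to be tuned by hand, which is the part the paper replaces with automatic adaptation.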
BibTeX Entry
@article{VisSchSmo06,
  author = {S.~V.~N. Vishwanathan and Nicol N. Schraudolph and Alex J. Smola},
  title = {\href{http://nic.schraudolph.org/pubs/VisSchSmo06.pdf}{Step Size Adaptation in Reproducing Kernel Hilbert Space}},
  journal = jmlr,
  volume = 7,
  pages = {1107--1133},
  year = 2006,
  b2h_type = {Journal Papers},
  b2h_topic = {>Stochastic Meta-Descent, Kernel Methods},
  abstract = {This paper presents an online Support Vector Machine (SVM) that uses the Stochastic Meta-Descent (SMD) algorithm to adapt its step size automatically. We formulate the online learning problem as a stochastic gradient descent in Reproducing Kernel Hilbert Space (RKHS) and translate SMD to the nonparametric setting, where its gradient trace parameter is no longer a coefficient vector but an element of the RKHS. We derive efficient updates that allow us to perform the step size adaptation in linear time. We apply the online SVM framework to a variety of loss functions, and in particular show how to handle structured output spaces and achieve efficient online multiclass classification. Experiments show that our algorithm outperforms more primitive methods for setting the gradient step size.}
}