Towards closing the gap between the theory and practice of SVRG - Equipe Signal, Statistique et Apprentissage Accéder directement au contenu
Communication Dans Un Congrès Année : 2019

Towards closing the gap between the theory and practice of SVRG

Résumé

Amongst the very first variance reduced stochastic methods for solving the empiri- cal risk minimization problem was the SVRG method [11]. SVRG is an inner-outer loop based method, where in the outer loop a reference full gradient is evaluated, after which m ∈ N steps of an inner loop are executed where the reference gradient is used to build a variance reduced estimate of the current gradient. The simplicity of the SVRG method and its analysis has lead to multiple extensions and variants for even non-convex optimization. Yet there is a significant gap between the param- eter settings that the analysis suggests and what is known to work well in practice. Our first contribution is that we take several steps here towards closing this gap. In particular, the current analysis shows that m should be of the order of the condition number so that the resulting method has a favourable complexity. Yet in practice m = n works well irregardless of the condition number, where n is the number of data points. Furthermore, the current analysis shows that the inner iterates have to be reset using averaging after every outer loop. Yet in practice SVRG works best when the inner iterates are updated continuously and not reset. We provide an anal- ysis of these aforementioned practical settings and show that they achieve the same favourable complexity as the original analysis (with slightly better constants). Our second contribution is to provide a more general analysis than had been previously done by using arbitrary sampling, which allows us to analyse virtually all forms of mini-batching through a single theorem. Since our setup and analysis reflects what is done in practice, we are able to set the parameters such as the mini-batch size and step size using our theory in such a way that produces a more efficient algorithm in practice, as we show in extensive numerical experiments.
Fichier principal
Vignette du fichier
1908.02725.pdf (12.37 Mo) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-02365285 , version 1 (15-11-2019)

Identifiants

  • HAL Id : hal-02365285 , version 1

Citer

Othmane Sebbouh, Nidham Gazagnadou, Samy Jelassi, Francis Bach, Robert M Gower. Towards closing the gap between the theory and practice of SVRG. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada. ⟨hal-02365285⟩
65 Consultations
50 Téléchargements

Partager

Gmail Facebook X LinkedIn More