Improved SVRG for non-strongly convex or sum-of-non-convex objectives, International Conference on Machine Learning, pp.1080-1089, 2016.
Stochastic gradient push for distributed deep learning, 2018.
Neuro-Dynamic Programming, Athena Scientific, 1996.
Optimization methods for large-scale machine learning, SIAM Review, vol.60, issue.2, pp.223-311, 2018.
On the linear convergence of the stochastic gradient method with constant step-size, pp.1-9, 2017.
Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications, SIAM Journal on Optimization, vol.28, issue.4, pp.2783-2808, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01569426
LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, p.27, 2011.
Convergence diagnostics for stochastic gradient descent with constant learning rate, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, vol.84, pp.9-11, 2018.
Primal method for ERM with flexible mini-batching schemes and non-convex losses, 2015.
Importance sampling for minibatches, Journal of Machine Learning Research, vol.19, issue.27, pp.1-21, 2018.
Fast and simple PCA via convex optimization, 2015.
Optimal mini-batch and step sizes for SAGA, 36th International Conference on Machine Learning, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02005431
Stochastic quasi-gradient methods: Variance reduction via Jacobian sketching, 2018.
Accelerated coordinate descent with arbitrary sampling and best rates for minibatches, 2018.
Train faster, generalize better: stability of stochastic gradient descent, 33rd International Conference on Machine Learning, 2016.
Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization, The Journal of Machine Learning Research, vol.15, issue.1, pp.2489-2512, 2014.
Nonconvex variance reduced optimization with arbitrary sampling, 2018.
Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition, Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp.795-811, 2016.
Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent, Advances in Neural Information Processing Systems, pp.5330-5340, 2017.
Momentum and stochastic momentum for stochastic gradient, proximal point and subspace descent methods, 2017.
Convergence analysis of inexact randomized iterative methods, 2019.
The power of interpolation: Understanding the effectiveness of SGD in modern overparametrized learning, Workshop and Conference Proceedings, vol.80, pp.3331-3340, 2018.
Non-asymptotic analysis of stochastic approximation algorithms for machine learning, Advances in Neural Information Processing Systems, pp.451-459, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00608041
Linear convergence of first order methods for non-strongly convex optimization, Mathematical Programming, pp.1-39, 2018.
Batched stochastic gradient descent with weighted sampling, Springer Proceedings in Mathematics & Statistics, pp.279-306, 2017.
Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, Mathematical Programming, Series A, vol.155, issue.1, pp.549-573, 2016.
On Cesari's convergence of the steepest descent method for approximating saddle point of convex-concave functions, Soviet Mathematics Doklady, vol.19, 1978.
Problem complexity and method efficiency in optimization, 1983.
Robust stochastic approximation approach to stochastic programming, SIAM Journal on Optimization, vol.19, issue.4, pp.1574-1609, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00976649
Introductory Lectures on Convex Optimization: A Basic Course, vol.87, 2013.
SGD and Hogwild! Convergence without the bounded gradients assumption, Proceedings of the 35th International Conference on Machine Learning, vol.80, pp.3750-3758, 2018.
Coordinate descent with arbitrary sampling I: Algorithms and complexity, Optimization Methods and Software, vol.31, issue.5, pp.829-857, 2016.
Coordinate descent with arbitrary sampling II: Expected separable overapproximation, Optimization Methods and Software, vol.31, issue.5, pp.858-884, 2016.
Quartz: Randomized dual coordinate ascent with arbitrary sampling, Advances in Neural Information Processing Systems, pp.865-873, 2015.
Making gradient descent optimal for strongly convex stochastic optimization, 29th International Conference on Machine Learning, vol.12, pp.1571-1578, 2012.
Hogwild!: A lock-free approach to parallelizing stochastic gradient descent, Advances in Neural Information Processing Systems, pp.693-701, 2011.
On optimal probabilities in stochastic coordinate descent methods, Optimization Letters, vol.10, issue.6, pp.1233-1243, 2016.
Parallel coordinate descent methods for big data optimization, Mathematical Programming, vol.156, issue.1-2, pp.433-484, 2016.
Stochastic reformulations of linear systems: algorithms and convergence theory, 2017.
A stochastic approximation method, The Annals of Mathematical Statistics, pp.400-407, 1951.
Fast convergence of stochastic gradient descent under a strong growth condition, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00855113
SDCA without duality, regularization, and individual convexity, International Conference on Machine Learning, pp.747-754, 2016.
Pegasos: primal estimated subgradient solver for SVM, 24th International Conference on Machine Learning, pp.807-814, 2007.
Stochastic gradient descent for nonsmooth optimization: Convergence results and optimal averaging schemes, Proceedings of the 30th International Conference on Machine Learning, pp.71-79, 2013.
Fast and faster convergence of SGD for over-parameterized models and an accelerated perceptron, 2018.