F. R. Bach and E. Moulines, Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning, NIPS, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00608041

A. Bellet, A. Habrard, and M. Sebban, A Survey on Metric Learning for Feature Vectors and Structured Data, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01666935

A. Bellet, A. Habrard, and M. Sebban, Metric Learning, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01121733

L. Bottou and O. Bousquet, The Tradeoffs of Large Scale Learning, NIPS, 2007.

S. Clémençon, G. Lugosi, and N. Vayatis, Ranking and empirical minimization of U-statistics, Ann. Statist., vol.36, 2008.

S. Clémençon, S. Robbiano, and J. Tressou, Maximal deviations of incomplete U-processes with applications to Empirical Risk Sampling, SDM, 2013.

S. Clémençon, On U-processes and clustering performance, NIPS, pp.37-45, 2011.

S. Clémençon, P. Bertail, and E. Chautru, Scaling up M-estimation via sampling designs: The Horvitz-Thompson stochastic gradient descent, IEEE Big Data, 2014.

A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, NIPS, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01016843

B. Delyon, Stochastic Approximation with Decreasing Gain: Convergence and Asymptotic Theory, 2000.

G. Fort, Central limit theorems for stochastic approximation with controlled Markov chain dynamics, ESAIM: PS, 2014.
DOI : 10.1051/ps/2014013
URL : https://hal.archives-ouvertes.fr/hal-00861097

J. Fürnkranz, E. Hüllermeier, and S. Vanderlooy, Binary Decomposition Methods for Multipartite Ranking, ECML/PKDD, pp.359-374, 2009.

A. Herschtal and B. Raskutti, Optimising area under the ROC curve using gradient descent, ICML, p.49, 2004.
DOI : 10.1145/1015330.1015366

S. Janson, The asymptotic distributions of incomplete U-statistics, Z. Wahrsch. verw. Gebiete, vol.66, pp.495-505, 1984.
DOI : 10.1007/bf00531887

R. Johnson and T. Zhang, Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, NIPS, pp.315-323, 2013.

P. Kar, B. Sriperumbudur, P. Jain, and H. Karnick, On the Generalization Ability of Online Learning Algorithms for Pairwise Loss Functions, ICML, 2013.

H. J. Kushner and G. Yin, Stochastic approximation and recursive algorithms and applications, vol.35, 2003.

N. Le Roux, M. W. Schmidt, and F. Bach, A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, NIPS, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00674995

A. J. Lee, U-Statistics: Theory and Practice, 1990.

J. Mairal, Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning, 2014.

D. Needell, R. Ward, and N. Srebro, Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm, NIPS, pp.1017-1025, 2014.
DOI : 10.1007/s10107-015-0864-7
URL : http://arxiv.org/pdf/1310.5715

A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust Stochastic Approximation Approach to Stochastic Programming, SIAM Journal on Optimization, vol.19, issue.4, pp.1574-1609, 2009.
DOI : 10.1137/070704277
URL : https://hal.archives-ouvertes.fr/hal-00976649

Y. Nesterov, Introductory lectures on convex optimization, vol.87, 2004.
DOI : 10.1007/978-1-4419-8853-9

M. Norouzi, D. J. Fleet, and R. Salakhutdinov, Hamming Distance Metric Learning, NIPS, pp.1070-1078, 2012.

M. Pelletier, Weak convergence rates for stochastic approximation with application to multiple targets and simulated annealing, Ann. Appl. Prob., 1998.
DOI : 10.1214/aoap/1027961032
URL : https://doi.org/10.1214/aoap/1027961032

Q. Qian, R. Jin, J. Yi, L. Zhang, and S. Zhu, Efficient Distance Metric Learning by Adaptive Sampling and Mini-Batch Stochastic Gradient Descent (SGD), Machine Learning, vol.99, pp.353-372, 2015.
DOI : 10.1007/s10994-014-5456-x
URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-014-5456-x.pdf

P. Zhao, S. Hoi, R. Jin, and T. Yang, Online AUC Maximization, ICML, pp.233-240, 2011.

P. Zhao and T. Zhang, Stochastic Optimization with Importance Sampling for Regularized Loss Minimization, ICML, 2015.