On Tree-based Methods for Similarity Learning

Stéphan Clémençon; Robin Vogel

Conference proceedings, Year: 2019

Stéphan Clémençon

Role: Author
PersonId: 174491
IdHAL: stephan-clemencon
ORCID: 0000-0002-5879-9500
IdRef: 08905203X

Laboratoire Traitement et Communication de l'Information

Département Images, Données, Signal

Signal, Statistique et Apprentissage

Robin Vogel

Role: Author

Signal, Statistique et Apprentissage

Abstract

In many situations, the choice of an adequate similarity measure or metric on the feature space largely determines the performance of machine learning methods. Automatically building such measures is the specific purpose of metric/similarity learning. In [21], similarity learning is formulated as a pairwise bipartite ranking problem: ideally, the larger the probability that two observations in the feature space belong to the same class (or share the same label), the higher the similarity measure between them. From this perspective, the ROC curve is an appropriate performance criterion, and the goal of this article is to extend recursive tree-based ROC optimization techniques in order to propose efficient similarity learning algorithms. The validity of such iterative partitioning procedures in the pairwise setting is established by means of results from the theory of U-processes. From a practical angle, we discuss at length how to implement them by means of splitting rules specifically tailored to the similarity learning task. Beyond these theoretical and methodological contributions, numerical experiments are presented that provide strong empirical evidence of the performance of the algorithmic approaches we propose.
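To make the pairwise bipartite ranking view concrete, the following minimal sketch scores every pair of observations with a similarity function, labels a pair as positive when both observations share the same class, and computes the empirical area under the ROC curve over pairs. It is an illustration under stated assumptions, not the tree-based procedure of the article: the function names `pairwise_auc` and `neg_euclidean`, the choice of negative Euclidean distance as similarity, and the synthetic two-class data are all hypothetical.

```python
import numpy as np


def pairwise_auc(X, y, similarity):
    """Empirical AUC of a similarity function over all pairs i < j.

    A pair is positive when both observations share the same label.
    """
    n = len(y)
    i, j = np.triu_indices(n, k=1)  # indices of all unordered pairs
    scores = np.array([similarity(X[a], X[b]) for a, b in zip(i, j)])
    same = y[i] == y[j]
    pos, neg = scores[same], scores[~same]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative pair")
    # Fraction of (positive pair, negative pair) couples ranked correctly;
    # ties count for one half.
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def neg_euclidean(a, b):
    """Hypothetical baseline similarity: closer points are more similar."""
    return -float(np.linalg.norm(a - b))


# Synthetic data (assumption): two Gaussian classes shifted apart.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 2)) + 3.0 * y[:, None]
auc = pairwise_auc(X, y, neg_euclidean)
```

A value of `auc` close to 1 means that same-class pairs tend to receive higher similarity than different-class pairs; a learned similarity would be compared against such a baseline using the same criterion.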

Domains

Machine Learning [stat.ML]; Probability [math.PR]; Statistics [math.ST]

Main file

1906.09243.pdf (684.42 KB)

Origin: Files produced by the author(s)

https://telecom-paris.hal.science/hal-02461801

Submitted on: Thursday, January 30, 2020, 23:00:09

Last modified on: Monday, April 15, 2024, 15:14:30

Dates and versions

hal-02461801, version 1 (30-01-2020)

Identifiers

HAL Id: hal-02461801, version 1

Cite

Stéphan Clémençon, Robin Vogel. On Tree-based Methods for Similarity Learning. 2019, In: Nicosia G., Pardalos P., Umeton R., Giuffrida G., Sciacca V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. ⟨hal-02461801⟩

Collections

INSTITUT-TELECOM PARISTECH LTCI IDS S2A IP_PARIS

31 views

122 downloads
