Group nonnegative matrix factorisation with speaker and session variability compensation for speaker identification

Romain Serizel 1, 2 Slim Essid 1, 2 Gael Richard 1, 2
1 S2A - Signal, Statistique et Apprentissage
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract :

This paper presents a feature learning approach for speaker identification that is based on nonnegative matrix factorisation. Recent studies have shown that, with such models, the dictionary atoms can represent speaker identity well. The approaches proposed so far have focused only on speaker variability and not on session variability. However, the latter is a crucial factor in the success of the I-vector approach, which is now the state of the art in speaker identification.

This paper proposes a method that relies on group nonnegative matrix factorisation and is inspired by the I-vector training procedure. In doing so, the proposed approach aims to capture both speaker variability and session variability. Results on a small corpus show that the proposed approach can be competitive with I-vectors.
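For readers unfamiliar with the underlying factorisation, the following is a minimal sketch of plain nonnegative matrix factorisation with the standard multiplicative updates for the Euclidean cost. It is not the group NMF proposed in the paper: the speaker and session constraints are not modelled here, and the function names, the toy data and the parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (features x frames) as W @ H.

    W holds the dictionary atoms (one per column) and H the activations.
    Generic multiplicative updates for the Euclidean cost; this is an
    illustrative baseline, not the paper's group NMF.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Strictly positive random initialisation.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations, then dictionary; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H))


# Hypothetical toy data: a nonnegative matrix of rank at most 5.
rng = np.random.default_rng(1)
V = rng.random((20, 5)) @ rng.random((5, 60))
W, H = nmf(V, rank=5)
```

In a speaker identification setting, V would typically be a nonnegative time-frequency representation of the speech signal, and the activations H (or statistics derived from them) would serve as the learned features.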

Document type :
Conference papers

https://hal.telecom-paristech.fr/hal-02288453
Contributor : Telecomparis Hal
Submitted on : Saturday, September 14, 2019 - 6:50:10 PM
Last modification on : Thursday, October 17, 2019 - 12:37:03 PM

Identifiers

  • HAL Id : hal-02288453, version 1

Citation

Romain Serizel, Slim Essid, Gael Richard. Group nonnegative matrix factorisation with speaker and session variability compensation for speaker identification. ICASSP, Mar 2016, Shanghai, China. pp.5470 - 5474. ⟨hal-02288453⟩
