Downbeat Detection with Conditional Random Fields and Deep Learned Features

Simon Durand 1, 2, Slim Essid 1, 2
1 S2A - Signal, Statistique et Apprentissage
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract:

In this paper, we introduce a novel Conditional Random Field (CRF) system that detects the downbeat sequence of musical audio signals. Feature functions are computed from four deep learned representations based on harmony, rhythm, melody and bass content to take advantage of the high-level and multi-faceted nature of this task. Because downbeats are dynamic, the CRF classification system allows us to combine our features with an adapted temporal model in a fully data-driven fashion. Because some meters are under-represented in our training set, we show that data augmentation, which takes this class imbalance into account, enables a statistically significant improvement of the results. An evaluation of different configurations of our system on nine datasets shows its efficiency and potential compared with a heuristic-based approach and four downbeat tracking algorithms.
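
To make the idea of combining several feature streams with a temporal model more concrete, here is a minimal sketch of Viterbi decoding in a linear-chain CRF. It is not the authors' system: the four evidence streams, their equal weights, the 4-beat bar, and the hand-set transition scores are all made-up assumptions for illustration, whereas the paper learns its feature functions and temporal model from data.

```python
def viterbi_decode(unary, transition):
    """Return the highest-scoring label sequence of a linear-chain model.

    unary[t][k]      : score of label k at beat t
    transition[i][j] : score of moving from label i to label j
    """
    n_labels = len(transition)
    score = list(unary[0])
    backptrs = []
    for t in range(1, len(unary)):
        new_score, ptr = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j on beat t.
            best_i = max(range(n_labels), key=lambda i: score[i] + transition[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transition[best_i][j] + unary[t][j])
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final label.
    label = max(range(n_labels), key=lambda i: score[i])
    path = [label]
    for ptr in reversed(backptrs):
        label = ptr[label]
        path.append(label)
    return path[::-1]


# Assumption: labels are positions 0..3 inside a 4-beat bar; label 0 is the downbeat.
N_POSITIONS = 4
# Assumption: hand-set temporal model that rewards moving to the next bar position.
transition = [
    [0.0 if j == (i + 1) % N_POSITIONS else -5.0 for j in range(N_POSITIONS)]
    for i in range(N_POSITIONS)
]

# Assumption: made-up per-beat downbeat evidence from four feature streams.
streams = {
    "harmony": [0, 2, 0, 0, 0, 2, 0, 0],
    "rhythm":  [0, 1, 0, 1, 0, 1, 0, 0],
    "melody":  [1, 1, 0, 0, 0, 1, 0, 0],
    "bass":    [0, 2, 0, 0, 0, 2, 0, 0],
}
weights = {name: 0.25 for name in streams}  # equal weights, not learned
n_beats = 8

# Combined evidence supports label 0 (downbeat); other labels get a neutral score.
evidence = [sum(weights[n] * streams[n][t] for n in streams) for t in range(n_beats)]
unary = [[evidence[t]] + [0.0] * (N_POSITIONS - 1) for t in range(n_beats)]

path = viterbi_decode(unary, transition)
downbeats = [t for t, label in enumerate(path) if label == 0]
print(downbeats)  # prints [1, 5]
```

In this toy example the rhythm and melody streams each contain one misleading peak (beats 3 and 0), yet the temporal model keeps the decoded downbeats at beats 1 and 5, where the combined evidence is strongest and the bar structure is consistent.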

https://hal.telecom-paristech.fr/hal-02288480
Contributor : Telecomparis Hal
Submitted on : Saturday, September 14, 2019 - 6:52:17 PM
Last modification on : Thursday, October 17, 2019 - 12:37:03 PM

Identifiers

  • HAL Id : hal-02288480, version 1

Citation

Simon Durand, Slim Essid. Downbeat Detection with Conditional Random Fields and Deep Learned Features. International Society for Music Information Retrieval (ISMIR), Aug 2016, New York City, United States. pp.386-392. ⟨hal-02288480⟩
