Active Learning for Auditory Hierarchy

William Coleman, Charlie Cullen, Ming Yan, Sarah Jane Delany

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Much audio content today is rendered as a static stereo mix: fundamentally a fixed single entity. Object-based audio envisages the delivery of sound content using a collection of individual sound ‘objects’ controlled by accompanying metadata. This offers potential for audio to be delivered in a dynamic manner providing enhanced audio for consumers. One example of such treatment is the concept of applying varying levels of data compression to sound objects thereby reducing the volume of data to be transmitted in limited bandwidth situations. This application motivates the ability to accurately classify objects in terms of their ‘hierarchy’. That is, whether or not an object is a foreground sound, which should be reproduced at full quality if possible, or a background sound, which can be heavily compressed without causing a deterioration in the listening experience. Lack of suitably labelled data is an acknowledged problem in the domain. Active Learning is a method that can greatly reduce the manual effort required to label a large corpus by identifying the most effective instances to train a model to high accuracy levels. This paper compares a number of Active Learning methods to investigate which is most effective in the context of a hierarchical labelling task on an audio dataset. Results show that the number of manual labels required can be reduced to 1.7% of the total dataset while still retaining high prediction accuracy.
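The labelling loop that Active Learning relies on can be illustrated with a minimal pool-based uncertainty-sampling sketch. Everything here is an assumption for illustration: the abstract does not say which Active Learning methods were compared, so a nearest-centroid classifier stands in for the Support Vector Machine named in the keywords, the two-dimensional data is synthetic rather than audio features, and the helper names (`fit`, `margin`, `active_learn`), the seed set and the budget are hypothetical.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(pool, labelled):
    """Stand-in model: one centroid per class (0 = background, 1 = foreground).
    Assumes at least one labelled instance of each class."""
    by_class = {0: [], 1: []}
    for i, y in labelled.items():
        by_class[y].append(pool[i])
    return {y: centroid(pts) for y, pts in by_class.items()}

def predict(model, x):
    """Assign x to the class with the nearer centroid."""
    return 0 if sq_dist(x, model[0]) <= sq_dist(x, model[1]) else 1

def margin(model, x):
    """Small margin means x is almost equidistant from both centroids (uncertain)."""
    return abs(sq_dist(x, model[0]) - sq_dist(x, model[1]))

def active_learn(pool, oracle, seed_idx, budget):
    """Query the oracle (the human annotator) for the most uncertain instance, `budget` times."""
    labelled = {i: oracle(i) for i in seed_idx}
    for _ in range(budget):
        model = fit(pool, labelled)
        unlabelled = [i for i in range(len(pool)) if i not in labelled]
        if not unlabelled:
            break
        query = min(unlabelled, key=lambda i: margin(model, pool[i]))
        labelled[query] = oracle(query)
    return fit(pool, labelled), labelled

# Synthetic stand-in for the audio dataset: two overlapping Gaussian clusters.
rng = random.Random(0)
pool = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)] + \
       [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(200)]
truth = [0] * 200 + [1] * 200  # hidden labels, revealed only when queried

model, labelled = active_learn(pool, lambda i: truth[i], seed_idx=[0, 200], budget=10)
accuracy = sum(predict(model, x) == y for x, y in zip(pool, truth)) / len(pool)
print(f"labelled {len(labelled)} of {len(pool)} instances, accuracy {accuracy:.3f}")
```

The ratio of `len(labelled)` to `len(pool)` plays the role of the labelling budget discussed in the abstract; the values used here are arbitrary and are not meant to reproduce the reported 1.7% figure.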

Original language: English
Title of host publication: Machine Learning and Knowledge Extraction - 4th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2020, Proceedings
Editors: Andreas Holzinger, Peter Kieseberg, A Min Tjoa, Edgar Weippl
Publisher: Springer
Pages: 365-384
Number of pages: 20
ISBN (Print): 9783030573201
Publication status: Published - 2020
Event: 4th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference for Machine Learning and Knowledge Extraction, CD-MAKE 2020 - Dublin, Ireland
Duration: 25 Aug 2020 to 28 Aug 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12279 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference for Machine Learning and Knowledge Extraction, CD-MAKE 2020
Country/Territory: Ireland
City: Dublin
Period: 25/08/20 to 28/08/20

Keywords

  • Active Learning
  • Auditory hierarchy
  • Machine Learning
  • Support Vector Machine
