An Exploration of the Latent Space of a Convolutional Variational Autoencoder for the Generation of Musical Instrument Tones

Anastasia Natsiou, Seán O’Leary, Luca Longo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Variational Autoencoders (VAEs) are among the most significant deep generative models for creating synthetic samples. In audio synthesis, VAEs have been widely used to generate natural and expressive sounds, such as music or speech. However, VAEs are often considered black boxes, and the attributes that contribute to the synthesis of a sound are not yet well understood. Existing research on how input data influence the formation of the latent space, and on how this latent space produces synthetic data, is still insufficient. In this manuscript, we investigate the interpretability of the latent space of VAEs and the impact of each attribute of this space on the generation of synthetic instrumental notes. The contribution of this research to the body of knowledge is to offer, for both the XAI and sound communities, an approach for interpreting how the latent space generates new samples. The approach is based on sensitivity and feature ablation analyses, and on descriptive statistics.
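The abstract names sensitivity and feature ablation analyses without detailing them. As a rough illustration only, the sketch below shows one common way such per-dimension scores can be computed for a decoder. It is not the authors' implementation: `toy_decoder`, the random weights, the zero baseline, and the mean-squared-change metric are all assumptions made for this example, and a trained convolutional VAE decoder would take the place of the toy function.

```python
import numpy as np


def toy_decoder(z, weights):
    """Hypothetical stand-in for a trained VAE decoder (assumption, not the paper's model)."""
    return np.tanh(z @ weights)


def ablation_importance(decoder, z, baseline=0.0):
    """Feature ablation: overwrite one latent dimension with a baseline value
    and record the mean squared change in the decoder output."""
    reference = decoder(z)
    scores = np.empty(z.shape[1])
    for d in range(z.shape[1]):
        z_ablated = z.copy()
        z_ablated[:, d] = baseline
        scores[d] = np.mean((decoder(z_ablated) - reference) ** 2)
    return scores


def sensitivity(decoder, z, eps=1e-3):
    """Finite-difference sensitivity: mean absolute output change per unit
    of perturbation applied to one latent dimension."""
    reference = decoder(z)
    scores = np.empty(z.shape[1])
    for d in range(z.shape[1]):
        z_shifted = z.copy()
        z_shifted[:, d] += eps
        scores[d] = np.mean(np.abs(decoder(z_shifted) - reference)) / eps
    return scores


# Illustrative setup: 4 latent dimensions, 16 output values, 32 latent samples.
rng = np.random.default_rng(0)
latent_dim, output_dim = 4, 16
weights = rng.normal(size=(latent_dim, output_dim))
weights[2] = 0.0  # deliberately make latent dimension 2 irrelevant to the output
z = rng.normal(size=(32, latent_dim))


def decoder(v):
    return toy_decoder(v, weights)


ablation_scores = ablation_importance(decoder, z)
sensitivity_scores = sensitivity(decoder, z)
print("ablation:", ablation_scores)
print("sensitivity:", sensitivity_scores)
```

In this toy setup, latent dimension 2 has no influence on the output, so both scores for it come out at zero while the other dimensions receive positive scores. Descriptive statistics (for example the mean and spread of these scores over many samples) could then be used to compare dimensions.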

Original language: English
Title of host publication: Explainable Artificial Intelligence - 1st World Conference, xAI 2023, Proceedings
Editors: Luca Longo
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 470-486
Number of pages: 17
ISBN (Print): 9783031440694
Publication status: Published - 2023
Event: 1st World Conference on eXplainable Artificial Intelligence, xAI 2023 - Lisbon, Portugal
Duration: 26 Jul 2023 to 28 Jul 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 1903 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 1st World Conference on eXplainable Artificial Intelligence, xAI 2023
Country/Territory: Portugal
City: Lisbon
Period: 26/07/23 to 28/07/23

Keywords

  • Audio Representations
  • Audio Synthesis
  • Explainable Artificial Intelligence (XAI)
  • Latent Feature Importance
  • Variational Autoencoders (VAE)
