Large Relative Delay Estimates for Speech Demixing in Assisted Living Environments: A Survey

Research output: Contribution to journal › Article › peer-review

Abstract

Precise relative delay estimation is important for speech demixing in Assisted Living (AL) scenarios, where large inter-microphone spacing can introduce phase-wraparound ambiguity. This survey comprehensively examines state-of-the-art methods for relative delay estimation and time-frequency (TF) masking, focusing on methods that leverage relative delays to compute accurate speech demixing masks. These methods demonstrate that TF masks can be robustly developed using estimates of spatial, spectral, and auditory features. This paper provides a critical analysis of relative delay estimation and speech demixing techniques in AL environments. By examining existing methods, we aim to contribute to the development of effective and robust solutions for speech separation in challenging AL listening conditions. The findings of this survey highlight the importance of considering phase-wraparound and the potential benefits of incorporating auditory features into TF masking algorithms. The survey also considers the concept of the Internet-of-Auditory-Things (IoAudiT) as a promising future technology for AL and a framework for future research. This work offers valuable insights for researchers and practitioners in the fields of speech processing, audio signal processing, and assistive technology. The findings and recommendations presented in this survey will contribute to the advancement of speech demixing techniques and their application in real-world AL scenarios.

Original language: English
Pages (from-to): 193626-193666
Number of pages: 41
Journal: IEEE Access
Volume: 13
Publication status: Published - 2025

Keywords

  • Assisted living
  • deep learning (DL)
  • hearing aid (HA)
  • inter-aural intensity differences (IIDs)
  • inter-aural time differences (ITDs)
  • machine learning (ML)
  • source separation (SS)
  • spatial covariance matrix (SCM)
  • steered-response power (SRP)
