TY - JOUR
T1 - A Causal Convolutional Approach for Packet Loss Concealment in Low Powered Devices
AU - Davy, Steven
AU - Belton, Niamh
AU - Tobin, Joshua
AU - Bin Zuber, Owais
AU - Dong, Liu
AU - Xuewen, Yuan
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - This paper presents a deep learning model for audio Packet Loss Concealment (PLC) for real-time communications that is accurate and lightweight, with a low inference time suitable for low-powered mobile handsets. We leverage dilated causal convolutions to track short-term, time-dependent features of previous audio, making the architecture fully convolutional. The model is semi-autoregressive, meaning it can work autoregressively and non-autoregressively depending on audio loss length and model output size. Whilst existing solutions can perform PLC up to 120 ms, our proposed model can perform PLC for packet losses up to 200 ms with an inference time of 51 ms on a CPU and a model size of 4.19 MB in TensorFlow Lite. We also show how the inference time can be decreased by increasing the model output size without any decrease in model accuracy or significant increase in model size. The model is assessed in terms of RMSE, PLC-MOS, STOI, PESQ and inference time, and compared to two baseline methods.
AB - This paper presents a deep learning model for audio Packet Loss Concealment (PLC) for real-time communications that is accurate and lightweight, with a low inference time suitable for low-powered mobile handsets. We leverage dilated causal convolutions to track short-term, time-dependent features of previous audio, making the architecture fully convolutional. The model is semi-autoregressive, meaning it can work autoregressively and non-autoregressively depending on audio loss length and model output size. Whilst existing solutions can perform PLC up to 120 ms, our proposed model can perform PLC for packet losses up to 200 ms with an inference time of 51 ms on a CPU and a model size of 4.19 MB in TensorFlow Lite. We also show how the inference time can be decreased by increasing the model output size without any decrease in model accuracy or significant increase in model size. The model is assessed in terms of RMSE, PLC-MOS, STOI, PESQ and inference time, and compared to two baseline methods.
KW - Deep Learning
KW - Packet Loss Concealment
UR - http://www.scopus.com/inward/record.url?scp=85169456839&partnerID=8YFLogxK
U2 - 10.1109/ICASSP49357.2023.10096505
DO - 10.1109/ICASSP49357.2023.10096505
M3 - Conference article
AN - SCOPUS:85169456839
SN - 1520-6149
JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Y2 - 4 June 2023 through 10 June 2023
ER -