TY - JOUR
T1 - Efficient feature extraction and classification for the development of Pashto speech recognition system
AU - Ahmed, Irfan
AU - Irfan, Muhammad Abeer
AU - Iqbal, Abid
AU - Khalil, Amaad
AU - Siddiqui, Salman Ilahi
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
PY - 2024/5
Y1 - 2024/5
N2 - In this work, a novel framework for the efficient feature extraction and recognition of Pashto speech signals is proposed. The targeted language is one of the low-resource languages and prone to higher Automatic Speech Recognition (ASR) errors due to the availability of its colloquial dialects. We devised a framework which not only employed classical Machine Learning (ML) models for speech recognition tasks, but also achieved a higher level of performance accuracy by using the optimal feature extraction techniques. The designed frameworks for feature extraction are based on two well-known feature extraction techniques: Discrete Wavelet Transform (DWT) coefficients and Mel-Frequency Cepstral Coefficients (MFCC). In our work, we deployed classical ML models, i.e., Support Vector Machine (SVM) and K-Nearest Neighbors (k-NN), due to their efficiency in terms of computational complexity, energy efficiency, and higher accuracy as compared to other ML and Deep Learning (DL) models. Hence, our proposed framework exhibited an improved performance level when trained on a Pashto isolated words dataset.
AB - In this work, a novel framework for the efficient feature extraction and recognition of Pashto speech signals is proposed. The targeted language is one of the low-resource languages and prone to higher Automatic Speech Recognition (ASR) errors due to the availability of its colloquial dialects. We devised a framework which not only employed classical Machine Learning (ML) models for speech recognition tasks, but also achieved a higher level of performance accuracy by using the optimal feature extraction techniques. The designed frameworks for feature extraction are based on two well-known feature extraction techniques: Discrete Wavelet Transform (DWT) coefficients and Mel-Frequency Cepstral Coefficients (MFCC). In our work, we deployed classical ML models, i.e., Support Vector Machine (SVM) and K-Nearest Neighbors (k-NN), due to their efficiency in terms of computational complexity, energy efficiency, and higher accuracy as compared to other ML and Deep Learning (DL) models. Hence, our proposed framework exhibited an improved performance level when trained on a Pashto isolated words dataset.
KW - Automatic speech recognition (ASR)
KW - DWT
KW - Feature extraction
KW - k-NN
KW - Machine learning (ML)
KW - MFCC
KW - SVM
UR - https://www.scopus.com/pages/publications/85178186975
U2 - 10.1007/s11042-023-17684-w
DO - 10.1007/s11042-023-17684-w
M3 - Article
AN - SCOPUS:85178186975
SN - 1380-7501
VL - 83
SP - 54081
EP - 54096
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 18
ER -