Sparse representation and reproduction of speech signals in complex Fourier basis


Kwek, Lee Chung and Tan, Alan Wee Chiat and Lim, Heng Siong and Tan, Cheah Heng and Alaghbari, Khaled Ab. Aziz (2022) Sparse representation and reproduction of speech signals in complex Fourier basis. International Journal of Speech Technology. pp. 1-7. ISSN 1381-2416



Sparse representation concerns the task of finding the most compact representation of a signal as a linear combination of basis vectors drawn from an overcomplete dictionary. Because the problem is non-convex, approximate, suboptimal solutions are commonly used; one such method is the orthogonal matching pursuit (OMP) algorithm. OMP is an iterative greedy algorithm that, at each step, selects the basis vector most correlated with the current residual. Past work has mostly used real-valued dictionaries, since the signals of interest are themselves real-valued. For speech representation, complex dictionaries are intuitively appealing because audio signals are generally assumed to be mixtures of exponentials with time-varying amplitudes and phases. However, sparse representation of speech signals based on complex dictionaries has received less attention, mainly because the measurements are normally real-valued. In this paper, we pursue this intuition by modelling the complex dictionary on the popular discrete Fourier transform and introducing a new orthogonalization mechanism in the OMP for such cases. Customizing the conventional OMP algorithm to the complex setting enables high-quality, compact representation of speech signals with low computational complexity. Experimental results demonstrate that the proposed approach retains high perceptual similarity between the reconstructed speech signals and the originals.
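
The greedy selection step described in the abstract can be illustrated with a minimal sketch of conventional OMP over a complex Fourier dictionary. This is not the paper's new orthogonalization mechanism, which the abstract does not specify. The function names `dft_dictionary` and `omp_complex`, the 2x oversampled dictionary, and the choice to add each selected atom together with its complex conjugate (so that a real signal keeps a real approximation) are all assumptions made for illustration.

```python
import numpy as np


def dft_dictionary(n, m):
    """Return an n x m dictionary of unit-norm complex exponentials.

    Hypothetical helper: m == n gives the ordinary DFT basis,
    m > n gives an overcomplete (oversampled) Fourier dictionary.
    """
    t = np.arange(n)[:, None]
    k = np.arange(m)[None, :]
    return np.exp(2j * np.pi * k * t / m) / np.sqrt(n)


def omp_complex(x, D, n_iter):
    """Conventional OMP with a least-squares re-fit on a complex dictionary.

    Assumption (not from the paper): each selected atom k is added together
    with its conjugate atom (m - k) % m, so the fit of a real x stays real.
    """
    m = D.shape[1]
    x = np.asarray(x, dtype=complex)
    support = []
    coeffs = np.zeros(0, dtype=complex)
    residual = x.copy()
    for _ in range(n_iter):
        # Correlate every atom with the current residual.
        corr = np.abs(D.conj().T @ residual)
        if support:
            corr[support] = 0.0  # never reselect an atom
        k = int(np.argmax(corr))
        for idx in (k, (m - k) % m):
            if idx not in support:
                support.append(idx)
        # Orthogonal projection of x onto the span of the selected atoms.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    return support, coeffs, residual


# Toy example: one real sinusoid with amplitude 0.8 and phase 0.3 rad.
n, m = 64, 128
t = np.arange(n)
x = 0.8 * np.cos(2 * np.pi * 10 * t / m + 0.3)
D = dft_dictionary(n, m)
support, coeffs, residual = omp_complex(x, D, n_iter=1)
x_hat = D[:, support] @ coeffs
```

In this toy case the sinusoid is the sum of two conjugate exponentials, so a single iteration with conjugate pairing captures it, and the complex coefficients carry its amplitude and phase. Real speech frames would need more iterations and a stopping rule, neither of which is taken from the paper.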

Item Type: Article
Uncontrolled Keywords: Speech reproduction, Sparse representations, Complex Fourier bases
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800-8360 Electronics
Divisions: Faculty of Engineering and Technology (FET)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 19 Jan 2022 08:09
Last Modified: 28 Mar 2022 03:36

