City Research Online

Speaker recognition with hybrid features from a deep belief network

Ali, H., Tran, S. N., Benetos, E. & d'Avila Garcez, A. S. (2016). Speaker recognition with hybrid features from a deep belief network. Neural Computing and Applications, 29(6), pp. 13-19. doi: 10.1007/s00521-016-2501-7

Abstract

Learning representations from audio data has shown advantages over handcrafted features such as mel-frequency cepstral coefficients (MFCCs) in many audio applications. In most representation learning approaches, connectionist systems have been used to learn and extract latent features from fixed-length data. In this paper, we propose an approach that combines learned features and MFCC features for the speaker recognition task and that can be applied to audio scripts of different lengths. In particular, we study the use of features from different levels of a deep belief network (DBN) for quantizing the audio data into vectors of audio word counts. These vectors represent audio scripts of different lengths in a form that makes it easier to train a classifier. We show experimentally that the audio word count vectors generated from a mixture of DBN features at different layers give better performance than the MFCC features. We can also achieve further improvement by combining the audio word count vectors with the MFCC features.
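
The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a codebook of "audio words" is already available (for example, from clustering frame-level features), and the names `nearest_codeword` and `audio_word_counts`, the Euclidean distance, and the toy 2-D features are all hypothetical choices made for illustration. The point it shows is that clips with different numbers of frames map to count vectors of the same length.

```python
from collections import Counter
from math import dist


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))


def audio_word_counts(frames, codebook):
    """Map a variable-length list of frame features to a fixed-length count vector."""
    counts = Counter(nearest_codeword(f, codebook) for f in frames)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Hypothetical data: 2-D frame features and a 3-word codebook (assumptions,
# standing in for learned features and a learned codebook).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
short_clip = [(0.1, 0.0), (0.9, 1.2)]
long_clip = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9), (1.1, 0.8)]

short_vec = audio_word_counts(short_clip, codebook)
long_vec = audio_word_counts(long_clip, codebook)
print(short_vec, long_vec)
```

Both vectors have one entry per codeword regardless of clip length, which is what allows a standard classifier to be trained on them.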

Publication Type: Article
Additional Information: This is the peer reviewed version of the following article: Ali, H., Tran, S.N., Benetos, E. et al. Neural Comput & Applic (2016), which has been published in final form at http://dx.doi.org/10.1007/s00521-016-2501-7. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
Publisher Keywords: Deep belief networks, Deep learning, Mel-frequency cepstral coefficients
Subjects: R Medicine > RC Internal medicine > RC0321 Neuroscience. Biological psychiatry. Neuropsychiatry
Departments: School of Science & Technology > Computer Science
