City Research Online

Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme

Kotti, M., Benetos, E. & Kotropoulos, C. (2006). Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme. In: ISCAS. International Symposium on Circuits and Systems (ISCAS 2006), 21 - 24 May 2006, Island of Kos, Greece. doi: 10.1109/ISCAS.2006.1692970


This paper addresses unsupervised speaker change detection, a necessary step for several indexing tasks. We assume that there is no prior knowledge of either the number of speakers or their identities. Features included in the MPEG-7 Audio Prototype, such as the AudioWaveformEnvelope and the AudioSpectrumCentroid, are investigated. The model selection criterion is the Bayesian Information Criterion (BIC). A multiple pass algorithm is proposed. It uses dynamic thresholding for scalar features and a fusion scheme to refine the segmentation results. It also models every speaker by a multivariate Gaussian probability density function, and whenever new information is available, the respective model is updated. The experiments are carried out on a dataset created by concatenating speakers from the TIMIT database, referred to as the TIMIT data set. It is demonstrated that the performance of the proposed multiple pass algorithm is better than that of other approaches.
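As a rough illustration of how BIC can act as a model selection criterion for a candidate change point, the sketch below compares one Gaussian fitted to a whole analysis window against two Gaussians fitted to its left and right halves. It assumes the commonly used delta-BIC formulation for a single scalar feature (d = 1); the function names, the penalty weight `lam`, and the variance floor are hypothetical choices made for this example. It is not the paper's multiple pass algorithm, and it does not reproduce the dynamic thresholding or the fusion scheme.

```python
import math


def ml_variance(samples, floor=1e-12):
    """Maximum-likelihood (biased) variance, floored so log() stays finite."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return max(var, floor)


def delta_bic(left, right, lam=1.0):
    """Delta-BIC for splitting a window of scalar feature values in two.

    Positive values favour the two-Gaussian model, i.e. a change point
    between `left` and `right`. `lam` is an assumed penalty weight.
    """
    window = list(left) + list(right)
    n1, n2, n = len(left), len(right), len(window)
    d = 1  # scalar feature, so one mean and one variance per model

    # Log-likelihood gain of two Gaussians over a single Gaussian.
    gain = 0.5 * (
        n * math.log(ml_variance(window))
        - n1 * math.log(ml_variance(left))
        - n2 * math.log(ml_variance(right))
    )
    # Penalty for the extra parameters of the two-model hypothesis.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * math.log(n)
    return gain - lam * penalty


def is_change_point(left, right, lam=1.0):
    """Declare a change when the delta-BIC is positive."""
    return delta_bic(left, right, lam) > 0.0
```

Under these assumptions, two segments whose feature values cluster around clearly different means give a positive delta-BIC, while two segments drawn from the same values give a negative one, because only the penalty term remains.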

Publication Type: Conference or Workshop Item (Paper)
Additional Information: DOI: 10.1109/ISCAS.2006.1692970 © 2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Publisher Keywords: Bayesian information criterion, MPEG-7 audio prototype, TIMIT database, automatic speaker change detection, dynamic thresholding, fusion scheme, model selection criterion, multiple pass algorithm, multivariate Gaussian probability density function, unsupervised speaker change detection
Subjects: Q Science > QA Mathematics > QA76 Computer software
Departments: School of Science & Technology > Computer Science