Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme

Kotti, M., Benetos, E. & Kotropoulos, C. (2006). Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme. Paper presented at the International Symposium on Circuits and Systems (ISCAS 2006), 21 - 24 May 2006, Island of Kos, Greece.


Abstract

This paper addresses unsupervised speaker change detection, a necessary step for several indexing tasks. We assume that there is no prior knowledge of either the number of speakers or their identities. Features included in the MPEG-7 Audio Prototype, such as the AudioWaveformEnvelope and the AudioSpectrumCentroid, are investigated. The model selection criterion is the Bayesian Information Criterion (BIC). A multiple pass algorithm is proposed. It uses dynamic thresholding for scalar features and a fusion scheme to refine the segmentation results. It also models every speaker by a multivariate Gaussian probability density function, and whenever new information is available, the respective model is updated. The experiments are carried out on a dataset created by concatenating speakers from the TIMIT database, which is referred to as the TIMIT data set. It is demonstrated that the performance of the proposed multiple pass algorithm is better than that of other approaches.
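To make the role of the BIC concrete, the following is a minimal sketch of the widely used delta-BIC test for a single candidate change point between two adjacent segments, each modeled by one full-covariance multivariate Gaussian. It is not the multiple pass algorithm of the paper: dynamic thresholding, the fusion scheme, and model updating are omitted. The function names, the penalty weight `lam`, and the synthetic feature frames are assumptions made purely for illustration.

```python
import math
import random


def covariance(frames):
    """Maximum-likelihood covariance matrix of a list of d-dimensional frames."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / n
             for j in range(d)] for i in range(d)]


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky.

    Raises ValueError if the matrix is not positive definite.
    """
    d = len(matrix)
    chol = [[0.0] * d for _ in range(d)]
    total = 0.0
    for i in range(d):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def delta_bic(x, y, lam=1.0):
    """Delta-BIC between 'two Gaussians' and 'one Gaussian' for segments x, y.

    x and y are lists of feature frames (each frame a list of d floats).
    A positive value favors a speaker change at the boundary between them.
    lam is the penalty weight (an assumed tuning parameter, default 1.0).
    """
    z = x + y
    n_x, n_y, n_z = len(x), len(y), len(z)
    d = len(z[0])
    gain = 0.5 * (n_z * log_det(covariance(z))
                  - n_x * log_det(covariance(x))
                  - n_y * log_det(covariance(y)))
    # Number of free parameters of one Gaussian: d means + d(d+1)/2 covariances.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * math.log(n_z)
    return gain - penalty


def synthetic_segment(rng, n_frames, mean, dim=3):
    """Hypothetical stand-in for real feature frames: i.i.d. Gaussian values."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    speaker_a = synthetic_segment(rng, 200, 0.0)
    speaker_a_again = synthetic_segment(rng, 200, 0.0)
    speaker_b = synthetic_segment(rng, 200, 5.0)
    print("same source:     ", delta_bic(speaker_a, speaker_a_again))
    print("different source:", delta_bic(speaker_a, speaker_b))
```

In this sketch a change is declared when `delta_bic` is positive. With real audio, the frames would be feature vectors extracted from the signal, and `lam` would typically be tuned on held-out data.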

Item Type: Conference or Workshop Item (Paper)
Additional Information: DOI: 10.1109/ISCAS.2006.1692970 © 2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Uncontrolled Keywords: Bayesian information criterion, MPEG-7 audio prototype, TIMIT database, automatic speaker change detection, dynamic thresholding, fusion scheme, model selection criterion, multiple pass algorithm, multivariate Gaussian probability density function, unsupervised speaker change detection
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: School of Informatics > Department of Computing
URI: http://openaccess.city.ac.uk/id/eprint/2110
