City Research Online

Non-Negative Tensor Factorization Applied to Music Genre Classification

Benetos, E. & Kotropoulos, C. (2010). Non-Negative Tensor Factorization Applied to Music Genre Classification. IEEE Transactions on Audio, Speech & Language Processing, 18(8), pp. 1955-1967. doi: 10.1109/tasl.2010.2040784

Abstract

Music genre classification techniques are typically applied to a data matrix whose columns are the feature vectors extracted from music recordings. In this paper, a feature vector is extracted using a texture window of 1 s, which enables the representation of any 30 s long music recording as a time sequence of feature vectors, thus yielding a feature matrix. Consequently, by stacking the feature matrices associated with the recordings of a dataset, a tensor is created, which motivates studying music genre classification using tensors. First, a novel algorithm for non-negative tensor factorization (NTF) is derived that extends non-negative matrix factorization. Several variants of the NTF algorithm emerge by employing different cost functions from the class of Bregman divergences. Second, a novel supervised NTF classifier is proposed, which trains a basis for each class separately and employs basis orthogonalization. A variety of spectral, temporal, perceptual, energy, and pitch descriptors is extracted from 1000 recordings of the GTZAN dataset, which are distributed across 10 genre classes. The performance of the NTF classifier is compared against that of the multilayer perceptron and support vector machines by applying stratified 10-fold cross-validation. A genre classification accuracy of 78.9% is reported for the NTF classifier, demonstrating the superiority of this multilinear classifier over several state-of-the-art classifiers that operate on data matrices.
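The stacking and factorization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a rank-R PARAFAC-style model fitted with standard multiplicative updates under the squared Euclidean (Frobenius) cost, which is one member of the Bregman divergence class. The tensor sizes, the rank, the synthetic data, and the function names `ntf_frobenius`, `reconstruct`, and `rel_error` are all hypothetical choices made for illustration.

```python
import numpy as np

def ntf_frobenius(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Non-negative PARAFAC-style factorization of a 3-way tensor.

    Uses multiplicative updates for the Frobenius cost (an assumption;
    the paper derives variants for several Bregman divergences).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1
    for _ in range(n_iter):
        # Each factor is scaled by (data term) / (model term); both are
        # non-negative, so the factors stay non-negative.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def reconstruct(A, B, C):
    """Rebuild the tensor from its three factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def rel_error(X, A, B, C):
    """Relative Frobenius reconstruction error."""
    return np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)

# Hypothetical sizes: 8 features, 30 one-second texture windows, 20 recordings.
n_features, n_frames, n_recordings, true_rank = 8, 30, 20, 3
rng = np.random.default_rng(42)
A0 = rng.random((n_features, true_rank))
B0 = rng.random((n_frames, true_rank))
C0 = rng.random((n_recordings, true_rank))

# One synthetic non-negative feature matrix (features x frames) per recording.
feature_matrices = [(A0 * C0[k]) @ B0.T for k in range(n_recordings)]

# Stacking the per-recording matrices along a third mode yields the tensor.
X = np.stack(feature_matrices, axis=2)

err_init = rel_error(X, *ntf_frobenius(X, true_rank, n_iter=0))
A, B, C = ntf_frobenius(X, true_rank, n_iter=200)
err_final = rel_error(X, A, B, C)
print(X.shape, err_init, err_final)
```

Calling `ntf_frobenius` with `n_iter=0` returns the random initialization for the same seed, so `err_init` and `err_final` can be compared directly. A supervised classifier along the lines of the abstract would additionally train one basis per genre class and orthogonalize it, which this sketch does not attempt.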

Publication Type: Article
Additional Information: © 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Publisher Keywords: Content based retrieval, Cost function, Data mining, Feature extraction, Informatics, Multilayer perceptrons, Music information retrieval, Stacking, Support vector machines, Tensile stress
Subjects: M Music and Books on Music > M Music
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology > Computer Science
