Neural ProbabilisticModels for Melody Prediction, Sequence Labelling and Classification
Cherla, S (2016). Neural ProbabilisticModels for Melody Prediction, Sequence Labelling and Classification. (Unpublished Doctoral thesis, City, University of London)
Abstract
Data-driven sequence models have long played a role in the analysis and generation of musical information. Such models are of interest in computational musicology, computer-aided music composition, and tools for music education among other applications. This dissertation beginswith an experiment tomodel sequences of musical pitch in melodies with a class of purely data-driven predictive models collectively known as Connectionist models. It was demonstrated that a set of six such models could performon par with, or better than state-of-the-art n-gram models previously evaluated in an identical setting. A new model known as the Recurrent
Temporal Discriminative Restricted Boltzmann Machine (RTDRBM), was introduced in the process and found to outperform the rest of the models. A generalisation
of this modelling task was also explored, and involved extending the set of musical features used as input by the models while still predicting pitch as before. The improvement in predictive performance which resulted from adding these new input features is encouraging for future work in this direction.
Based on the above success of the RTDRBM, its application was extended to a non-musical sequence labelling task, namely Optical Character Recognition. This extension involved a modification to the model’s original prediction algorithm as a result of relaxing an assumption specific to the melody modelling task. The generalised model was evaluated on a benchmark dataset and compared against a set of 8 baseline models where it faired better than all of them. Furthermore, a theoretical extension to an existingmodel which was also employed in the above pitch prediction task - the Discriminative Restricted Boltzmann Machine (DRBM) - was
proposed. This led to three new variants of the DRBM (which originally contained Logistic Sigmoid hidden layer activations), withHyperbolic Tangent, Binomial and
Rectified Linear hidden layer activations respectively. The first two of these have been evaluated here on the benchmark MNIST dataset and shown to perform on par with the original DRBM.
Publication Type: | Thesis (Doctoral) |
---|---|
Subjects: | M Music and Books on Music > M Music T Technology > T Technology (General) |
Departments: | Doctoral Theses School of Science & Technology > School of Science & Technology Doctoral Theses School of Science & Technology > Computer Science |
Download (979kB) | Preview
Export
Downloads
Downloads per month over past year