Monaural speech separation with deep learning using phase modelling and capsule networks
Staines, T., Weyde, T. ORCID: 0000-0001-8028-9905 & Galkin, O. (2019). Monaural speech separation with deep learning using phase modelling and capsule networks. 2019 27th European Signal Processing Conference (EUSIPCO), 2019-S, doi: 10.23919/EUSIPCO.2019.8902655
Abstract
The removal of background noise from speech audio is a problem with high practical relevance. A variety of deep learning approaches have been applied to it in recent years, most of which operate on a magnitude spectrogram representation of a noisy recording to estimate the isolated speaking voice. This work investigates ways to include phase information, which is commonly discarded, firstly within a convolutional neural network (CNN) architecture, and secondly by applying capsule networks, to our knowledge the first time capsules have been used in source separation. We present a Circular Loss function, which takes into account the periodic nature of phase. Our results show that the inclusion of phase information leads to an improvement in the quality of speech separation. We also find that in our experiments convolutional neural networks outperform capsule networks at speech separation.
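The abstract does not give the definition of the Circular Loss, so the following is only a minimal sketch of one common way to make a loss respect the periodicity of phase. It is an assumption, not the authors' formulation. The function names `circular_loss` and `wrapped_difference` are hypothetical, and the `1 - cos` form is chosen purely for illustration.

```python
import numpy as np

def wrapped_difference(phase_pred, phase_true):
    """Phase difference wrapped onto the principal interval around zero."""
    diff = np.asarray(phase_pred, dtype=float) - np.asarray(phase_true, dtype=float)
    # exp(1j * diff) lies on the unit circle; np.angle returns its principal angle.
    return np.angle(np.exp(1j * diff))

def circular_loss(phase_pred, phase_true):
    """Illustrative periodic loss: mean of 1 - cos(phase difference).

    Assumed form for demonstration only; the paper's Circular Loss
    may be defined differently.
    """
    diff = np.asarray(phase_pred, dtype=float) - np.asarray(phase_true, dtype=float)
    # cos is 2*pi-periodic, so phases that differ by a full turn cost nothing.
    return float(np.mean(1.0 - np.cos(diff)))

def squared_error(phase_pred, phase_true):
    """Plain mean squared error on raw phase values, for comparison."""
    diff = np.asarray(phase_pred, dtype=float) - np.asarray(phase_true, dtype=float)
    return float(np.mean(diff ** 2))

# Two phases that sit next to each other on the circle, on either side of the +/- pi boundary.
pred = np.array([np.pi - 0.01])
true = np.array([-np.pi + 0.01])

near_boundary_circular = circular_loss(pred, true)  # small: the angles are nearly equal
near_boundary_squared = squared_error(pred, true)   # large: raw values are almost 2*pi apart
```

Under these assumptions, a loss on raw phase values penalises a prediction near the wrap-around boundary heavily even though it is almost correct, whereas the periodic form treats the two angles as close.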
Publication Type: | Article |
---|---|
Additional Information: | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Departments: | School of Science & Technology > Computer Science |