Two-Dimensional Convolutional Recurrent Neural Networks for Speech Activity Detection

Vafeiadis, Anastasios; Fanioudakis, Eleftherios; Potamitis, Ilyas; Votis, Konstantinos; Giakoumis, Dimitrios; Tzovaras, Dimitrios; Chen, Liming; Hamzaoui, Raouf

oai:dora.dmu.ac.uk:2086/18174

Authors: Anastasios Vafeiadis, Eleftherios Fanioudakis, Ilyas Potamitis, Konstantinos Votis, Dimitrios Giakoumis, Dimitrios Tzovaras, Liming Chen, Raouf Hamzaoui
Publication date: 17 June 2019
Publisher: International Speech Communication Association

Abstract

Speech Activity Detection (SAD) plays an important role in mobile communications and automatic speech recognition (ASR). Developing efficient SAD systems for real-world applications is challenging because of the presence of noise. We propose a new approach to SAD in which we treat it as a two-dimensional multilabel image classification problem. To classify the audio segments, we compute their short-time Fourier transform (STFT) spectrograms and classify them with a Convolutional Recurrent Neural Network (CRNN), traditionally used in image recognition. Our CRNN uses a sigmoid activation function, max-pooling in the frequency domain, and a convolutional operation that acts as a moving average filter to remove misclassified spikes. On the development set of Task 1 of the 2019 Fearless Steps Challenge, our system achieved a decision cost function (DCF) of 2.89%, a 66.4% improvement over the baseline. Moreover, it achieved a DCF score of 3.318% on the evaluation dataset of the challenge, ranking first among all submissions.
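
The moving average step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel width, the 0.5 threshold, the edge padding, and the names `moving_average`, `smooth_decisions`, and `frame_probs` are all assumptions chosen for illustration. The sketch convolves frame-level sigmoid outputs with a uniform kernel so that an isolated misclassified spike is averaged away before thresholding.

```python
def moving_average(probs, width):
    """Convolve frame-level speech probabilities with a uniform kernel.

    Edges are padded by repeating the first and last value so the output
    has the same length as the input (an assumption of this sketch).
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    padded = [probs[0]] * half + list(probs) + [probs[-1]] * half
    kernel = [1.0 / width] * width  # uniform weights = moving average
    return [
        sum(k * p for k, p in zip(kernel, padded[i:i + width]))
        for i in range(len(probs))
    ]


def smooth_decisions(probs, width=5, threshold=0.5):
    """Smooth sigmoid outputs, then threshold to 0 (non-speech) or 1 (speech)."""
    return [1 if p >= threshold else 0 for p in moving_average(probs, width)]


# Hypothetical sigmoid outputs: one isolated spike at index 3,
# followed by a genuine speech region starting at index 7.
frame_probs = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
raw = [1 if p >= 0.5 else 0 for p in frame_probs]
smoothed = smooth_decisions(frame_probs, width=5)
```

In this toy input the spike at index 3 is labelled speech before smoothing and non-speech afterwards, while the sustained region at the end stays labelled speech. A wider kernel removes longer spikes but also blurs the boundaries of short speech segments.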

De Montfort University Open Research Archive

Last updated on 16/10/2019

This paper was published in De Montfort University Open Research Archive.