A Study of the Automatic Speech Recognition Process and Speaker Adaptation

Stokes-Rees, Ian James

oai:uwspace.uwaterloo.ca:10012/840

Authors: Ian James Stokes-Rees
Publication date: 1 January 2000
Publisher: University of Waterloo

Abstract

This thesis considers the entire automated speech recognition process and presents a standardised approach to large vocabulary continuous speech recognition (LVCSR) experimentation with hidden Markov models (HMMs). It also discusses approaches to speaker adaptation, such as Maximum Likelihood Linear Regression (MLLR) and multiscale adaptation, and presents experimental results for cross-task speaker adaptation. An analysis of training parameters, and of the amount of data needed for reasonable estimates of system performance, is also included. Supervised MLLR adaptation is found to reduce word error rate (WER) by 6% absolute given only one minute of adaptation data, compared with an unadapted model set trained on a different task: the unadapted system performed at 24% WER and the adapted system at 18% WER. This is achieved with only 4 to 7 adaptation classes per speaker, as generated from a regression tree.
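The headline result is easier to compare with other work when the absolute figure is set beside the relative one. The sketch below is illustrative only and is not taken from the thesis: it assumes the usual definition of WER as word-level edit distance divided by reference length, and the function names word_error_rate and wer_reduction, as well as the sample sentences, are hypothetical. Only the 24% and 18% figures come from the abstract.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(baseline, adapted):
    """Return the (absolute, relative) reduction between two WER values."""
    absolute = baseline - adapted
    relative = absolute / baseline
    return absolute, relative


# Figures from the abstract: 24% WER unadapted, 18% WER after MLLR adaptation.
absolute, relative = wer_reduction(0.24, 0.18)
print(f"absolute reduction: {absolute:.0%}, relative reduction: {relative:.0%}")

# Hypothetical sentences: one substitution and one deletion over four words.
example_wer = word_error_rate("a b c d", "a x c")
```

Under these assumptions, the 6% absolute reduction reported in the abstract corresponds to a 25% relative reduction in WER.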

Last time updated on 01/01/2018

This paper was published in University of Waterloo's Institutional Repository.
