Dissertations and Theses

Using Blind Source Separation and a Compact Microphone Array to Improve the Error Rate of Speech Recognition

Author

Jeffrey Dean Hoffman, Portland State University

First Advisor

James McNames

Date of Publication

Fall 12-1-2016

Document Type

Thesis

Degree Name

Master of Science (M.S.) in Electrical and Computer Engineering

Department

Electrical and Computer Engineering

Language

English

Subjects

Automatic speech recognition, Interference (Sound), Blind source separation, Microphone arrays

DOI

10.15760/etd.5258

Physical Description

1 online resource (xii, 139 pages)

Abstract

Automatic speech recognition has become a standard feature on many consumer electronics and automotive products, and the accuracy of the decoded speech has improved dramatically over time. Often, designers of these products achieve accuracy by employing microphone arrays and beamforming algorithms to reduce interference. However, beamforming microphone arrays are too large for small form factor products such as smart watches. Yet these small form factor products, which have precious little space for tactile user input (e.g., knobs, buttons and touch screens), would benefit immensely from a user interface based on reliably accurate automatic speech recognition.

This thesis proposes a solution for interference mitigation that employs blind source separation with a compact array of commercially available unidirectional microphone elements. Such an array provides adequate spatial diversity to enable blind source separation and would easily fit in a smart watch or similar small form factor product. The solution is characterized using publicly available speech audio clips recorded for the purpose of testing automatic speech recognition algorithms. The proposal is modelled in different interference environments and the efficacy of the solution is evaluated. Factors affecting the performance of the solution are identified and their influence quantified. An expectation is presented for the quality of separation as well as the resulting improvement in word error rate that can be achieved from decoding the separated speech estimate versus the mixture obtained from a single unidirectional microphone element. Finally, directions for future work are proposed, which have the potential to improve the performance of the solution, thereby making it a commercially viable product.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

http://archives.pdx.edu/ds/psu/19168

Recommended Citation

Hoffman, Jeffrey Dean, "Using Blind Source Separation and a Compact Microphone Array to Improve the Error Rate of Speech Recognition" (2016). Dissertations and Theses. Paper 3367.
https://doi.org/10.15760/etd.5258

Included in

Electrical and Computer Engineering Commons
