Exploring Novel Robustness Techniques Based on Generative Adversarial Networks
for Speech Recognition

Date

2019

Abstract

Deep learning has recently achieved major breakthroughs in many fields and performs strongly in a wide range of practical applications, including automatic speech recognition (ASR). Although mainstream ASR systems have reached human-level performance on a few benchmark tasks, they are not as robust to environmental distortions as human listeners are: despite substantial improvements, noise still degrades recognition accuracy to some degree. Common sources of interference include background speech, trains, bus stops, car noise, and restaurant babble. Robustness techniques therefore play an important role in the development of current ASR systems. In view of this, this thesis investigates effective enhancement methods, based on generative adversarial networks (GANs), that operate on the modulation spectra of speech feature vector sequences. A series of experiments on the Aurora-4 corpus indicates that the proposed methods can improve speech recognition performance.

Keywords

Automatic Speech Recognition, Robust Speech Recognition, Robustness, Generative Adversarial Networks, Deep Learning, Feature Robustness Techniques, Modulation Spectrum
