Feature normalization based on non-extensive statistics for speech recognition

Hilman F. Pardede; Koji Iwano; Koichi Shinoda

doi:http://dx.doi.org/10.1016/j.specom.2013.02.004

Publication Information

Title

Japanese:
English:	Feature normalization based on non-extensive statistics for speech recognition

Authors

Japanese:	Hilman F. Pardede, 岩野公司, 篠田浩一.
English:	Hilman F. Pardede, Koji Iwano, Koichi Shinoda.

Language

English

Journal/Book Title

Japanese:
English:	Speech Communication

Volume, Number, Pages

vol. 55, pp. 587-599

Publication Date

March 2013

Publisher

Japanese:
English:

Conference Name

Japanese:
English:

Conference Venue

Japanese:
English:

File

DOI

http://dx.doi.org/10.1016/j.specom.2013.02.004

Abstract

Most compensation methods for improving the robustness of speech recognition systems in noisy environments, such as spectral subtraction, CMN, and MVN, rely on the fact that noise and speech spectra are independent. However, the use of a limited window in signal processing may introduce a cross-term between them, which degrades speech recognition accuracy. To tackle this problem, we introduce the q-logarithmic (q-log) spectral domain of non-extensive statistics and propose q-log spectral mean normalization (q-LSMN), which is an extension of log spectral mean normalization (LSMN) to this domain. Recognition experiments on a synthesized noisy speech database, the Aurora-2 database, showed that q-LSMN was consistently better than the conventional normalization methods CMN, LSMN, and MVN. Furthermore, q-LSMN was even more effective when applied to a real noisy environment in the CENSREC-2 database. It significantly outperformed the ETSI AFE front-end. © 2013 Elsevier B.V. All rights reserved.
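
To make the idea of mean normalization in the q-log spectral domain concrete, here is a minimal sketch. It uses the standard Tsallis q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), which reduces to the natural logarithm as q approaches 1. The function names, the array layout (frames by frequency bins), and the value q = 0.9 are assumptions made for illustration. The sketch shows only a per-bin mean subtraction and should not be read as the exact q-LSMN procedure of the paper.

```python
import numpy as np


def q_log(x, q):
    """Tsallis q-logarithm; equals the natural log when q == 1."""
    x = np.asarray(x, dtype=float)
    if np.isclose(q, 1.0):
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def q_log_mean_normalize(power_spec, q):
    """Subtract the per-bin temporal mean in the q-log spectral domain.

    power_spec: array of shape (frames, bins) with strictly positive
    values (hypothetical layout chosen for this sketch).
    """
    z = q_log(power_spec, q)
    return z - z.mean(axis=0, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a power spectrogram: 50 frames, 8 bins.
    spec = rng.uniform(0.1, 10.0, size=(50, 8))
    normalized = q_log_mean_normalize(spec, q=0.9)  # q is an arbitrary choice
    print(normalized.mean(axis=0))  # per-bin means, numerically near zero
```

With q = 1 the same function performs ordinary log spectral mean normalization, which is the sense in which the q-log variant extends LSMN.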
