Publication Information


Title
Japanese: 
English:Feature normalization based on non-extensive statistics for speech recognition 
Author
Japanese: Hilman F. Pardede, 岩野 公司, 篠田 浩一.  
English: Hilman F. Pardede, Koji Iwano, Koichi Shinoda.  
Language English 
Journal/Book name
Japanese: 
English:Speech Communication 
Volume, Number, Page vol. 55, pp. 587-599
Published date Mar. 2013 
DOI http://dx.doi.org/10.1016/j.specom.2013.02.004
Abstract Most compensation methods for improving the robustness of speech recognition systems in noisy environments, such as spectral subtraction, CMN, and MVN, rely on the assumption that noise and speech spectra are independent. However, the use of a limited window in signal processing may introduce a cross-term between them, which degrades speech recognition accuracy. To tackle this problem, we introduce the q-logarithmic (q-log) spectral domain of non-extensive statistics and propose q-log spectral mean normalization (q-LSMN), which is an extension of log spectral mean normalization (LSMN) to this domain. Recognition experiments on a synthesized noisy speech database, the Aurora-2 database, showed that q-LSMN was consistently better than the conventional normalization methods CMN, LSMN, and MVN. Furthermore, q-LSMN was even more effective when applied to a real noisy environment in the CENSREC-2 database, where it significantly outperformed the ETSI AFE front-end. © 2013 Elsevier B.V. All rights reserved.
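The abstract names the q-logarithm and mean normalization in the q-log spectral domain without giving formulas. The following is a minimal sketch, assuming the standard q-logarithm of non-extensive (Tsallis) statistics, ln_q(x) = (x^(1-q) - 1) / (1 - q), which reduces to ln(x) as q approaches 1, and assuming that q-LSMN subtracts the per-channel utterance mean in that domain by analogy with LSMN. The function names `q_log` and `q_lsmn`, the input layout, and the value of q are illustrative assumptions and are not taken from the paper, whose exact formulation may differ.

```python
import math


def q_log(x, q):
    """q-logarithm of Tsallis statistics for x > 0.

    Defined as (x**(1 - q) - 1) / (1 - q); falls back to ln(x) at q == 1.
    """
    if x <= 0:
        raise ValueError("q_log is only defined here for x > 0")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def q_lsmn(frames, q):
    """Mean normalization in the q-log spectral domain (assumed form).

    frames: list of frames, each a list of positive spectral values
            (for example filter-bank energies), one value per channel.
    Returns the q-log features with each channel's mean over the
    utterance subtracted.
    """
    if not frames:
        return []
    n_channels = len(frames[0])
    # Map every spectral value into the q-log domain.
    qlog = [[q_log(v, q) for v in frame] for frame in frames]
    # Per-channel mean over all frames of the utterance.
    means = [sum(f[c] for f in qlog) / len(qlog) for c in range(n_channels)]
    # Subtract the mean channel by channel.
    return [[f[c] - means[c] for c in range(n_channels)] for f in qlog]


if __name__ == "__main__":
    # Two frames, two channels; q = 0.5 is an arbitrary illustrative value.
    utterance = [[1.0, 4.0], [4.0, 16.0]]
    print(q_lsmn(utterance, q=0.5))
```

With q set to 1 the sketch reduces to ordinary log spectral mean normalization, which is the sense in which the abstract describes q-LSMN as an extension of LSMN.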