
Publication Information


Title
Japanese: 
English: A Unified Network for Multi-Speaker Speech Recognition with Multi-Channel Recordings 
Authors
Japanese: Liu Conggui, 井上 中順, 篠田 浩一.  
English: Conggui Liu, Nakamasa Inoue, Koichi Shinoda.  
Language English 
Journal/Book title
Japanese: 
English: Proc. APSIPA 
Volume, Number, Pages pp. 1304-1307
Publication date December 11, 2017 
Publisher
Japanese: 
English: 
Conference name
Japanese: 
English: APSIPA ASC 2017 
Venue
Japanese: 
English: No. 5 Jalan Stesen Sentral, Kuala Lumpur 
File
Official link http://apsipa2017.org/
 
DOI https://doi.org/10.1109/APSIPA.2017.8282233
Abstract Despite recent progress in speech recognition, meeting speech recognition remains a challenging task, since it is often difficult to separate one speaker's voice from the others in a meeting. In this paper, we propose a framework for jointly training speaker separation and speech recognition with multi-channel recordings. The location of each speaker is first estimated and then used to recover his or her original speech with a delay-and-subtraction (DAS) algorithm. The two components, speaker separation and speech recognition, are represented by a single deep network, which is optimized as a whole using training data. We evaluated our method using simulated data generated from the WSJCAM0 database. Compared with independent training of the two components, our proposed method improved word accuracy by 15.2% when the speaker locations were known, and by 53.6% when they were unknown.
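The abstract names a delay-and-subtraction (DAS) step but this record does not give its formulation. As a rough illustration of the general idea only, the sketch below cancels one interfering source in a toy two-microphone mixture. The integer sample delays, the absence of reverberation, and the function names `delay` and `delay_and_subtract` are all assumptions made for this example and are not taken from the paper.

```python
def delay(signal, d):
    """Delay a signal by d >= 0 samples, zero-padding the start and keeping the length."""
    return ([0.0] * d + list(signal))[:len(signal)]


def delay_and_subtract(mic1, mic2, interferer_delay):
    """Cancel a source that reaches mic2 `interferer_delay` samples later than mic1.

    mic1 is delayed so the interferer lines up in both channels, then mic2 is
    subtracted, which places a null on the interferer.
    """
    aligned = delay(mic1, interferer_delay)
    return [a - b for a, b in zip(aligned, mic2)]


# Toy signals (hypothetical values, zero-padded at the end).
target = [1.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
interferer = [0.3, -0.7, 0.2, 0.9, 0.0, 0.0, 0.0, 0.0]

# Assumed geometry: both sources reach mic1 with no delay; at mic2 the target
# arrives 1 sample later and the interferer 2 samples later.
mic1 = [t + i for t, i in zip(target, interferer)]
mic2 = [t + i for t, i in zip(delay(target, 1), delay(interferer, 2))]

output = delay_and_subtract(mic1, mic2, interferer_delay=2)

# What should remain: a difference of two delayed copies of the target only.
expected = [a - b for a, b in zip(delay(target, 2), delay(target, 1))]
```

In this toy setup the interferer is removed, but the target is left as a difference of two delayed copies of itself rather than as the clean signal, so a real system would need further processing. In the paper, the separation step and the recognizer are trained jointly as one network; this sketch does not attempt to show that.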
