Publication Information


Title
Japanese: 
English: SepVAC: Multitask Learning of Speaker Separation, Speaker Localization, Microphone Array Localization, and Room Acoustic Parameter Estimation in Various Acoustic Conditions
Authors
Japanese: HartantoRoland, Sakti, 篠田浩一.
English: Roland Hartanto, Sakti Sakriani, Koichi Shinoda.
Language: English
Journal/Book Title
Japanese: 
English: Interspeech 2025
Volume, Issue, Pages: pp. 2480-2484
Publication Date: August 17, 2025
Publisher
Japanese: 
English: International Speech Communication Association (ISCA)
Conference Name
Japanese: 
English: Interspeech 2025
Location
Japanese: 
English: Rotterdam
Official Link: https://www.isca-archive.org/interspeech_2025/hartanto25_interspeech.html
 
Abstract: This paper proposes a multitask learning method for speech separation that jointly Separates speech and estimates the recording conditions in Various Acoustic Conditions (SepVAC). Unlike previous methods, which aim to achieve robustness against the uncertainty caused by noise and reverberation, this method explicitly estimates the speaker and microphone locations and the room acoustic parameters to disambiguate them from speech features. We introduce curriculum learning to learn the model parameters stably. In our evaluation on the SMS-WSJ-Plus dataset, it outperforms the state-of-the-art SpatialNet baseline by 0.67 points in word error rate (WER).
