Publication Information


Title
Japanese:
English: Speech-linguistic Multimodal Representation for Depression Severity Assessment
Authors
Japanese: Rodrigues Makiuchi Mariana, Warnita Tifani, 宇都 有昭, 篠田 浩一.
English: Rodrigues Makiuchi Mariana, Warnita Tifani, Uto Kuniaki, Shinoda Koichi.
Language: English
Journal/Book Title
Japanese: 情報処理学会研究報告. SLP, 音声言語情報処理
English: IPSJ SIG Technical Report
Volume, Number, Pages: Vol. 2019-SLP-130, No. 8, pp. 1-4
Publication date: December 2019
Publisher
Japanese: 情報処理学会
English: Information Processing Society of Japan
Conference Name
Japanese: 第130回SLP研究発表会 (130th SLP SIG Meeting)
English:
Location
Japanese: 東京都
English: Tokyo
File
Official link: https://ipsj.ixsq.nii.ac.jp/ej/?action=pages_view_main&active_action=repository_view_main_item_detail&item_id=200787&item_no=1&page_id=13&block_id=8
 
Abstract: Depression is a common but serious mental disorder that affects people all over the world. An automatic depression assessment system is therefore in demand to make diagnosis of this disorder more widely available. We propose a multimodal fusion of speech and linguistic representations for depression detection. We train our model to infer the Patient Health Questionnaire (PHQ) score of subjects from the E-DAIC corpus. For the speech modality, we apply VGG-16 extracted features to a Gated Convolutional Neural Network (GCNN) and an LSTM layer. For the linguistic representation, we extract BERT features from transcriptions and input them to a CNN and an LSTM layer. We evaluated the feature fusion model with the Concordance Correlation Coefficient (CCC), achieving a score of 0.696 on the development set and 0.403 on the test set. The inclusion of visual features is also discussed. The results of this work were submitted to the Audio/Visual Emotion Challenge and Workshop (AVEC 2019).
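The abstract reports performance as a Concordance Correlation Coefficient (CCC). For reference, below is a minimal Python sketch of the standard CCC formula, 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2); the function name ccc and the toy score arrays are illustrative and not taken from the paper.

import numpy as np

def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    # Population variance (ddof=0) to match the population covariance below.
    var_true, var_pred = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_true) * (y_pred - mean_pred))
    return 2.0 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)

# Toy usage with hypothetical PHQ-like scores (illustrative values only):
truth = [5, 12, 0, 19, 8]
preds = [6, 10, 2, 17, 9]
print(f"CCC = {ccc(truth, preds):.3f}")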
