Publication Information


Title
Japanese: 
English:Speech-linguistic Multimodal Representation for Depression Severity Assessment 
Author
Japanese: Rodrigues Makiuchi Mariana, Warnita Tifani, 宇都 有昭, 篠田 浩一. 
English: Rodrigues Makiuchi Mariana, Warnita Tifani, Uto Kuniaki, Shinoda Koichi.  
Language English 
Journal/Book name
Japanese:情報処理学会研究報告. SLP, 音声言語情報処理 
English:IPSJ SIG Technical Report 
Volume, Number, Page Vol. 2019-SLP-130, No. 8, pp. 1-4
Published date Dec. 2019 
Publisher
Japanese:情報処理学会 
English:Information Processing Society of Japan 
Conference name
Japanese:第130回SLP研究発表会 
English:The 130th SLP Research Meeting 
Conference site
Japanese:東京都 
English:Tokyo 
Official URL https://ipsj.ixsq.nii.ac.jp/ej/?action=pages_view_main&active_action=repository_view_main_item_detail&item_id=200787&item_no=1&page_id=13&block_id=8
 
Abstract Depression is a common but serious mental disorder that affects people all over the world. An automatic depression assessment system is therefore in demand to make diagnosis of this disorder more widely accessible. We propose a multimodal fusion of speech and linguistic representations for depression detection. We train our model to infer the Patient Health Questionnaire (PHQ) score of subjects from the E-DAIC corpus. For the speech modality, we feed features extracted with VGG-16 to a Gated Convolutional Neural Network (GCNN) followed by an LSTM layer. For the linguistic representation, we extract BERT features from transcriptions and input them to a CNN followed by an LSTM layer. We evaluated the feature fusion model with the Concordance Correlation Coefficient (CCC), achieving a score of 0.696 on the development set and 0.403 on the test set. The inclusion of visual features is also discussed. The results of this work were submitted to the Audio/Visual Emotion Challenge and Workshop (AVEC 2019).
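The scores reported in the abstract use the Concordance Correlation Coefficient. For reference, the following is a minimal NumPy sketch of Lin's CCC, which penalizes both low correlation and systematic offset between predicted and true PHQ scores; the function name `concordance_cc` and the sample values are illustrative, not taken from the paper.

```python
import numpy as np

def concordance_cc(y_true, y_pred):
    """Lin's Concordance Correlation Coefficient:
    ccc = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    """
    x = np.asarray(y_true, dtype=float)
    y = np.asarray(y_pred, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))  # population covariance
    return 2 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

# Perfect agreement yields 1.0; a constant shift lowers the score
# even though the Pearson correlation stays at 1.
print(concordance_cc([0, 5, 10, 15], [0, 5, 10, 15]))  # → 1.0
print(concordance_cc([0, 5, 10, 15], [2, 7, 12, 17]))
```

Unlike Pearson correlation, CCC rewards predictions that match the targets in scale and mean as well as in trend, which is why it is the standard metric for the AVEC depression sub-challenge.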
