Depression is a common but serious mental disorder that affects people all over the world. An automatic depression assessment system is therefore needed to make the diagnosis of this disorder more widely accessible. We propose a multimodal fusion of speech and linguistic representations for depression detection. We train our model to infer the Patient Health Questionnaire (PHQ) score of subjects from the E-DAIC corpus. For the speech modality, we apply features extracted with VGG-16 to a Gated Convolutional Neural Network (GCNN) and an LSTM layer. For the linguistic modality, we extract BERT features from the transcriptions and feed them to a CNN and an LSTM layer. We evaluate the feature-fusion model with the Concordance Correlation Coefficient (CCC), achieving 0.696 on the development set and 0.403 on the test set. The inclusion of visual features is also discussed. The results of this work were submitted to the Audio/Visual Emotion Challenge and Workshop (AVEC 2019).
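For reference, the evaluation metric used here is the standard Concordance Correlation Coefficient between the predicted scores $x$ and the ground-truth PHQ scores $y$:
\[
\mathrm{CCC} = \frac{2\rho\,\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2},
\]
where $\rho$ is the Pearson correlation between $x$ and $y$, and $\mu$ and $\sigma^2$ denote their means and variances; the symbols $x$ and $y$ are introduced here only to state the definition.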