Combining Audio Features and Visual i-vector at MediaEval 2015 Multimodal Person Discovery in Broadcast TV

Fumito Nishi; Nakamasa Inoue; Koichi Shinoda

Publication Information

Title

Japanese:
English:	Combining Audio Features and Visual i-vector at MediaEval 2015 Multimodal Person Discovery in Broadcast TV

Authors

Japanese:	西史人, 井上中順, 篠田浩一.
English:	Fumito Nishi, Nakamasa Inoue, Koichi Shinoda.

Language

English

Journal/Book Title

Japanese:
English:	Proc. MediaEval Workshop

Volume, Number, Pages

Publication Date

September 14, 2015

Publisher

Japanese:
English:

Conference Name

Japanese:
English:	MediaEval 2015

Conference Location

Japanese:	Wurzen
English:	Wurzen

Files

Official Link

http://wwwu.edu.uni-klu.ac.at/miriegle/mediaeval/Paper%2039.pdf

Abstract

This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2015 Benchmark evaluation campaign [1]. The goal of this task is to name, without prior knowledge, the people who are both appearing and speaking at the same time in the video. Our diarization system takes a multimodal approach that combines audio and visual information. We extract features from a face in each shot to build visual i-vectors [2] and introduce them into the provided baseline system. Performance improves when faces are extracted correctly, but no clear improvement was observed on the test run.
