Robust Video Information Retrieval using Speech Technologies

Koichi Shinoda

Publication Information

Title

Japanese:
English:	Robust Video Information Retrieval using Speech Technologies

Author

Japanese:	篠田浩一.
English:	Koichi Shinoda.

Language

English

Journal/Book title

Japanese:
English:

Volume, Issue, Pages

Publication date

June 30, 2015

Publisher

Japanese:
English:

Conference name

Japanese:
English:	Korea University

Location

Japanese:	ソウル
English:	Seoul

Abstract

The amount of video data on the Internet has been increasing rapidly. These videos vary widely and are in most cases of low quality, so robust techniques for video indexing are in strong demand. In automatic video semantic indexing, a user submits a textual query for a desired object or scene to a search system, which returns video shots that include that object or scene. Many techniques developed in speech research have been successfully employed in this application. For example, a method using Gaussian mixture model (GMM) supervectors and support vector machines (SVMs) was recently shown to be very effective. In this method, speech technologies such as speaker verification and speaker adaptation techniques play very important roles. In this lecture, we first introduce the activities of the NIST TRECVID workshop, which showcases state-of-the-art video search technologies, and then discuss several techniques for achieving robustness against the variety of Internet video: SIFT and HOG features, Bag of Visual Words, the Fisher kernel, a multi-modal framework, and fast tree search.
