n-Gram Models for Video Semantic Indexing

Nakamasa Inoue; Koichi Shinoda

doi:10.1145/2647868.2654961

Publication Information

Title

Japanese:
English:	n-Gram Models for Video Semantic Indexing

Author

Japanese:	井上中順, 篠田浩一.
English:	Nakamasa Inoue, Koichi Shinoda.

Language

English

Journal/Book name

Japanese:
English:	Proc. ACM Multimedia (MM)

Volume, Number, Page

pp. 777-780

Published date

Nov. 3, 2014

Publisher

Japanese:
English:	ACM

Conference name

Japanese:
English:	ACM Multimedia

Conference site

Japanese:
English:	Orlando, FL

Official URL

http://acmmm.org/2014/

DOI

https://doi.org/10.1145/2647868.2654961

Abstract

We propose n-gram modeling of shot sequences for video semantic indexing, the task of extracting semantic concepts from video shots. Most previous studies on this task have assumed that the video shots in a video clip are independent of each other. We instead model the time dependency between shots by assuming that n consecutive video shots are dependent. Our models improve robustness against occlusion and camera-angle changes by effectively using information from the previous video shots. In our experiments on the TRECVID 2012 Semantic Indexing benchmark, we applied the proposed models to a system based on Gaussian mixture models and support vector machines. Mean average precision improved from 30.62% to 32.14%, which is, to the best of our knowledge, the best performance reported on the TRECVID 2012 Semantic Indexing benchmark.
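
This record does not describe the paper's actual models, so the following is only a rough illustration of the general idea in the abstract: letting a shot's concept score draw on the previous n-1 shots in the same clip. The function name `ngram_smooth`, the geometric `decay` weighting, and the sample scores are all assumptions made for this sketch and are not taken from the paper.

```python
from typing import List, Sequence


def ngram_smooth(scores: Sequence[float], n: int = 3, decay: float = 0.5) -> List[float]:
    """Combine each shot's score with the scores of the previous n-1 shots.

    Illustrative only: weights decay geometrically with distance from the
    current shot and are normalized to sum to 1. This is an assumed scheme,
    not the formulation used in the paper.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    smoothed = []
    for t in range(len(scores)):
        # The window covers shots t-(n-1) .. t, truncated at the clip start.
        start = max(0, t - (n - 1))
        window = scores[start:t + 1]
        # The shot k positions before t receives weight decay**k.
        weights = [decay ** (t - i) for i in range(start, t + 1)]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, window)) / total)
    return smoothed


# Hypothetical per-shot detector scores for one concept; the third shot
# stands in for an occluded view with a spuriously low score.
shot_scores = [0.9, 0.8, 0.1, 0.85]
print(ngram_smooth(shot_scores, n=1))  # n=1 treats shots as independent
print(ngram_smooth(shot_scores, n=3))  # n=3 uses the two previous shots
```

With n=1 the scores are returned unchanged, which corresponds to the independence assumption the abstract attributes to earlier work. With n=3 the low score of the third shot is pulled up by its neighbors, which is the kind of robustness to occlusion the abstract refers to.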
