Publication Information

Title
English: n-Gram Models for Video Semantic Indexing
Authors
Japanese: 井上 中順, 篠田 浩一
English: Nakamasa Inoue, Koichi Shinoda
Language English 
Journal/Book name
English:Proc. ACM Multimedia (MM) 
Volume, Number, Page pp. 777-780
Published date Nov. 3, 2014 
Conference name
English:ACM Multimedia 
Conference site
English:Orlando, FL
Official URL http://acmmm.org/2014/
DOI https://doi.org/10.1145/2647868.2654961
Abstract We propose n-gram modeling of shot sequences for video semantic indexing, a task in which semantic concepts are extracted from each video shot. Most previous studies on this task have assumed that the shots in a video clip are independent of each other. We instead model the time dependency between shots by assuming that n consecutive shots are dependent. Our models improve robustness against occlusion and camera-angle changes by effectively using information from the preceding shots. In our experiments on the TRECVID 2012 Semantic Indexing benchmark, we applied the proposed models to a system that uses Gaussian mixture models and support vector machines. Mean average precision improved from 30.62% to 32.14%, which is, to the best of our knowledge, the best performance reported on TRECVID 2012 Semantic Indexing.
