In this paper, we propose the velocity pyramid for multimedia
event detection. Spatial pyramid matching was recently proposed to
introduce coarse geometric information into the Bag of Features framework,
and it is effective for static image recognition and detection. In video, not
only spatial information but also temporal information, which
captures the dynamic nature of the content, is important. To fully exploit it, we
propose the velocity pyramid, in which video frames are divided into motion-based
sub-regions. Our method is effective for detecting events characterized
by their temporal patterns. Experiments on the MED (Multimedia
Event Detection) dataset show a 10% improvement in performance
with the velocity pyramid over the same system without it. Furthermore, when combined
with the spatial pyramid, the velocity pyramid provides an additional 3% gain in
the detection result.