
Publication Information


Title
Japanese: 
English: Pixel-level Contrastive Learning of Driving Videos with Optical Flow 
Author
Japanese: 高橋 那弥, Shingo Yashima, Kohta Ishikawa, 佐藤 育郎, 横田 理央.  
English: Tomoya Takahashi, Shingo Yashima, Kohta Ishikawa, Ikuro Sato, Rio Yokota.  
Language English 
Journal/Book name
Japanese: 
English:Proc. CVPR workshop 2023 
Volume, Number, Page        
Published date June 2023 
Publisher
Japanese: 
English:IEEE 
Conference name
Japanese: 
English:CVPR workshop 2023 
Conference site
Japanese:バンクーバー 
English:Vancouver 
Abstract Recognition of the external environment through cameras and LIDAR plays a central role in the safety of autonomous driving. The accuracy of such recognition has drastically improved with the advent of deep learning, but is still insufficient for fully autonomous driving. Even though it is possible to collect large amounts of driving data [1, 11, 14, 18], the cost to annotate such data is prohibitive. Recent efforts have focused on self-supervised learning, which does not require annotated data. In this work, we improve the accuracy of self-supervised learning on driving data by combining pixel-wise contrastive learning (PixPro) with optical flow. Unlike most self-supervised methods, PixPro is trained on pixel-level pretext tasks, which yields better accuracy on downstream tasks requiring dense pixel predictions. However, PixPro does not account for the large changes in object scale commonly found in driving data. We show that by incorporating optical flow into the pixel-wise contrastive pre-training, we can improve the performance of downstream tasks such as semantic segmentation on Cityscapes. We found that using the optical flow between temporally distant frames helps learn invariance to large scale changes, which allows us to exceed the performance of the original PixPro method. Our code will be released upon acceptance.
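The core idea in the abstract, using optical flow to identify which pixels in two video frames correspond, and then pulling the features of corresponding pixels together, can be illustrated with a minimal NumPy sketch. This is not the authors' released code; the function names, the integer flow convention, and the simple cosine-similarity positive-pair term (in the spirit of PixPro's pixel consistency loss) are all assumptions made for illustration.

```python
import numpy as np

def warp_with_flow(feat, flow):
    """Warp a (H, W, C) feature map by an integer optical-flow field (H, W, 2).

    flow[y, x] = (dy, dx) is the displacement from frame 1 to frame 2, so the
    feature at (y, x) in frame 1 corresponds to (y+dy, x+dx) in frame 2.
    Out-of-bounds targets are clipped to the border (a simplification).
    """
    H, W, _ = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ty = np.clip(ys + flow[..., 0], 0, H - 1)
    tx = np.clip(xs + flow[..., 1], 0, W - 1)
    return feat[ty, tx]

def pixel_contrastive_loss(feat1, feat2, flow):
    """Negative mean cosine similarity between frame-1 features and the
    flow-aligned frame-2 features: a PixPro-style positive-pair term where
    optical flow, rather than spatial proximity in one image, defines which
    pixel pairs are positives."""
    aligned = warp_with_flow(feat2, flow)
    a = feat1 / (np.linalg.norm(feat1, axis=-1, keepdims=True) + 1e-8)
    b = aligned / (np.linalg.norm(aligned, axis=-1, keepdims=True) + 1e-8)
    return -np.mean(np.sum(a * b, axis=-1))
```

Perfectly aligned features give a loss of -1; a correct flow field yields a lower loss than ignoring the motion, which is what makes the flow-defined correspondences useful as a pre-training signal.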

©2007 Tokyo Institute of Technology All rights reserved.