Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

STCN

We present Space-Time Correspondence Networks (STCN) as a new, effective, and efficient framework to model space-time correspondences in the context of video object segmentation. STCN achieves SOTA results on multiple benchmarks while running fast at 20+ FPS without bells and whistles, and even faster with mixed precision. Despite its effectiveness, the network itself is very simple, with lots of room for improvement. See the paper for technical details.
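For instance, inference can be wrapped in PyTorch's automatic mixed precision. This is a minimal sketch, with a placeholder module standing in for the real network (the repo has its own evaluation scripts):

```python
import torch
import torch.nn as nn

# Placeholder module standing in for the segmentation network (illustrative only).
model = nn.Conv2d(3, 1, kernel_size=3, padding=1).cuda().eval()
frame = torch.randn(1, 3, 480, 854, device='cuda')  # one 480p video frame

# autocast runs conv/matmul-heavy ops in float16, which is where the extra speed comes from.
with torch.no_grad(), torch.cuda.amp.autocast():
    prediction = model(frame)
```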

https://imgur.com/SIFq5c1.gif

https://imgur.com/nHvWuzi.gif

A Gentle Introduction

(Figure: overview of the STCN framework)

There are two main contributions: the STCN framework (figure above), and L2 similarity. We build affinity between images instead of between (image, mask) pairs; this leads to a significant speedup, memory savings (because we compute one affinity matrix instead of multiple), and robustness. We also compute this affinity with negative L2 similarity in place of the dot product, which improves memory coverage: every memory node is used more evenly.
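To make this concrete, here is a minimal sketch of the two ideas (shapes and names are illustrative, not the repo's exact code): a single affinity matrix computed from image-only keys with negative squared L2 distance, then reused to read out the memory values of every object.

```python
import torch

def affinity(mk: torch.Tensor, qk: torch.Tensor) -> torch.Tensor:
    """Affinity between memory keys and query keys (built from images only).

    mk: memory keys, shape (B, C, T*H*W), from past frames
    qk: query keys,  shape (B, C, H*W),   from the current frame
    Returns an affinity of shape (B, T*H*W, H*W), softmax-normalized
    over the memory positions.
    """
    # Negative squared L2 distance, expanded as 2*mk.qk - |mk|^2 - |qk|^2.
    # The |qk|^2 term is constant for each query position, so the softmax
    # over memory positions is unaffected and it can be dropped.
    mk_sq = mk.pow(2).sum(dim=1).unsqueeze(2)   # (B, T*H*W, 1)
    dot = mk.transpose(1, 2) @ qk               # (B, T*H*W, H*W)
    return torch.softmax(2 * dot - mk_sq, dim=1)

def readout(A: torch.Tensor, mv: torch.Tensor) -> torch.Tensor:
    """Read memory values with a precomputed affinity.

    mv: memory values for one object, shape (B, C_v, T*H*W).
    Since the keys (and hence A) depend only on images, the same A is
    reused for every object; only mv changes per object.
    """
    return torch.bmm(mv, A)                     # (B, C_v, H*W)
```

With (image, mask) keys, the affinity would have to be recomputed for each object; here it is computed once per frame, which is where the speed and memory savings come from.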