Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation
This repo contains the code and configuration files to reproduce the semantic segmentation results of TransDA.
Paper
Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation
Abstract Following their great success in computer vision, Vision Transformer variants (ViTs) have also demonstrated great potential in domain adaptive semantic segmentation. Unfortunately, straightforwardly applying local ViTs in domain adaptive semantic segmentation does not bring the expected improvement. We find that the pitfall of local ViTs is due to the severe high-frequency components generated during both pseudo-label construction and feature alignment for target domains. These high-frequency components make the training of local ViTs very unsmooth and hurt their transferability. In this paper, we introduce a low-pass filtering mechanism, the momentum network, to smooth the learning