GIFT: Learning Transformation-Invariant
Dense Visual Descriptors via Group CNNs

NeurIPS 2019
Yuan Liu Zehong Shen Zhixuan Lin Sida Peng Hujun Bao* Xiaowei Zhou*
State Key Lab of CAD&CG, ZJU-SenseTime Joint Lab of 3D Vision, Zhejiang University
* Corresponding author

Abstract

Finding correspondences between images with large viewpoint changes requires local descriptors that are robust to geometric transformations. A common approach to transformation invariance is to integrate out the transformations by pooling features extracted from transformed versions of the original images. However, such pooling may sacrifice the distinctiveness of the resulting descriptors. In this paper, we introduce a novel visual descriptor named Group Invariant Feature Transform (GIFT), which is both discriminative and robust to geometric transformations. The key idea is that the features extracted from the transformed images can be viewed as a function defined on the group of transformations. Instead of pooling, we use group convolutions to exploit the underlying structure of these features on the group, resulting in descriptors that are both discriminative and provably invariant to the group of transformations. Extensive experiments show that GIFT outperforms state-of-the-art methods on the HPSequence dataset, the SUN3D dataset, and new datasets with large scale and orientation variations.
[Paper]    [Supplementary Material]    [Code]    [Poster]   
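To make the key idea above concrete, the following is a minimal sketch in PyTorch, assuming a scale-and-rotation group sampled on a regular grid: per-keypoint features gathered from transformed copies of an image are treated as a function on the group and processed with small convolutions over the group axes before an invariant pooling step. All module, variable, and parameter names are illustrative assumptions and do not reproduce the paper's actual architecture.

# A hypothetical sketch (not the authors' implementation) of the idea described
# in the abstract: features from scale/rotation-transformed images form a
# function on the group grid; convolving over the group axes (instead of only
# pooling) keeps the descriptor discriminative, and pooling afterwards makes it
# invariant to the sampled transformations.
import torch
import torch.nn as nn

class GroupDescriptorSketch(nn.Module):
    def __init__(self, feat_dim=32, out_dim=128):
        super().__init__()
        # Convolutions over the (scale, rotation) group grid. Circular padding
        # matches the cyclic rotation axis; for simplicity it is applied to the
        # (non-cyclic) scale axis as well, which is only an approximation.
        self.group_conv = nn.Sequential(
            nn.Conv2d(feat_dim, 64, kernel_size=3, padding=1, padding_mode="circular"),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, out_dim, kernel_size=3, padding=1, padding_mode="circular"),
            nn.ReLU(inplace=True),
        )

    def forward(self, group_feats):
        # group_feats: (N, n_scales, n_rots, feat_dim) -- one feature vector per
        # keypoint sampled from each transformed copy of the image.
        x = group_feats.permute(0, 3, 1, 2)   # (N, feat_dim, S, R)
        x = self.group_conv(x)                # (N, out_dim, S, R)
        # Averaging over the group axes after the group convolutions yields a
        # descriptor that is (up to boundary effects on the scale axis)
        # invariant to shifts on the group, i.e. to the sampled scale and
        # rotation changes, while the convolutions preserve the relative
        # structure of the group-indexed features.
        desc = x.mean(dim=(2, 3))             # (N, out_dim)
        return nn.functional.normalize(desc, dim=1)

# Usage: 100 keypoints, features sampled on a 5 x 8 scale/rotation grid.
feats = torch.randn(100, 5, 8, 32)
model = GroupDescriptorSketch()
print(model(feats).shape)  # torch.Size([100, 128])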

Sparse correspondence estimation (figure)

Dense correspondence estimation (figure)