SAM-guided Graph Cut for 3D Instance Segmentation

ECCV 2024


Haoyu Guo1*, He Zhu2*^, Sida Peng1, Yuang Wang1, Yujun Shen3, Ruizhen Hu4†, Xiaowei Zhou1†

1Zhejiang University    2Beijing Normal University    3Ant Group    4Shenzhen University
* Equal contribution.
^ Work done during internship at Zhejiang University.
† Corresponding authors.

Abstract


This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information. Many previous works have applied deep learning techniques to 3D point clouds for instance segmentation. However, these methods often fail to generalize to various types of scenes due to the scarcity and low diversity of labeled 3D point cloud data. Some recent works have attempted to lift 2D instance segmentations to 3D within a bottom-up framework, but the inconsistency of 2D instance segmentations across views can substantially degrade the performance of 3D segmentation. In this work, we introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation. Specifically, we pre-segment the scene into several superpoints in 3D and formulate the task as a graph cut problem. The superpoint graph is constructed based on 2D segmentation models: node features are obtained from multi-view image features, and edge weights are computed from multi-view segmentation results, which enables better generalization. To process the graph, we train a graph neural network using pseudo 3D labels from 2D segmentation models. Experimental results on the ScanNet, ScanNet++ and KITTI-360 datasets demonstrate that our method achieves robust segmentation performance and can generalize across different types of scenes.
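
To make the graph construction more concrete, the following is a minimal sketch of how edge weights between neighboring superpoints could be derived from multi-view 2D segmentation results. It is not the authors' implementation: the function names, the input layout (a per-point superpoint id and, for each view, a per-point 2D mask id with -1 marking points that are invisible or unlabeled in that view), and the agreement-ratio affinity are all assumptions chosen for illustration. The exact edge weight formulation is not given on this page.

```python
import numpy as np

def dominant_mask_per_superpoint(superpoint_ids, view_mask_ids, num_superpoints):
    """For one view, return the most frequent 2D mask id among the visible
    points of each superpoint (-1 if the superpoint is not visible)."""
    dominant = np.full(num_superpoints, -1, dtype=np.int64)
    for sp in range(num_superpoints):
        ids = view_mask_ids[superpoint_ids == sp]
        ids = ids[ids >= 0]  # drop invisible / unlabeled points
        if ids.size > 0:
            values, counts = np.unique(ids, return_counts=True)
            dominant[sp] = values[np.argmax(counts)]
    return dominant

def edge_weights(superpoint_ids, mask_ids_per_view, edges):
    """Hypothetical affinity: the fraction of views, among those where both
    superpoints are visible, in which they fall into the same 2D mask."""
    num_superpoints = int(superpoint_ids.max()) + 1
    # Shape (num_views, num_superpoints)
    dominant = np.stack([
        dominant_mask_per_superpoint(superpoint_ids, m, num_superpoints)
        for m in mask_ids_per_view
    ])
    weights = np.zeros(len(edges))
    for k, (i, j) in enumerate(edges):
        both_visible = (dominant[:, i] >= 0) & (dominant[:, j] >= 0)
        if both_visible.any():
            agree = dominant[both_visible, i] == dominant[both_visible, j]
            weights[k] = agree.mean()
    return weights

# Toy example: 6 points grouped into 3 superpoints, observed in 3 views.
superpoint_ids = np.array([0, 0, 1, 1, 2, 2])
mask_ids_per_view = np.array([
    [5, 5, 5, 5, 7, 7],      # view 0: superpoints 0 and 1 share a mask
    [3, 3, 3, -1, -1, -1],   # view 1: superpoint 2 is not visible
    [1, 1, 2, 2, 2, 2],      # view 2: superpoints 1 and 2 share a mask
])
edges = [(0, 1), (1, 2)]
weights = edge_weights(superpoint_ids, mask_ids_per_view, edges)
print(weights)
```

In this toy setup, a high weight means the two superpoints are usually covered by the same 2D mask and are likely to belong to the same instance. A graph neural network or a graph cut solver would then operate on these weights together with the node features.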


Overview video



Comparisons


[Video comparisons on ScanNet, ScanNet++ and KITTI-360: Ours vs. Mask3D [Schult 2023]; Ours vs. SAM3D [Yang 2023]]

Segmentation results showcase





Citation


@inproceedings{guo2024sam-graph,
  title={SAM-guided Graph Cut for 3D Instance Segmentation},
  author={Guo, Haoyu and Zhu, He and Peng, Sida and Wang, Yuang and Shen, Yujun and Hu, Ruizhen and Zhou, Xiaowei},
  booktitle={ECCV},
  year={2024}
}