State Key Lab of CAD & CG, Zhejiang University
ZJU-SenseTime Joint Lab of 3D Vision, Zhejiang University
* denotes equal contributions
Intrinsic image decomposition, i.e., decomposing a natural image into a reflectance image and a shading image, is used in many augmented reality applications for achieving better visual coherence between virtual content and real scenes. The main challenge is that the decomposition is ill-posed, especially in indoor scenes where lighting conditions are complicated, while real training data is inadequate. To address this challenge, we propose NIID-Net, a novel learning-based framework that adapts surface normal knowledge for improving the decomposition. The knowledge learned from relatively more abundant data for surface normal estimation is integrated into intrinsic image decomposition in two novel ways. First, normal feature adapters are proposed to incorporate scene geometry features when decomposing the image. Second, a map of integrated lighting is proposed for propagating object contour and planarity information during shading rendering. Furthermore, this map is capable of representing spatially-varying lighting conditions indoors. Experiments show that NIID-Net achieves competitive performance in reflectance estimation and outperforms all previous methods in shading estimation quantitatively and qualitatively.
We insert virtual posters into real scenes by editing the reflectance layers while keeping the shading layers unchanged, which yields photorealistic image editing. This application is suitable for augmented reality uses such as advertising and scene refurnishing. The entire video is available on YouTube.
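The poster insertion described above can be illustrated with a minimal sketch. It assumes the common multiplicative intrinsic model (image = reflectance × shading, per pixel), which this page does not state explicitly, and it pastes an axis-aligned rectangle rather than warping the poster onto a scene plane. The function name `insert_poster` and all array values are hypothetical and are not taken from the NIID-Net code.

```python
import numpy as np


def insert_poster(reflectance, shading, poster, top, left):
    """Paste `poster` into the reflectance layer and re-render with the original shading.

    reflectance: (H, W, 3) float array in [0, 1], predicted reflectance
    shading:     (H, W) float array, predicted gray-scale shading intensity
    poster:      (h, w, 3) float array in [0, 1], the virtual content
    top, left:   upper-left corner of the paste region (hypothetical parameters)
    """
    h, w, _ = poster.shape
    edited = reflectance.copy()                # leave the predicted layer untouched
    edited[top:top + h, left:left + w] = poster
    # Assumed image formation model: image = reflectance * shading.
    # The shading layer is reused as-is, so the poster inherits the scene lighting.
    return np.clip(edited * shading[..., None], 0.0, 1.0)


# Toy data standing in for network outputs.
reflectance = np.full((4, 4, 3), 0.5)
shading = np.linspace(0.2, 1.0, 16).reshape(4, 4)
poster = np.zeros((2, 2, 3))
poster[..., 0] = 1.0                           # a pure red poster

composite = insert_poster(reflectance, shading, poster, top=1, left=1)
```

Because only the reflectance layer is edited, pixels inside the pasted region are still modulated by the original shading, which is what keeps the inserted content consistent with the scene lighting under this assumed model.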
Given a single sRGB input image, the proposed NIID-Net predicts a color reflectance image and a gray-scale shading intensity image. NIID-Net contains a NEM (blue rectangle) and an IID-Net (orange rectangle). The IID-Net integrates surface normal knowledge via the normal feature adapters (NFAs) and shading rendering.
The first, second, and third rows show estimated shading images, predicted or ground-truth normal maps, and estimated reflectance images, respectively. Our method and GLoSH (SUNCG+IIW+SAW) predict surface normals with deep neural networks, while Chen and Koltun compute surface normals from ground-truth depth. Geometry contours are sharpest in our predicted shading images. Blue rectangles: our method removes the most texture from the predicted shading. Green rectangles: our method recovers the highlights best. Orange rectangles: the intensity of the shading predicted by Chen and Koltun is strongly affected by that of the input image, while the intensity of our predictions is more coherent within each neighborhood. Our reflectance images are also better than those of Chen and Koltun, in which many shading variations are shifted into the reflectance.
We compare our results with those of Li and Snavely (CGI+IIW+SAW) and GLoSH (SUNCG+IIW+SAW). For each sample, the first row shows predicted shading images, and the second row shows predicted reflectance images. Blue rectangles: our shading results have the fewest texture residuals. Orange rectangles: our method best captures the shading effects. Green rectangles: our method predicts the most detailed reflectance as well as the smoothest shading. More results are presented in the supplementary material.
@article{luo2020niid,
title={NIID-Net: Adapting Surface Normal Knowledge for Intrinsic Image Decomposition in Indoor Scenes},
author={Luo, Jundan and Huang, Zhaoyang and Li, Yijin and Zhou, Xiaowei and Zhang, Guofeng and Bao, Hujun},
journal={IEEE Transactions on Visualization and Computer Graphics},
volume={26},
number={12},
pages={3434--3445},
year={2020},
publisher={IEEE}
}