Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes

SIGGRAPH Asia 2023 (Conference Track)


Haotong Lin, Sida Peng*, Zhen Xu, Tao Xie, Xingyi He, Hujun Bao, Xiaowei Zhou

State Key Lab of CAD & CG, Zhejiang University
* denotes corresponding author

Abstract


The following video shows our rendering results on the DNA-Rendering dataset.


This paper tackles the challenge of dynamic view synthesis from multi-view videos. The key observation is that while previous grid-based methods offer consistent rendering, they fall short in capturing the appearance details of a complex dynamic scene, a domain where multi-view image-based rendering methods exhibit the opposite properties. To combine the best of both worlds, we introduce Im4D, a hybrid scene representation that consists of a grid-based geometry representation and a multi-view image-based appearance representation. Specifically, the dynamic geometry is encoded as a 4D density function composed of spatiotemporal feature planes and a small MLP network, which globally models the scene structure and facilitates rendering consistency. We represent the scene appearance by the original multi-view videos together with a network that learns to predict the color of a 3D point from image features; rather than memorizing detailed appearance entirely in network weights, this naturally makes the networks easier to learn. Our method is evaluated on five dynamic view synthesis datasets: DyNeRF, ZJU-MoCap, NHR, DNA-Rendering, and ENeRF-Outdoor. The results show that Im4D achieves state-of-the-art rendering quality, can be trained efficiently, and renders in real time at 79.8 FPS for 512x512 images on a single RTX 3090 GPU.
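To make the hybrid representation above concrete, below is a minimal, hypothetical PyTorch sketch of its two components: spatiotemporal feature planes fused by a small MLP into density, and an appearance network that predicts color from features sampled in the source views. All class and variable names, dimensions, and the simple mean aggregation of per-view features are illustrative assumptions for exposition, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class Im4DStyleField(nn.Module):
    """Sketch: grid-based dynamic geometry plus image-based appearance."""

    def __init__(self, feat_dim=16, res=64, n_frames=32, img_feat_dim=32):
        super().__init__()
        # Six spatiotemporal feature planes (xy, xz, yz, xt, yt, zt),
        # in the spirit of K-Planes-style factorizations.
        sizes = [(res, res)] * 3 + [(res, n_frames)] * 3
        self.planes = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(1, feat_dim, h, w)) for h, w in sizes]
        )
        # Small MLP that maps fused plane features to volume density.
        self.density_mlp = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )
        # Appearance MLP that turns aggregated per-view image features into RGB.
        self.color_mlp = nn.Sequential(
            nn.Linear(img_feat_dim, 64), nn.ReLU(), nn.Linear(64, 3), nn.Sigmoid()
        )

    @staticmethod
    def _sample_plane(plane, uv):
        # plane: (1, C, H, W); uv: (N, 2) in [-1, 1]  ->  (N, C) features.
        grid = uv.view(1, -1, 1, 2)
        feat = F.grid_sample(plane, grid, align_corners=True)
        return feat.view(plane.shape[1], -1).t()

    def density(self, xyzt):
        # xyzt: (N, 4) points, coordinates already normalized to [-1, 1].
        x, y, z, t = xyzt.unbind(-1)
        pairs = [(x, y), (x, z), (y, z), (x, t), (y, t), (z, t)]
        feat = torch.ones(xyzt.shape[0], 1, device=xyzt.device)
        for plane, (u, v) in zip(self.planes, pairs):
            feat = feat * self._sample_plane(plane, torch.stack([u, v], dim=-1))
        return self.density_mlp(feat)  # (N, 1) raw density

    def color(self, img_feats):
        # img_feats: (N, V, C) features sampled from V source views at each
        # point's image projection; here simply averaged before the MLP.
        return self.color_mlp(img_feats.mean(dim=1))  # (N, 3) RGB in [0, 1]


# Toy usage: random points and dummy image features from four source views.
field = Im4DStyleField()
pts = torch.rand(1024, 4) * 2 - 1                # normalized (x, y, z, t)
sigma = field.density(pts)                       # (1024, 1) densities
rgb = field.color(torch.randn(1024, 4, 32))      # (1024, 3) colors

In the actual method, the per-view image features would come from projecting each 3D point into the source videos and sampling learned feature maps, and the predicted density and color would be composited along camera rays with standard volume rendering.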

Overview Video



More rendering results



Video results on the DNA-Rendering [Cheng et al., 2023] dataset.


Video results on the ENeRF-Outdoor [Lin et al., 2022] and DyNeRF [Li et al., 2022] datasets.


Video results on the NHR [Wu et al., 2020] and ZJU-MoCap [Peng et al., 2021] datasets.


Comparisons with ENeRF, IBRNet and K-Planes


Citation


@inproceedings{lin2023im4d,
  title={High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes},
  author={Lin, Haotong and Peng, Sida and Xu, Zhen and Xie, Tao and He, Xingyi and Bao, Hujun and Zhou, Xiaowei},
  booktitle={SIGGRAPH Asia Conference Proceedings},
  year={2023}
}