Factorized and Controllable Neural Re-Rendering of Outdoor Scene
for Photo Extrapolation

ACM Multimedia 2022 (Oral)

Boming Zhao1*, Bangbang Yang1*, Zhenyang Li2, Zuoyue Li3, Guofeng Zhang1, Jiashu Zhao4, Dawei Yin2, Zhaopeng Cui1, Hujun Bao1
* denotes equal contribution

1State Key Lab of CAD & CG, Zhejiang University    2Baidu Inc    3ETH Zürich    4Wilfrid Laurier University


Expanding an existing tourist photo from a partially captured scene to a full scene is one of the desired experiences for photography applications. Although photo extrapolation has been well studied, it is much more challenging to extrapolate a photo (e.g., a selfie) from a narrow field of view to a wider one while maintaining a similar visual style. In this paper, we propose a factorized neural re-rendering model to produce photorealistic novel views from cluttered outdoor Internet photo collections, which enables applications including controllable scene re-rendering, photo extrapolation, and even extrapolated 3D photo generation. Specifically, we first develop a novel factorized re-rendering pipeline to handle the ambiguity in the decomposition of geometry, appearance, and illumination. We also propose a composited training strategy to tackle unexpected occlusions in Internet images. Moreover, to enhance photo-realism when extrapolating tourist photographs, we propose a novel realism augmentation process to complement appearance details, which automatically propagates texture details from a narrowly captured photo to the extrapolated neural rendered image. Experiments and photo editing examples on outdoor scenes demonstrate the superior performance of our proposed method in both photo-realism and downstream applications.

Overview Video

Our Method


Given an Internet photo collection of an outdoor attraction, we learn a novel neural re-rendering model that encodes the scene with several factorized components, which enables applications such as controllable scene re-rendering, photo extrapolation, and extrapolated 3D photo generation. All the images are from the IMC-PT dataset. Photos by Flickr users astrobri, soniadal82, Fotero, scriptingnews, Devin Ford, and MikiAnn.

Applicability: Scene Touring with Mesh Visualization

We show scene touring on two outdoor attractions, and also visualize the learned scene mesh. Our model successfully learns detailed shapes of buildings and sculptures, which ensures photorealistic scene re-rendering with external lighting.

Applicability: Illumination Adaptation & Controlling

We show the network's ability to adapt to and control illumination. Given only a captured reference photo or a user-selected HDR map, our model can render novel views of the scene under the target illumination condition.

Applicability: Extrapolated 3D Photo Generation

Given a travel photo, our method first segments the foreground person's view and generates the extrapolated background view. By blending these two views, we obtain an extrapolated 3D photo with a vivid camera-motion effect. In comparison, standard 3D photo inpainting methods are not aware of the scene structure, so their results are bounded by the visible area of the given photo.
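The blending step can be illustrated with standard per-pixel alpha compositing. This is only a minimal sketch under stated assumptions, not the implementation used in the paper: the function name `composite`, the list-of-rows image layout, and the use of a soft segmentation mask as the alpha channel are all hypothetical choices made for illustration, and the page does not specify which blending operator is actually used.

```python
from typing import List, Tuple

Pixel = Tuple[float, ...]      # e.g. (r, g, b), each channel in [0, 1]
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[float]]       # per-pixel foreground weight in [0, 1]


def composite(foreground: Image, alpha: Mask, background: Image) -> Image:
    """Alpha-blend a segmented foreground over a background view.

    Hypothetical helper: `alpha` stands in for a person segmentation mask
    (1.0 = foreground person, 0.0 = extrapolated background).
    """
    if not (len(foreground) == len(alpha) == len(background)):
        raise ValueError("inputs must have the same height")

    blended: Image = []
    for fg_row, a_row, bg_row in zip(foreground, alpha, background):
        if not (len(fg_row) == len(a_row) == len(bg_row)):
            raise ValueError("inputs must have the same width")
        # out = alpha * foreground + (1 - alpha) * background, per channel
        blended.append([
            tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg))
            for fg, a, bg in zip(fg_row, a_row, bg_row)
        ])
    return blended


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)    # stand-in for the foreground person
    blue = (0.0, 0.0, 1.0)   # stand-in for the rendered background
    frame = composite([[red, red, red]], [[1.0, 0.5, 0.0]], [[blue, blue, blue]])
    print(frame)
```

Under these assumptions, a moving-camera effect would come from repeating this blend for each frame, with the background re-rendered from a new viewpoint and the foreground layer shifted accordingly; that per-frame rendering is outside the scope of this sketch.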

Comparison of Photo Extrapolation


We compare photo extrapolation with Auto-Stitch, PixelSynth and NeRF-W on four outdoor scenes. Photos by Flickr users Hugão Cota, Legalv1, Foster's Lightroom, and stobor.


    title={Factorized and Controllable Neural Re-Rendering of Outdoor Scene for Photo Extrapolation},
    author={{Boming Zhao and Bangbang Yang} and Li, Zhenyang and Li, Zuoyue and Zhang, Guofeng and Zhao, Jiashu and Yin, Dawei and Cui, Zhaopeng and Bao, Hujun},
    booktitle={Proceedings of the 30th ACM International Conference on Multimedia},