GeneAvatar:
Generic Expression-Aware Volumetric Head Avatar
Editing from a Single Image

CVPR 2024


1State Key Lab of CAD & CG, Zhejiang University, 2Google, 3ETH Zürich, 4ByteDance
*Equal contribution.

Corresponding author.



TL;DR: We learn a modification generative model that empowers various volumetric avatar representations (e.g., INSTA, NeRFBlendshape, Next3D) with the ability to edit the geometry and texture of a 3D avatar from a single image.

GeneAvatar is a generic approach to editing 3D avatars in various volumetric representations (NeRFBlendShape, INSTA, Next3D) from a single image, using 2D editing methods such as drag-style editing, text prompts, and pattern painting. Our editing results are consistent across multiple facial expressions and camera viewpoints.



How to Use

We show the pipeline of using GeneAvatar to edit a personalized volumetric avatar. First, take a selfie video. Second, choose a 3DMM-based volumetric avatar method to build the personalized volumetric avatar. Third, use a 2D image editing tool (e.g., drag-style GAN, text-driven image editing, Photoshop) to edit a single rendered image. Fourth, use GeneAvatar to lift the 2D edit from the single edited image to the 3D avatar.
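The four steps above can be sketched as follows. Every class and function name here is an illustrative stand-in, not the released API; the placeholder bodies only mark where each step happens.

```python
# Hypothetical sketch of the four-step editing pipeline.
# All names below are illustrative stand-ins, not the released API.

class Avatar:
    """Stand-in for a 3DMM-based volumetric avatar (INSTA, NeRFBlendshape, Next3D)."""
    def __init__(self, video_frames):
        self.frames = video_frames      # step 1: frames from a selfie video
        self.edited = False
    def render(self, frame_idx=0):
        return self.frames[frame_idx]   # placeholder for volume rendering

class GeneAvatarEditor:
    """Stand-in for the GeneAvatar modification generative model."""
    def lift(self, avatar, edited_image):
        # step 4: auto-decoding optimization lifts the 2D edit to 3D
        avatar.edited = True
        return avatar

def edit_pipeline(video_frames, edit_2d):
    avatar = Avatar(video_frames)       # step 2: build the personalized avatar
    image = avatar.render()
    edited_image = edit_2d(image)       # step 3: any 2D tool (drag-style GAN, text, Photoshop)
    return GeneAvatarEditor().lift(avatar, edited_image)
```

The key design point is that the 2D editing tool is a black box: only one edited rendering is handed to GeneAvatar, which then propagates the change across expressions and viewpoints.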



How to Deploy

Deploying GeneAvatar in your own volumetric avatar representation avatar_model takes three steps:

  1. Load the modification generative model:

    def setup_geneavatar(model_path):
        geneavatar = GeneAvatar(model_path)
        return geneavatar

  2. Add per-sample deformation and color blending to the volume rendering pipeline of your own avatar representation avatar_model, for example:

    """
    Avatar volume rendering pipeline
    """
    ...
    # (GeneAvatar addition) deform the sample points in volume rendering
    sample_points = sample_points + geneavatar.forward_geo(vertices_3DMM, sample_points)
    ...

    # avatar rendering
    density, template_color = avatar_model(sample_points)
    ...

    # (GeneAvatar addition) color blending
    color = geneavatar.forward_color(vertices_3DMM, sample_points, template_color)
    ...

  3. Use the PTI-inversion paradigm in project.py to edit your avatar_model with a single edited image; the avatar rendering function should be overridden by your own avatar method:

    """
    project.py: perform auto-decoding optimization on a single edited image to lift the 2D editing effect to the 3D avatar.
    """

    def synthesis(data, avatar_model):
        """
        Replace your avatar rendering function here.
        """
        preds = avatar_model.render(data)
        return preds
    ...

Abstract

Despite the great success in 2D editing using user-friendly tools, such as Photoshop, semantic strokes, or even text prompts, similar capabilities in 3D areas are still limited, either relying on 3D modeling skills or allowing editing within only a few categories. In this paper, we present a novel semantic-driven NeRF editing approach, which enables users to edit a neural radiance field with a single image, and faithfully delivers edited novel views with high fidelity and multi-view consistency. To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space, and develop a series of techniques to aid the editing process, including cyclic constraints with a proxy mesh to facilitate geometric supervision, a color compositing mechanism to stabilize semantic-driven texture editing, and a feature-cluster-based regularization to preserve the irrelevant content unchanged. Extensive experiments and editing examples on both real-world and synthetic data demonstrate that our method achieves photo-realistic 3D editing using only a single edited image, pushing the bound of semantic-driven editing in 3D real-world scenes.



Video

YouTube Source






Framework Overview

GeneAvatar architecture.

We use an expression-aware generative model that takes a modification latent code $\mathbf{z}_{g/t}$ and 3DMM coefficients and outputs a modification field in a tri-plane structure. The modification field modifies the geometry and texture of the template avatar by deforming the sample points $\mathbf{x}$ and blending the color $\mathbf{c}_o$ with the modification color $\mathbf{c}_{\Delta}$, respectively. We lift the 2D editing effect to 3D via auto-decoding optimization and synthesize novel views across different expressions.
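In code, the two modification operations described above might look like the following sketch. The convex blend with a weight alpha is an assumption made for illustration; the paper's exact blending operator may differ.

```python
import numpy as np

# Illustrative sketch of how the modification field acts inside volume
# rendering: sample points are deformed for geometry edits, and the template
# color is blended with the modification color for texture edits.
# delta_x, c_delta, and alpha stand in for quantities queried from the
# generative model's tri-plane; the convex blend is an assumption.
def apply_modification(sample_points, template_color, delta_x, c_delta, alpha):
    deformed = sample_points + delta_x                        # geometry: x -> x + delta_x
    color = (1.0 - alpha) * template_color + alpha * c_delta  # texture: blend c_o and c_delta
    return deformed, color
```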



Geometry Editing on INSTA Avatars

Click the left or right arrow for more examples. Drag the sliders on the videos below to compare renderings. Refresh the page if some videos are missing.


Geometry Editing on NBShape Avatars



Geometry Editing on Next3D Avatars


Texture Editing on INSTA Avatars

Text-driven



Texture Editing on INSTA Avatars

Pattern Painting & Makeup



Texture Editing on NBShape Avatars

Pattern Painting & Makeup



Texture Editing on Next3D Avatars


Hybrid Editing


Face Reenactment

BibTex

                
@inproceedings{bao2024geneavatar,
    title={GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image},
    author={Bao, Chong and Zhang, Yinda and Li, Yuan and Zhang, Xiyu and Yang, Bangbang and Bao, Hujun and Pollefeys, Marc and Zhang, Guofeng and Cui, Zhaopeng},
    booktitle={The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)},
    year={2024}
}

The website template is borrowed from LAMP.