E³Gen: Efficient, Expressive and Editable Avatars Generation

Weitian Zhang¹

Yichao Yan¹

Yunhui Liu²

Xingdong Sheng²

Xiaokang Yang¹

¹Shanghai Jiao Tong University

²Lenovo Research


Our method, E³Gen, generates high-fidelity avatars with detailed cloth wrinkles and renders them at a resolution of 1024² in real time. It also offers comprehensive control over camera views and full-body poses, and supports attribute transfer and local editing.

Abstract

This paper aims to introduce 3D Gaussian for efficient, expressive, and editable digital avatar generation. This task faces two major challenges: (1) The unstructured nature of 3D Gaussian makes it incompatible with current generation pipelines; (2) the expressive animation of 3D Gaussian in a generative setting that involves training with multiple subjects remains unexplored. In this paper, we propose a novel avatar generation method named E³Gen, to effectively address these challenges. First, we propose a novel generative UV features plane representation that encodes unstructured 3D Gaussian onto a structured 2D UV space defined by the SMPL-X parametric model. This novel representation not only preserves the representation ability of the original 3D Gaussian but also introduces a shared structure among subjects to enable generative learning of the diffusion model. To tackle the second challenge, we propose a part-aware deformation module to achieve robust and accurate full-body expressive pose control. Extensive experiments demonstrate that our method achieves superior performance in avatar generation and enables expressive full-body pose control and editing.
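To make the idea of a UV features plane more concrete, the following is a minimal NumPy sketch of how per-Gaussian attributes could be read out of a structured 2D plane. It is an illustration under stated assumptions, not the E³Gen implementation: the function names `sample_uv_plane` and `decode_gaussians`, the 14-channel layout (offset, scale, rotation, opacity, color), the activation functions, and the random stand-ins for the SMPL-X surface points and their UV coordinates are all hypothetical.

```python
import numpy as np


def sample_uv_plane(plane, uv):
    """Bilinearly sample an (H, W, C) feature plane at (N, 2) UV coords in [0, 1]."""
    h, w, _ = plane.shape
    # Map UV coordinates to continuous pixel coordinates.
    x = np.clip(uv[:, 0], 0.0, 1.0) * (w - 1)
    y = np.clip(uv[:, 1], 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    # Interpolate along x on the two neighbouring rows, then along y.
    top = plane[y0, x0] * (1.0 - wx) + plane[y0, x1] * wx
    bottom = plane[y1, x0] * (1.0 - wx) + plane[y1, x1] * wx
    return top * (1.0 - wy) + bottom * wy


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def decode_gaussians(features, base_positions):
    """Split (N, 14) features into Gaussian attributes (hypothetical channel layout)."""
    quat = features[:, 6:10]
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-8)
    return {
        # Assumption: Gaussians sit at template surface points plus a learned offset.
        "position": base_positions + features[:, 0:3],
        "scale": np.exp(features[:, 3:6]),      # keeps scales positive
        "rotation": quat,                       # unit quaternion
        "opacity": sigmoid(features[:, 10:11]),
        "color": sigmoid(features[:, 11:14]),
    }


# Random stand-ins for a generated plane and for template points with UVs.
rng = np.random.default_rng(0)
plane = rng.normal(size=(64, 64, 14))
uv = rng.uniform(size=(1000, 2))
base_positions = rng.normal(size=(1000, 3))

gaussians = decode_gaussians(sample_uv_plane(plane, uv), base_positions)
```

Because every subject is described by a plane of the same shape indexed by the same UV coordinates, a 2D generative model can in principle treat the plane like an image, which is the shared structure the abstract refers to.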

Full Video

Method Overview

Random Generation

Our method can generate digital avatars with complex appearance and rich details, clearly exhibiting fingers and clothing wrinkles.

Novel Pose Animation

Our method produces robust animation results on these challenging novel poses.

Editing

Our method enables local editing and partial attribute transfer. The edited results also support pose control.

Citation



@inproceedings{zhang2024e3gen,
    author = {Zhang, Weitian and Yan, Yichao and Liu, Yunhui and Sheng, Xingdong and Yang, Xiaokang},
    title = {E3Gen: Efficient, Expressive and Editable Avatars Generation},
    year = {2024},
    booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
    pages = {6860--6869},
    numpages = {10},
    series = {MM '24}
}