Apple
University of Illinois, Urbana-Champaign
"What is really needed to make an existing 2D GAN 3D-aware?"
To answer this question, we modify a classical GAN, i.e., StyleGANv2, as little as possible. We find that only two modifications are absolutely necessary: 1) a multiplane-image-style generator branch that produces a set of alpha maps conditioned on their depth, and 2) a pose-conditioned discriminator.
We refer to the generated output as a 'generative multiplane image' (GMPI) and emphasize that its renderings are not only high-quality but also guaranteed to be view-consistent, which makes GMPIs different from many prior works. Importantly, the number of alpha maps can be dynamically adjusted and can differ between training and inference, alleviating memory concerns and enabling fast training of GMPIs in less than half a day at a resolution of 1024².
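For readers unfamiliar with multiplane images, the sketch below illustrates the back-to-front alpha compositing typically used to render an MPI. It is a minimal illustration under our own assumptions, not the paper's implementation: we assume a single RGB image shared across all planes and alpha maps ordered from near to far.

import torch

def composite_mpi(rgb, alphas):
    """Back-to-front over-compositing of a multiplane image (MPI).

    rgb:    (3, H, W)    color image; for simplicity this sketch assumes
                         it is shared by every plane
    alphas: (N, 1, H, W) per-plane alpha maps, ordered near -> far
    returns (3, H, W)    composited image
    """
    out = torch.zeros_like(rgb)
    # walk from the farthest plane to the nearest one,
    # blending each plane over the accumulated result
    for alpha in alphas.flip(0):
        out = alpha * rgb + (1.0 - alpha) * out
    return out

# toy usage: 32 planes at 256x256
# image = composite_mpi(torch.rand(3, 256, 256), torch.rand(32, 1, 256, 256))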
Click each figure to play or pause it; drag the separator to reveal the pixel-aligned geometry while the video plays.
Click here to reset.
A desktop browser is recommended for these controls to work properly.
We present several generated scenes in an interactive viewer. Please click each image to open it.
The Chrome browser is recommended.
Here are brief instructions for using the viewer.
We would like to thank the DeepView authors for their interactive MPI web viewer.
The following videos show the appearance and geometry of 3D content generated by GMPI from randomly sampled latent codes. We use Marching Cubes to extract geometry from the predicted alpha maps.
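As a rough illustration of this step, the snippet below extracts a mesh from a stack of predicted alpha maps with scikit-image's Marching Cubes. The function name, the iso-level of 0.5, and the use of trimesh are assumptions of this sketch, not the authors' pipeline, and plane spacing is ignored.

from skimage import measure
import trimesh

def alpha_maps_to_mesh(alpha_volume, level=0.5):
    """Extract a surface mesh from stacked alpha maps via Marching Cubes.

    alpha_volume: (D, H, W) array in [0, 1], one slice per MPI plane.
    level:        iso-value at which the surface is extracted (0.5 is
                  our choice for this sketch, not taken from the paper).
    Plane depth is ignored here; a real pipeline would scale the vertices
    along the depth axis by the actual plane depths.
    """
    verts, faces, normals, _ = measure.marching_cubes(alpha_volume, level=level)
    return trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals)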
@inproceedings{zhao2022gmpi,
title = {Generative Multiplane Images: Making a 2D GAN 3D-Aware},
author = {Xiaoming Zhao
and Fangchang Ma
and David Güera
and Zhile Ren
and Alexander G. Schwing
and Alex Colburn},
booktitle = {Proc. ECCV},
year = {2022},
}
Work done as part of Xiaoming Zhao's internship at Apple. Supported in part by NSF grants 1718221, 2008387, 2045586, and 2106825, MRI #1725729, and NIFA award 2020-67021-32799.