Title: Generative Models for Implicit Distribution Estimation: a Statistical Perspective
Abstract: Estimating the distributions of complex objects from high-dimensional data with low-dimensional structure is an important topic in statistics and machine learning. Modern generative modeling techniques accomplish this by encoding and decoding data to generate new, realistic synthetic data objects, including images and text. A key aspect of these models is the extraction of low-dimensional latent features, under the assumption that the data lie on a low-dimensional manifold. Our study develops a minimax framework for distribution estimation on unknown submanifolds, incorporating smoothness assumptions on both the target distribution and the manifold. Through the lens of minimax rates, we examine several popular generative models: variational autoencoders, generative adversarial networks, and score-based generative models. By analyzing their theoretical properties, we characterize their statistical capabilities in implicit distribution estimation and identify limitations that point to potential improvements.