Title: Minimax-Optimal Dimension-Reduced Clustering for High-Dimensional Nonspherical Mixtures
Abstract: In mixture models, nonspherical (anisotropic) noise within each cluster is widely present in real-world data. This work investigates both computationally efficient procedures and fundamental statistical limits for clustering in high-dimensional nonspherical mixtures. We propose a novel clustering method, Covariance Projected Spectral Clustering (COPO), which adapts to a wide range of dependent noise structures. The key idea is to project the high-dimensional data onto a low-dimensional space via eigen-decomposition of a diagonal-deleted Gram matrix, and then leverage the projected covariance matrices in this space to sharpen clustering. Through a fine-grained analysis of the subspace estimation step, which is of independent interest, we establish tight algorithmic upper bounds for COPO, applicable to Gaussian noise with flexible covariance as well as general noise with local dependence. To characterize the fundamental difficulty of clustering high-dimensional anisotropic Gaussian mixtures, we establish two distinct minimax lower bounds, each highlighting different covariance-driven barriers. Our results show that COPO achieves minimax optimality in this setting. Extensive simulation studies under diverse noise structures, along with real data analysis, demonstrate the superior empirical performance of our method.
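The two-stage idea described in the abstract (project via eigen-decomposition of a diagonal-deleted Gram matrix, then use covariances in the projected space to refine the clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' COPO implementation. The function names (`diagonal_deleted_embedding`, `lloyd_init`, `covariance_refine`, `cluster_sketch`) are hypothetical, and several details are assumptions made only for illustration: rows of `Y` are samples, eigenvectors are scaled by the square root of the absolute eigenvalue, a plain Lloyd k-means provides the warm start, and points are reassigned by Mahalanobis distance under per-cluster sample covariances.

```python
import numpy as np


def diagonal_deleted_embedding(Y, k):
    """Embed the n rows of Y (n x p) into k dimensions.

    Uses the top-k eigenpairs (by absolute eigenvalue) of the Gram matrix
    Y @ Y.T after its diagonal has been set to zero.
    """
    G = Y @ Y.T
    np.fill_diagonal(G, 0.0)  # diagonal deletion
    vals, vecs = np.linalg.eigh(G)
    top = np.argsort(-np.abs(vals))[:k]
    # Assumption: scale each eigenvector by sqrt(|eigenvalue|).
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def lloyd_init(X, k, n_iter=20, seed=0):
    """Plain Lloyd k-means with farthest-point seeding (warm start only)."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def covariance_refine(X, labels, k, n_iter=10, ridge=1e-6):
    """Reassign points by Mahalanobis distance under per-cluster covariances."""
    n, d = X.shape
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for j in range(k):
            members = X[labels == j]
            if len(members) <= d:
                continue  # too few points to estimate a covariance
            mu = members.mean(axis=0)
            S = np.cov(members, rowvar=False).reshape(d, d) + ridge * np.eye(d)
            diff = X - mu
            dist[:, j] = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(S), diff)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments have stabilized
        labels = new_labels
    return labels


def cluster_sketch(Y, k):
    """Project, warm-start with k-means, then refine with covariances."""
    X = diagonal_deleted_embedding(Y, k)
    return covariance_refine(X, lloyd_init(X, k), k)
```

Calling `cluster_sketch(Y, k)` on an `n x p` data matrix returns an integer label per row. Zeroing the diagonal of the Gram matrix is meant to remove the entries that carry each sample's own squared noise norm, which can otherwise bias the leading eigenvectors when the noise is heteroskedastic. The refinement step works entirely in the `k`-dimensional embedding, so only `k x k` covariance matrices need to be estimated and inverted.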