High-dimensional eigen-analysis and spiked models

Debashis Paul
University of California, Davis

In this talk we shall consider several statistical problems involving eigen-analysis for high-dimensional data. Principal components analysis, i.e. spectral analysis of the sample covariance matrix of the observations, is a commonly used tool for dimension reduction in such problems. We describe a certain subclass of the problems that involves population covariances with a few eigenvalues separated from the bulk of the eigenvalues. This model is referred to as the "spiked model."

The first problem involves independent and identically distributed large-dimensional vectors with a spiked covariance matrix. Here we describe a phase transition phenomenon for the larger sample eigenvalues and the corresponding sample eigenvectors. We also show that if the population eigenvalues exceed a certain threshold, the corresponding sample eigenvalues and certain projections of the corresponding eigenvectors have asymptotically Gaussian behavior. Another class of models that we study involves time-course spatial data with a separable spatio-temporal covariance having a few large eigenvalues for the spatial covariance. We also consider a class of multi-dimensional time series for which similar phenomena take place.
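
As a rough illustration of the phase transition for the i.i.d. setting described above, the sketch below evaluates the standard first-order limits for a single spike. It rests on assumptions that are not spelled out in this abstract: unit noise variance, a single population spike `ell` of at least 1, and dimension-to-sample-size ratio p/n converging to `gamma` in (0, 1). The function name `spiked_limits` and its parameters are hypothetical and chosen only for this example.

```python
import math


def spiked_limits(ell, gamma):
    """Asymptotic limits for one spike (illustrative sketch, not the talk's code).

    Assumptions: unit noise variance, one population spike ell >= 1,
    and p/n -> gamma with 0 < gamma < 1.

    Returns (eig, cos2):
      eig  -- almost-sure limit of the largest sample eigenvalue
      cos2 -- limit of the squared inner product between the leading
              sample eigenvector and the population eigenvector
    """
    if ell < 1.0:
        raise ValueError("ell must be at least 1 (spike above the noise level)")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")

    threshold = 1.0 + math.sqrt(gamma)
    if ell > threshold:
        # Above the threshold: the sample eigenvalue separates from the bulk
        # and the sample eigenvector retains information about the true one.
        ratio = gamma / (ell - 1.0)
        eig = ell * (1.0 + ratio)
        cos2 = (1.0 - gamma / (ell - 1.0) ** 2) / (1.0 + ratio)
    else:
        # At or below the threshold: the sample eigenvalue sticks to the
        # upper edge of the bulk and the eigenvector is asymptotically
        # orthogonal to the population eigenvector.
        eig = (1.0 + math.sqrt(gamma)) ** 2
        cos2 = 0.0
    return eig, cos2


if __name__ == "__main__":
    for ell in (1.2, 1.5, 2.0, 3.0):
        eig, cos2 = spiked_limits(ell, gamma=0.25)
        print(f"ell={ell:.1f}  eigenvalue limit={eig:.4f}  cos^2 limit={cos2:.4f}")
```

With `gamma = 0.25` the threshold is 1.5: a spike of 1.2 is indistinguishable from the bulk edge, while a spike of 3.0 yields a sample eigenvalue that is biased upward relative to the population value.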

Time permitting, we shall give a brief account of some recently developed estimation procedures that can improve on the estimates under some structural assumptions on the covariance matrix.
