Abstract

Principal component analysis, i.e. the transformation of correlated variables into uncorrelated orthogonal components, is carried out by means of the matrix Q containing the orthonormal eigenvectors of the variance-covariance matrix \Sigma of a multivariate spatial process \{\Z(x)=(Z_1(x),\dots,Z_p(x))^T: x\in D\subset\IR^d\} (Mardia et al., 1979). The variance-covariance matrix of a spatial process is rarely known and therefore has to be estimated. We denote the sample locations by \{x_1,\dots,x_n\}\subset D, where n>1. Let \hat{U} be the natural estimator of \Sigma, defined as \hat{U}=\frac{1}{n-1}\sum_{k=1}^n (\Z(x_k)-\bar{\Z})(\Z(x_k)-\bar{\Z})^T \in\IR^{p\times p}, with \bar{\Z}=n^{-1}\sum_{k=1}^n\Z(x_k). Under spatial dependence the estimator \hat{U} is biased. By adopting theorems on mixing random fields (Brockwell and Davis, 1991; Ibragimov, 1962; Bolthausen, 1982) we can apply the developments of Tyler (1981) to show that the bias can have a significant influence on the eigenvectors, i.e. that the eigenvectors of \Sigma and \hat{U} are significantly different. We introduce a new estimator \hat{\Sigma} which is unbiased but computationally intensive, since it depends on the (cross-)covariograms C_{ij}(x_k-x_l)=Cov(Z_i(x_k),Z_j(x_l)) of the process. To calculate an approximation of \hat{\Sigma} we use the limiting bias of \hat{U}, i.e. the bias of \hat{U} as the number of observations in a fixed domain D tends to infinity. We find that the limiting bias depends only on the range and the sill of the underlying (cross-)covariograms and is therefore easier to compute. Furthermore, we show that this approximation of the estimator \hat{\Sigma} is accurate. Finally, the theoretical results are applied to a data set taken from Mondain-Monval et al. (1983), containing measurements of trace elements in Lake Geneva sediments.
The measured contents of trace elements are divided by the corresponding natural contents, yielding a contamination ratio that is modeled by a second-order stationary isotropic process. We estimate the matrix \hat{U} and correct its bias using our approximations. The (cross-)variograms \gamma_{ij}(|h|)=Var(Z_i(x)-Z_j(x+h)) (Cressie, 1993) are estimated with the highly robust estimator of scale Q_n (Rousseeuw and Croux, 1993; Genton, 1998a). All the (cross-)variograms are fitted by the iterative procedure developed by Genton (1998), and the corresponding (cross-)covariograms are deduced from them (Papritz et al., 1993). With this accurate bias correction, we obtain the approximation of the matrix \hat{\Sigma} that is used for the principal component analysis.
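
As an illustration of why \hat{U} is biased under spatial dependence, note that if the process has a constant mean and is second-order stationary, expanding the definition of \hat{U} gives E[\hat{U}_{ij}]=\Sigma_{ij}-\frac{1}{n(n-1)}\sum_{k\neq l} C_{ij}(x_k-x_l), with \Sigma_{ij}=C_{ij}(0). The sketch below evaluates this finite-n expression in the simplest setting. It is only a sketch under stated assumptions: it treats the univariate case (p=1) on a one-dimensional domain, the exponential covariogram is an illustrative model choice, and the function names are hypothetical. It does not implement the limiting-bias approximation of \hat{\Sigma} described above.

```python
import math


def exponential_covariogram(h, sill=1.0, range_param=1.0):
    """Assumed model: C(h) = sill * exp(-|h| / range_param)."""
    return sill * math.exp(-abs(h) / range_param)


def expected_sample_variance(locations, covariogram):
    """E[U_hat] for a univariate, constant-mean, second-order stationary process.

    Uses E[U_hat] = C(0) - (1 / (n (n - 1))) * sum over k != l of C(x_k - x_l).
    """
    n = len(locations)
    if n < 2:
        raise ValueError("need n > 1 sample locations")
    # Sum of covariances between all ordered pairs of distinct locations.
    off_diagonal = sum(
        covariogram(locations[k] - locations[l])
        for k in range(n)
        for l in range(n)
        if k != l
    )
    return covariogram(0.0) - off_diagonal / (n * (n - 1))


def bias(locations, covariogram):
    """Bias of U_hat as an estimator of the variance C(0)."""
    return expected_sample_variance(locations, covariogram) - covariogram(0.0)


# Ten equally spaced sample locations in the fixed domain D = [0, 1].
locations = [i / 9 for i in range(10)]
print(bias(locations, exponential_covariogram))
```

With a positive covariogram the bias is negative, so \hat{U} underestimates the variance. It vanishes when observations at distinct locations are uncorrelated.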


Keywords: Bias; Cross covariogram; Principal components; Spatial correlation; Variance-covariance matrix; Limiting bias.