Fields: Web Manual
Table of Contents

1. Introduction
2. Thin Plate Splines: Tps
3. Spatial Process Models: Krig
4. Simulating Random Fields (sim.rf)
5. Spatial Predictions for Large Data Sets
6. Other Fields Functions
References

Some Details on Tps and Krig

Tps and Krig:
The Tps estimate can be thought of as a special case of a spatial process estimate that uses a particular generalized covariance function. Because of this correspondence, the Tps function is a short wrapper that determines the polynomial order for the null space and then calls Krig with the radial basis generalized covariance (rad.cov). In this way Tps requires little extra coding beyond the Krig function itself, and the returned list from Tps is in fact a Krig object, so it works with all the special functions developed for that object format. One downside is some possible confusion in the summary of the thin plate spline, since it is reported as a spatial process type estimator. For example, the summary refers to a covariance model and maximum likelihood estimates. This is not something usually reported with a spline fit!
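A quick way to see this correspondence is to look at the class of the returned object. The short sketch below assumes the fields package is attached and uses the small ozone dataset that comes with it (the same data set used in the examples further below); the only point is that a Tps fit really is a Krig object and its summary is phrased in spatial process terms.

library( fields)
data( ozone)

# a thin plate spline fit; the returned object is a Krig object
Tps( ozone$x, ozone$y)-> fit
class( fit)

# the summary mentions a covariance model and maximum likelihood
# estimates, i.e. the spatial process view of the spline
summary( fit)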

Replications

By a replicate point we mean more than one observation made at the same location. The Fields BD dataset is an example. The Tps and Krig functions handle replicate points automatically, identifying replicated values and calculating the estimate of sigma from a pure error sum of squares. The actual computation of the estimator is based on collapsing the data to the set of unique locations, with the dependent vector y collapsed into a smaller vector of weighted means. The Fields functions cat.matrix and fast.1way are used to efficiently identify the unique locations and produce the replicate group means and pooled weights. Replicate points also allow several ways of performing cross-validation, and this issue is discussed below. A conceptual sketch of the collapsing step is given next.
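As a rough illustration of the collapsing step (this is only a base R sketch of the idea, not the actual cat.matrix/fast.1way code used inside Krig), consider a small set of locations with two replicated points:

# toy data: rows 1,2 and rows 3,4 are replicate locations
x <- cbind( c(0,0,1,1,2), c(0,0,1,1,0))
y <- c( 1.1, 0.9, 2.2, 1.8, 3.0)

# label each row by its location and collapse y to group means
lab   <- apply( x, 1, paste, collapse="/")
group <- match( lab, unique( lab))
ybar  <- tapply( y, group, mean)

# pure error sum of squares of the observations about their group means
pure.ss <- sum( ( y - ybar[group])^2)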

Matrix decompositions

In writing Krig a decision was made that the Krig object can be used to find estimates for many different values of lambda (the ratio of sigma squared to rho). The computations needed to evaluate the estimator for many values of lambda are more extensive than for just one value. In this general case the matrix computations are dominated by the cube of the number of unique locations, np. The Wahba-Bates-Wendelberger algorithm (WBW) requires an eigenvector/eigenvalue decomposition of a matrix of order np x np, and the Demmler-Reinsch algorithm (DR) requires the inverse square root and eigenvector/eigenvalue decompositions of matrices of size np. Unlike WBW, the DR algorithm is more amenable to using a reduced set of basis functions and so it is the default for Krig. However, for consistency with other spline implementations (e.g. GCVPACK), WBW is the default method for Tps. The user can control which is used by the decomp argument in the call.
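For example, the decomposition can be selected directly in the call. The strings "WBW" and "DR" below correspond to the two algorithms described above (see help(Krig) for the exact argument values); the ozone dataset is used again simply to have some data at hand.

# force the Demmler-Reinsch decomposition for a thin plate spline fit
Tps( ozone$x, ozone$y, decomp="DR")-> fit.dr

# or the Wahba-Bates-Wendelberger algorithm
Tps( ozone$x, ozone$y, decomp="WBW")-> fit.wbw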

Reducing the number of basis functions

The Krig estimate is a function that is a linear combination of basis functions derived from the covariance kernel. By default the standard Kriging estimator uses as many basis functions as unique locations. For large problems and noisy data this may be excessive. For example, with 1000 locations one may feel that the true surface can be adequately approximated with far fewer degrees of freedom than those afforded by the full set of 1000 basis functions. To support this approximation, Krig and Tps have the option of taking a set of knot locations that define a basis independently of the data. The basis functions are bump shaped functions defined by the covariance, with peaks centered at the knot locations. Thus evenly distributing the knot locations throughout the data region may be reasonable. One might also construct knot locations from a regular grid, use the cover.design function to find an irregular set of space filling points, or simply sample (without replacement!) the observed data locations.

Here is an example for the Midwest ozone data.

# 50 knot locations chosen from the observed locations so that they
# are purposefully spread out over the region.
data( ozone2)
knots <- cover.design( ozone2$lon.lat, 50)$design

# day 12 of the ozone records, omitting missing values
good<- !is.na( ozone2$y[12,])
x<- ozone2$lon.lat[good,]
y<- c( ozone2$y[12,good])
Tps( x, y, knots=knots)-> out
[1] "Maximum likelihood estimates not found with knots \n \n "

The warning from this fit is expected because some estimates related to the covariance are not well defined for a reduced basis set.
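The reduced basis fit can still be used in the usual ways; for example, the standard Fields surface plot of the fitted object:

# contour/image plot of the fitted surface from the reduced basis fit
surface( out)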

Generalized cross-validation and choosing the smoothing parameter

The smoothing parameter, lambda, is estimated by default by minimizing the generalized cross-validation (GCV) function. When there are no replicate points and the full basis is used, there is no ambiguity about the form of GCV. The basic CV strategy is to omit part of the data and use the model fit to the remaining data to predict the left out observations. This basic form is the GCV.one criterion calculated in gcv.Krig. However, the GCV criterion is a ratio, with the numerator a mean residual sum of squares and the denominator the squared fraction of degrees of freedom associated with the residuals. The variations on GCV used in Krig are based on modifying the mean square in the numerator and the number of observations entering the denominator; the generic form of the criterion is written out below.
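Writing A(lambda) for the smoothing matrix, so that tr A(lambda) is the effective number of parameters in the fit, n for the number of observations, and cost for the multiplier discussed below, the criterion is roughly

   GCV(lambda) = [ RSS(lambda)/n ] / [ 1 - cost * tr(A(lambda))/n ]^2

and the Krig variants modify the mean square in the numerator and the effective n in the denominator as described next.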

For replicate points, or a reduced set of basis functions, the residual sum of squares can be broken into a pure error piece and a piece that changes with the smoothing parameter. For replicates this is the pure error sum of squares of the observations about their group means, and for a reduced basis it is the residual sum of squares from regressing the data on the full basis. The default GCV criterion weights these pieces separately in an attempt to make the numerator more interpretable as an estimate of mean squared error.

With replicate points an alternative to omitting individual observations is to omit an entire replicate group. Technically this is accomplished by omitting the pure error sum of squares from the denominator of the GCV criterion. We refer to this criterion as GCV.model; it is less likely to over-fit the data than the other criteria.

The BD dataset in Fields contains replicated points. As an example, here are the results of the different criteria applied to this data set.

fit <- Tps(BD[,1:4], BD$lnya, scale.type="range")
fit$lambda.est


Besides these three versions of the GCV criterion, another point of flexibility is the cost parameter used in the denominator. The default is cost=1, although specifying larger costs will force the criterion to favor models with fewer degrees of freedom. For the BD example we see:

gcv.Krig( fit, cost=2)$lambda.est

In the discussion given above it has been assumed that once a GCV criterion is adopted one seeks the lambda value that minimizes it. We have two qualifications of this basic approach. First, in any minimization it is a good idea to examine the criterion for multiple local minima and other strange behavior. The plot function applied to a Krig or Tps object will plot the GCV measures as a function of the smoothing parameter (usually the lower left plot in the panel of four), and it is a good idea to examine this graph; see the short example below.

Secondly, with replicate points, one may want the estimate of sigma^2 implied by the curve or surface fit to agree with the pure error estimate. Rather than minimizing a GCV criterion one would then find the value of lambda at which the two estimates of sigma^2 are equal. In terms of the Krig object this means finding lambda so that shat.GCV and shat.pure.error are equal. The only hazard of this estimate is that the replicates may give an artificially low representation of the error variance. One should also note that this equality is not always possible, because the pure error estimate may be outside the range attainable by the spline based estimate.
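For example, with the BD fit from above:

# panel of four diagnostic plots; the lower left one shows the GCV
# measures as a function of the smoothing parameter
plot( fit)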



Reusing objects:

The overall strategy for the Krig and Tps functions is to do some extensive matrix decompositions once and then exploit them in finding the estimate for many different values of lambda, or even for different sets of observations. In particular, the predict and gcv.Krig functions are not limited to the estimated value of lambda and the data vector y used to create the Krig object. The only constraint is that if the locations of the observations or the weighting matrix for the error variance are changed, then the decompositions must be redone. Besides being efficient for looking at different values of the smoothing parameter, this setup also facilitates bootstrapping and simulation.

Here are some short examples to show how this works.

Tps(ozone$x, ozone$y)-> fit

the GCV estimate gives an effective degrees of freedom of around 4.5.

predict(fit)

evaluates the spline at this level of smoothing

predict( fit, df=8)

evaluates the spline using a model where the effective degrees of freedom is equal to 8. This last result is equivalent to

Tps(ozone$x, ozone$y, df=8)-> fit2
predict( fit2)


but is more efficient because the decompositions are not recreated. Instead they are reused from the fit object.

To redo the GCV search with new data one can just call gcv.Krig with the Krig object and another y,

gcv.Krig( fit, y=ozone$y)-> gcv.out



Here is an example of a bootstrap to see the sampling variability of GCV. Assume that the "true" function is the spline estimate from the data. We will reuse the Krig object to get efficient estimates when just the y's are changed.

# "true" surface and error standard deviation taken from the original fit
true<- fit$fitted.values
sigma<- fit$shat.GCV

# storage for one row of lambda.est from each bootstrap sample
out<- matrix(NA, nrow=50, ncol=4)

set.seed(123)

for( k in 1:50){
  # synthetic data: true surface plus Gaussian noise (20 observations)
  y<- true + rnorm(20)*sigma
  # redo the GCV search, reusing the decompositions stored in fit
  gcv.Krig( fit, y=y)-> temp
  out[k,] <- temp$lambda.est[1,]
}

Within this loop we have just saved the estimated smoothing parameters. However, an additional line calling predict with the old object, the simulated data, and the new value of lambda would give an efficient evaluation of the estimated spline, for example:
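A line like the following inside the loop (argument names as in predict for Krig objects) would evaluate the estimate for the simulated y at the lambda chosen by GCV, again without redoing the decompositions:

# evaluate the spline for the simulated data at its estimated lambda
yhat <- predict( fit, y=y, lambda= temp$lambda.est[1,1])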

Finally, a scatterplot of the results:

plot( out[,2], out[,4], xlab="eff df", ylab="shat.GCV")

One moral from these results is to try several values of the smoothing parameter: although GCV usually gives unbiased estimates, it is also highly variable.


This is software for statistical research and not for commercial uses. The authors do not guarantee the correctness of any function or program in this package. Any changes to the software should not be made without the authors' permission.
