Tps {fields} | R Documentation |
Fits a thin plate spline surface to irregularly spaced data. The smoothing parameter is chosen by generalized cross-validation. The assumed model is additive Y = f(X) +e where f(X) is a d dimensional surface. This is a special case of the spatial process estimate. A "fast" version of this function uses a compactly supported Wendland covariance and computes the estimate for a fixed smoothing pararmeter.
Tps(x, Y, m = NULL, p = NULL, scale.type = "range", ...) fastTps(x, Y, m = NULL, p = NULL, theta, ...)
To be helpful, a more complete list of arguments are described that are the same as those for the Krig function.
x |
Matrix of independent variables. Each row is a location or a set of independent covariates. |
Y |
Vector of dependent variables. |
m |
A polynomial function of degree (m-1) will be included in the model as the drift (or spatial trend) component. Default is the value such that 2m-d is greater than zero where d is the dimension of x. |
p |
Exponent for radial basis functions. Default is 2m-d. |
scale.type |
The independent variables and knots are scaled to the specified scale.type. By default the scale type is "range", whereby the locations are transformed to the interval (0,1) by forming (x-min(x))/range(x) for each x. Scale type of "user" allows specification of an x.center and x.scale by the user. The default for "user" is mean 0 and standard deviation 1. Scale type of "unscaled" does not scale the data. |
theta |
The tapering range that is passed to the Wendlend compactly supported covariance. The covariace (i.e. the radial bais function) is zero beyond range theta. |
... |
Any argument that is valid for the Krig function. Some of the main ones
are listed below.
|
A thin plate spline is result of minimizing the residual sum of squares subject to a constraint that the function have a certain level of smoothness (or roughness penalty). Roughness is quantified by the integral of squared m-th order derivatives. For one dimension and m=2 the roughness penalty is the integrated square of the second derivative of the function. For two dimensions the roughness penalty is the integral of
(Dxx(f))**22 + 2(Dxy(f))**2 + (Dyy(f))**22
(where Duv denotes the second partial derivative with respect to u and v.) Besides controlling the order of the derivatives, the value of m also determines the base polynomial that is fit to the data. The degree of this polynomial is (m-1).
The smoothing parameter controls the amount that the data is smoothed. In the usual form this is denoted by lambda, the Lagrange multiplier of the minimization problem. Although this is an awkward scale, lambda =0 corresponds to no smoothness constraints and the data is interpolated. lambda=infinity corresponds to just fitting the polynomial base model by ordinary least squares.
This estimator is implemented by passing the right generalized covariance function based on radial basis functions to the more general function Krig. One advantage of this implementation is that once a Tps/Krig object is created the estimator can be found rapidly for other data and smoothing parameters provided the locations remain unchanged. This makes simulation within R efficient (see example below). Tps does not currenty support the knots argument where one can use a reduced set of basis functions. This is mainly to simplify and a good alternative using knots would be to use a valid covariance from the Matern family and a large range parameter.
See also the mKrig function for handling larger data sets and also for an example of combining Tps and mKrig for evaluation on a larger grid.
The function fastTps
is really a convenient
wrapper function that calls
mKrig
with the Wendland covariance function. This is experimental
and some care needs to exercised in specifying the taper range and
power ( p
) which describes the behavior of the Wendland at the
origin. Note that unlike Tps the locations are not scaled to unit range and this can cause havoc in smoothing problems with variables in
very different units.
So x<- scale(x)
might be a good idea for putting the
varaibles on a common sacel for smoothing. This function does have the potential to approximate estimates of Tps for very large spatial data sets. See wendland.cov
and help on the SPAM package for more background.
A list of class Krig. This includes the predicted surface of fitted.values and the residuals. The results of the grid search minimizing the generalized cross validation function is returned in gcv.grid. Note that the GCV/REML optimization is done even if lambda or df is given. Please see the documentation on Krig for details of the returned arguments.
See "Nonparametric Regression and Generalized Linear Models" by Green and Silverman. See "Additive Models" by Hastie and Tibshirani.
Krig, summary.Krig, predict.Krig, predict.se.Krig, plot.Krig, mKrig
surface.Krig
,
sreg
#2-d example fit<- Tps(ozone$x, ozone$y) # fits a surface to ozone measurements. set.panel(2,2) plot(fit) # four diagnostic plots of fit and residuals. set.panel() summary(fit) # NOTE: the predict function is quite flexible: look<- predict( fit, lambda=2.0) # evaluates the estimate at lambda =2.0 _not_ the GCV estimate # it does so very efficiently from the Krig fit object. look<- predict( fit, df=7.5) # evaluates the estimate at the lambda values such that # the effective degrees of freedom is 7.5 # compare this to fitting a thin plate spline with # lambda chosen so that there are 7.5 effective # degrees of freedom in estimate # Note that the GCV function is still computed and minimized # but the spline estimate is uses 7.5 df. fit1<- Tps(ozone$x, ozone$y,df=7.5) set.panel(2,2) plot(fit1) # four diagnostic plots of fit and residuals. # GCV function (lower left) has vertical line at 7.5 df. set.panel() # The basic matrix decompositions are the same for # both fit and fit1 objects. # predict( fit1) is the same as predict( fit, df=7.5) # predict( fit1, lambda= fit$lambda) is the same as predict(fit) # predict onto a grid that matches the ranges of the data. out.p<-predict.surface( fit) image( out.p) surface(out.p) # perspective and contour plots of GCV spline fit # predict at different effective # number of parameters out.p<-predict.surface( fit,df=10) #1-d example out<-Tps( rat.diet$t, rat.diet$trt) # lambda found by GCV plot( out$x, out$y) lines( out$x, out$fitted.values) # # compare to the ( much faster) one spline algorithm # sreg(rat.diet$t, rat.diet$trt) # # Adding a covariate to the fixed part of model # Note: this is a fairly big problem numerically (850+ locations) Tps( RMprecip$x,RMprecip$y, Z=RMprecip$elev)-> out surface( out, drop.Z=TRUE) US( add=TRUE, col="grey") ### ### fast Tps # m=2 p= 2m-d= 2 # # Note: theta =3 degrees is a very generous taper range. # Use trials with rdist.nearest to sort an efficient tapre range # for large spatial problems fastTps( RMprecip$x,RMprecip$y,m=2,lambda= 1e-2, p=2, theta=3.0) -> out2 # # simulation reusing Tps/Krig object # fit<- Tps( rat.diet$t, rat.diet$trt) true<- fit$fitted.values N<- length( fit$y) temp<- matrix( NA, ncol=50, nrow=N) sigma<- fit$shat.GCV for ( k in 1:50){ ysim<- true + sigma* rnorm(N) temp[,k]<- predict(fit, y= ysim) } matplot( fit$x, temp, type="l") # #4-d example fit<- Tps(BD[,1:4],BD$lnya,scale.type="range") # plots fitted surface and contours # default is to hold 3rd and 4th fixed at median values surface(fit)