\name{nnreg}
\alias{nnreg}
\title{Fits a surface based on a neural network}
\description{
Uses nonlinear regression to fit a single hidden layer neural network
model to regression data. The surface has the form:

y = b.0 + sum(j=1,K)   b.j*phi( mu.j + t(gamma.j)*X)

Here X is a d-dimensional vector, b has length K+1, mu has length K,
and the gamma.j are K vectors each of length d. The function phi is
the logistic function.
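
As an illustration only, the surface above can be evaluated directly in R.
This is a minimal sketch, not the code used by nnreg: the function name
nn_surface and its arguments are hypothetical, and gamma is assumed to be
stored as a d by K matrix whose j-th column is gamma.j.

```r
# Logistic function phi
phi <- function(u) 1 / (1 + exp(-u))

# Evaluate the surface at the rows of X (an n by d matrix).
# b: length K + 1 (b[1] is b.0), mu: length K,
# gamma: d by K matrix (assumed layout), column j holds gamma.j.
nn_surface <- function(X, b, mu, gamma) {
  X <- as.matrix(X)
  K <- length(mu)
  out <- rep(b[1], nrow(X))
  for (j in seq_len(K)) {
    # inner product of gamma.j with every row of X
    u <- mu[j] + rowSums(sweep(X, 2, gamma[, j], "*"))
    out <- out + b[j + 1] * phi(u)
  }
  out
}

# Tiny check with d = 2 and K = 1
X <- rbind(c(0, 0), c(1, 2))
nn_surface(X, b = c(1, 2), mu = 0, gamma = matrix(c(1, -1), 2, 1))
```

At the origin the single hidden unit has phi(0) = 0.5, so the first value
is 1 + 2*0.5 = 2.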
}
\usage{
nnreg(x, y, k1 = NULL, k2 = NULL, start = NULL, ngrind = 250, 
    ntries = 100, npol = 20, ftol1 = 1e-06, ftol2 = 1e-09, itmax1 = 250, 
    itmax2 = 10000, fitted.values = FALSE, 
    all.fits = FALSE, greedy = FALSE, seed = NULL, fast = FALSE, 
    na.rm = TRUE, verbose = FALSE, too.fast = FALSE, just.setup = FALSE, 
    just.read = FALSE, batch.inname = "nnreg.input", 
    batch.outname = "nnreg.output") 
}
\arguments{
\item{x}{
Matrix of independent variables.
}
\item{y}{
Vector of dependent variables.
}
\item{k1}{
Lower limit for K, where K is the number of hidden units.
}
\item{k2}{
Upper limit for K. If missing, it is set to k1.
}
\item{start}{
Starting values for parameters. This must be a list with components 
k (the number of hidden units) and theta (the parameter vector). Parameters 
are assumed to be in the order used by a "netfit" object returned by nnreg. 
See the example below for using a previous fit as starting values for 
a more refined fit. 
}
\item{ngrind}{
Number of coarse optimizations.
}
\item{ntries}{
Number of random starting values for each coarse optimization.
}
\item{npol}{
Number of coarse fits that are improved (polished) using a smaller
minimization tolerance.
}
  \item{ftol1}{
Tolerance for coarse fits }
  \item{ftol2}{ 
Tolerance for polished fits}
  \item{itmax1}{ 
Maximum number of iterations for coarse fits }
  \item{itmax2}{ 
Maximum number of iterations for polish fits }
  \item{fitted.values}{ If TRUE 
computes fitted values 
and residuals.}
  \item{all.fits}{If TRUE returns all polished fits in the output object  
 not just the best one.
 }
  \item{greedy}{
If FALSE, fits the full model. If TRUE, fits models by adding hidden 
units one at a time above the base model. 
 }
 \item{seed}{
Seed used in generating the random parameter starts.
  }
  \item{fast}{
If TRUE, resets the values of npol, ntries, ftol1, ftol2, etc. so that
nnreg runs quickly. The solutions may then not be close to the
global optimum, but this is a useful option for quickly checking this
function.
}
  \item{na.rm}{
If TRUE, removes NA cases from data before fitting. }

  \item{verbose}{ 
 If TRUE intermediate information is written to the standard output from 
within R and also from the FORTRAN subroutine. This is useful for 
troubleshooting and also to view the number of iterations, convergence 
tolerance,  etc.  from each fit.  }
  \item{too.fast}{ 
If TRUE, nnreg runs very quickly, but the tolerances are too large to give 
useful results. This is just for checking.}
  \item{just.setup}{ 
 If TRUE will just create an input file to use for running nnreg as a 
batch program. }
  \item{just.read}{ 
If TRUE, reads in the results from running nnreg as a
batch program. }
  \item{batch.inname}{ 
Name of input file to use with batch program. This must still be 
specified when reading in batch results. The input file here is compared 
to the one reported in the batch output file.} 
\item{batch.outname}{ 
Name of output file created by batch program and also the file name 
assumed for reading in the batch results. }
}

\value{
If just.setup is FALSE, the returned object is the fit to the data: a list
of class nnreg with the components listed below.
(If just.setup is TRUE, only the batch input file is created and the
returned value is of class call: the R command that reads the batch
output file back into R. See the example below.)

For the returned nnreg object, each of the separate models is encoded as
a netfit object in the model list explained below.

\item{nfits}{
Number of different models fit and described in this object.
}
\item{model}{
A list of length nfits where each component is of class netfit.
The netfit object bundles together the information needed to reconstruct
the neural net estimate and predict from this fitted model.
The components of a netfit object are the dimension of x (d), the number
of hidden units used in the model (k), the mean of each column of the x
matrix (xm), the mean of the y values (ym), the standard deviation of each
column of the x matrix (xsd), the standard deviation of the y values
(ysd), the number of parameters in the model (np), the parameters of the
model (theta), and the RMS error of the fit (rms).
}
\item{fitted.values}{
A matrix of predicted values from all fitted models. The number of columns
is nfits and the rows correspond to the order of the x values.
}
\item{residuals}{
A matrix of residuals from all fitted models. The number of columns
is nfits and the rows correspond to the order of the x values.
}
\item{call}{
Call to the function.
}
\item{x}{
Matrix of independent variables.
}
\item{y}{
Vector of dependent variables.
}
\item{n}{
Number of observations or length of y.
}
\item{lags}{
Time lags used in the x matrix, if a time series model. (Not used by nnreg
package)
}
\item{seed}{
Seed used in generating the random parameter starts.
}
\item{best.model}{
The index (from the set 1:nfits) of the best model based on
minimizing the GCV criterion with cost=2.
}
}

\details{
Parameters of the model are estimated by nonlinear least squares.
The parameter space has a large number of local minima, so the strategy is
to generate many parameter sets at random and refine (grind) these
starting values with a robust minimization algorithm. The winners from
this coarse refinement are then used as starting values for a more
stringent minimization (a polish).
The details of this search are based on our experience fitting complex
functions encountered in chaotic nonlinear time series. We believe
the default settings are on the conservative side, exploring many
possible parameter vectors.
 
The two parameters ntries and ngrind are used in generating the many
starting parameter sets for nonlinear least squares. ngrind is the number
of cubes growing geometrically over a range of parameter values.
ntries is the number of parameter sets generated at random from a uniform
distribution in each cube. The best parameter set (out of the ntries) in
each cube is used as the start of a coarse optimization. npol of these
coarse fits are selected for further refinement by a minimization with
smaller tolerance.
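
The grind and polish search described above can be sketched in plain R.
This is an illustration under stated assumptions, not the FORTRAN
implementation: optim with method "BFGS" stands in for the robust
minimizer, the cube half-widths (0.1 to 10) are made up, and the toy
model has one predictor and one hidden unit.

```r
set.seed(123)
phi <- function(u) 1 / (1 + exp(-u))

# Toy data: K = 1 hidden unit, theta = (b.0, b.1, mu.1, gamma.1)
x <- seq(-2, 2, length.out = 40)
y <- 1 + 2 * phi(0.5 + 1.5 * x) + rnorm(40, sd = 0.05)
rss <- function(theta) {
  sum((y - theta[1] - theta[2] * phi(theta[3] + theta[4] * x))^2)
}

ngrind <- 5; ntries <- 20; npol <- 2
# Cube half-widths growing geometrically (range is an assumption)
widths <- 0.1 * (10 / 0.1)^(seq(0, 1, length.out = ngrind))

coarse <- lapply(widths, function(w) {
  # ntries uniform random starts in the cube; keep the best one
  starts <- matrix(runif(ntries * 4, -w, w), ncol = 4)
  best <- starts[which.min(apply(starts, 1, rss)), ]
  # coarse optimization: loose tolerance, few iterations
  optim(best, rss, method = "BFGS",
        control = list(reltol = 1e-6, maxit = 50))
})

# Polish the npol best coarse fits with a tighter tolerance
ord <- order(sapply(coarse, function(f) f$value))[seq_len(npol)]
polished <- lapply(coarse[ord], function(f) {
  optim(f$par, rss, method = "BFGS",
        control = list(reltol = 1e-9, maxit = 1000))
})
best_rss <- min(sapply(polished, function(f) f$value))
```

Here ngrind, ntries and npol play the same roles as the nnreg arguments,
while the two reltol values loosely mimic ftol1 and ftol2.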


nnreg can also be run in a batch mode that uses a standalone FORTRAN
program executed independently of the R session. If you are interested in
batch-style execution for longer fits, or in fine tuning the FORTRAN,
this gives you control over the compilation and (shell) execution.
See the example below for how this works. For convenience, the last lines
of the batch input file are the R command to read the output file back
into R; this is the same as the return value of nnreg when
just.setup=TRUE.

The main detail to note is that the user must compile the FORTRAN program,
NNREG.exe, "by hand" in addition to the nearly automatic installation of
the rest of the package, but this is not hard to do. First find the
directory where the package is installed:

library( nnreg)

system.file( package="nnreg")

Then go to the subdirectory exec and, in UNIX/LINUX (or on Windows using
Brian Ripley's packaging of the UNIX Rtools), run

make NNREG

This will create the executable program NNREG.exe, which can then be moved
to a more convenient directory. The basic drill has three steps: 1) Create
the batch input file in R with a call to nnreg. 2) Execute NNREG.exe in
the shell, giving the batch input file as the standard input. 3) When the
program is done, use nnreg again to read the batch output file into R and
reformat it as an nnreg object.
See the example below for setting up the input file and reading the
output file of this program.
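
The best.model component described under Value is chosen by a GCV
criterion with cost=2. The following is a minimal sketch of one common
form of such a criterion, GCV = (RSS/n)/(1 - cost*np/n)^2. Whether nnreg
computes exactly this quantity is an assumption; the function gcv and the
residual sums of squares are hypothetical. The parameter count
np = 1 + K*(d + 2) follows from the model form in the description.

```r
# GCV with a cost multiplier on the number of parameters (assumed form)
gcv <- function(rss, n, np, cost = 2) {
  denom <- 1 - cost * np / n
  # the criterion is not meaningful once cost * np reaches n
  ifelse(denom > 0, (rss / n) / denom^2, NA)
}

# Hypothetical fits with K = 2, 3, 4 hidden units and d = 2 predictors
n <- 100
d <- 2
K <- 2:4
np <- 1 + K * (d + 2)   # b.0 plus, per hidden unit, b.j, mu.j and gamma.j
rss <- c(4.0, 2.5, 2.4) # made-up residual sums of squares
scores <- gcv(rss, n, np)
best <- which.min(scores)
```

With these made-up numbers the model with K = 3 is selected: adding a
fourth hidden unit lowers the residual sum of squares only slightly, and
the penalty on the extra parameters outweighs that gain.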

}

\references{
S. Ellner, D.W. Nychka, and A.R. Gallant. 1992. LENNS, a program to
estimate the dominant Lyapunov exponent of noisy nonlinear systems from
time series data. Institute of Statistics Mimeo Series No. 2235,
Statistics Department, North Carolina State University, Raleigh, NC
27695-8203.

D.W. Nychka, S. Ellner, D. McCaffrey, and A.R. Gallant. 1992. Finding
Chaos in Noisy Systems. J. R. Statist. Soc. B 54:399-426.
}
\author{Doug Nychka, Stephen Ellner and Barbara Bailey }

\seealso{ 
predict.nnreg, predict.netfit, plot.nnreg, summary.nnreg, print.nnreg, 
surface
 }

\examples{

# quick example to test things.
xg<- make.surface.grid( list( 1:10,1:10)) # 10X10 grid of points
y<-  5* exp(-((xg[,1]-5))^2) + exp(.5*(xg[,2]-5))

# Gaussian ridge in first variable and slope in second
# fast and seed options are not required but a fixed seed is useful so 
# that everyone using this example gets the same results.

nnreg( xg, y, 2,4, fast=TRUE, seed=123)-> look

# summary of fit
summary( look)
plot(look) # diagnostic plots
surface( look, type="C") #  fitted surface contour plot over image plot

# predict at arbitrary points ( here a slice with Y=5.5 and X in [2,9])
xp<- cbind( seq(2,9,length.out=100), rep( 5.5, 100))
# Hey, it doesn't get any easier than this!
predict( look, xp)-> yp
plot( xp[,1], yp, type="l")

# predict partial derivatives at observed points
predict( look,derivative=1)-> out
# partial derivatives at slice from above
predict( look,derivative=1, x=xp)-> out
#


# look at all polished models, not just the best ones
nnreg( xg, y, 2,4, fast=TRUE, seed=123, all.fits=TRUE)-> look2
summary( look2)


# fit using greedy algorithm
nnreg( xg, y, 2,4, fast=TRUE, greedy=TRUE)-> look2
summary( look2)
# NOTE: greedy does terribly! 

# refit using starting values based on the third model from the fast fit
# (four hidden units); note that start expects a netfit object 

nnreg( xg, y, start= look$model[[3]])-> look4
summary( look4)

# Running  nnreg in  batch
#( See notes above for compiling the stand alone FORTRAN program NNREG.exe)

# first create input file
nnreg( xg, y, 2,4, fast=TRUE, just.setup=TRUE, 
   batch.inname="test.input",
   batch.outname="test.output")-> out.cmd
#
# out.cmd is now just a R command to read in results
#
# Now in the shell, execute the nnreg FORTRAN program with the line
# Dir_pathname\NNREG.exe  < test.input
# 
# Here Dir_pathname is the directory path where NNREG.exe is 
# located. Once the program completes there should now be an output file
# in this case called test.output. To read this back into R 
#
# nnreg( xg, y, 2,4, fast=TRUE, just.read=TRUE,batch.inname="test.input",
#  batch.outname="test.output")-> look
# 
#  (You must give the correct input file name here because nnreg will 
#   check for a match!)
#
# or just evaluate the command returned by the first call:
#
# eval( out.cmd)-> look 
#
# You can also use the R command on the last lines of the input file. 
#
# In either of these cases this will return an nnreg object just like the 
# examples above. The difference is that it reads in the results from the
# file test.output instead of actually doing the computations.   
#


# A higher dimensional example, the  BD data set from fields. 
#
nnreg( BD[,1:4], BD[,5], 2, 8,fast=TRUE)-> look
summary( look)


}
\keyword{ neural}% at least one, from doc/KEYWORDS
