More programming tips for data analysis

Writing R code often involves adding features that make the results better documented and that make a function's inputs easier to use.

This lecture gives some strategies that are often helpful.

As motivation, here is a simple function that fits a quadratic polynomial to a data set using the lm function and returns the predicted values. (You need to load the fields package for this to work.)

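myFit <- function(x, y) {
  # polynomial degree (fixed; the formula below is written out for degree 2)
  degree <- 2
  # least squares fit of y on x and x^2; I() protects ^ inside a formula
  fit <- lm(y ~ x + I(x^2))
  # coefficients in order: intercept, linear term, quadratic term
  A <- fit$coefficients
  # evaluate the fitted quadratic at the observed x values
  predValues <- A[1] + A[2]*x + A[3]*x^2
  return(predValues)
}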

Note: fields.mkpoly will return a matrix where each column is a power of x; in the function above the powers are instead written out directly in the lm formula. At this point what lm does, and why, is still mysterious. This will be covered when we talk about least squares fitting.
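
As a quick sanity check, the hand-computed predictions can be compared with the fitted values that lm stores itself. The sketch below uses simulated data (an assumption for illustration, not the AudiA4 data) so that it runs without any extra packages.

```r
# Simulated data for illustration only (hypothetical, not AudiA4)
set.seed(1)
x <- runif(50, 0, 10)
y <- 3 + 2*x - 0.5*x^2 + rnorm(50)

# Same logic as myFit above
myFit <- function(x, y) {
  fit <- lm(y ~ x + I(x^2))
  A <- fit$coefficients
  predValues <- A[1] + A[2]*x + A[3]*x^2
  return(predValues)
}

hold <- myFit(x, y)

# fitted() extracts the predictions at the data points from the lm object
fit <- lm(y ~ x + I(x^2))
check <- all.equal(as.numeric(hold), as.numeric(fitted(fit)))
print(check)
```

If the two agree up to rounding, the arithmetic with the coefficients is doing what we think it is.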

library(fields)  # load the fields package, as noted above
data(AudiA4)
hold<- myFit( AudiA4$mileage, AudiA4$price)
# take a look
plot( AudiA4$mileage, AudiA4$price)
points(AudiA4$mileage,hold, col="seagreen")

Predicting at points that are not data points

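myFit2 <- function(x, y, xnew) {
  # polynomial degree (fixed; the formula below is written out for degree 2)
  degree <- 2
  # the fit still uses the observed data x and y
  fit <- lm(y ~ x + I(x^2))
  A <- fit$coefficients
  # evaluate the fitted quadratic at the new locations xnew
  predValues <- A[1] + A[2]*xnew + A[3]*xnew^2
  # two columns: the new x values and their predictions
  return(cbind(xnew, predValues))
}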
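
An alternative to multiplying out the coefficients by hand is to let the predict function do the arithmetic. The sketch below is one possible variant, not part of the lecture code: the name myFit2Predict and the simulated data are assumptions made for illustration.

```r
# Simulated data for illustration only (hypothetical, not AudiA4)
set.seed(2)
x <- runif(40, 0, 10)
y <- 1 + 0.5*x + 0.2*x^2 + rnorm(40)
xnew <- seq(0, 10, length.out = 5)

# Manual version, same logic as myFit2
myFit2 <- function(x, y, xnew) {
  fit <- lm(y ~ x + I(x^2))
  A <- fit$coefficients
  predValues <- A[1] + A[2]*xnew + A[3]*xnew^2
  return(cbind(xnew, predValues))
}

# Hypothetical variant that lets predict() evaluate the fitted model
myFit2Predict <- function(x, y, xnew) {
  fit <- lm(y ~ x + I(x^2))
  # the column name in newdata must match the variable name in the formula
  predValues <- predict(fit, newdata = data.frame(x = xnew))
  return(cbind(xnew, predValues))
}

manual <- myFit2(x, y, xnew)
viaPredict <- myFit2Predict(x, y, xnew)
sameAnswer <- all.equal(as.numeric(manual[, 2]), as.numeric(viaPredict[, 2]))
print(sameAnswer)
```

The advantage of predict is that the prediction code does not need to change if the formula does.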

Add the fitted curve as a line, evaluated at an equally spaced sequence of points

xnew <- seq(0, 2e5, length.out = 150)
plot( AudiA4$mileage, AudiA4$price)
hold2<- myFit2(AudiA4$mileage, AudiA4$price, xnew)
lines( hold2, col="orange2", lwd=3)
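
In the spirit of making the inputs easier to use, the degree that is hard-coded in myFit2 could become an argument with a default value. The sketch below is one way this might look, not the lecture's own code: the name myFitDegree, the default xnew = x, and the simulated data are all assumptions. It builds a matrix whose columns are powers of x using outer, so it does not depend on any package.

```r
# Hypothetical generalization of myFit2: degree is an argument (default 2)
# and xnew defaults to the observed x values
myFitDegree <- function(x, y, xnew = x, degree = 2) {
  # design matrix with columns x^0, x^1, ..., x^degree
  X <- outer(x, 0:degree, "^")
  # "- 1" drops lm's own intercept because X already has a constant column
  fit <- lm(y ~ X - 1)
  A <- fit$coefficients
  # same powers evaluated at the new locations
  Xnew <- outer(xnew, 0:degree, "^")
  predValues <- as.numeric(Xnew %*% A)
  return(cbind(xnew, predValues))
}

# Simulated data for illustration only (hypothetical, not AudiA4)
set.seed(3)
x <- runif(60, 0, 10)
y <- 2 - x + 0.3*x^2 + rnorm(60)
xnew <- seq(0, 10, length.out = 25)

quadFit  <- myFitDegree(x, y, xnew)              # default: quadratic
cubicFit <- myFitDegree(x, y, xnew, degree = 3)  # cubic, same call otherwise
```

Because the defaults reproduce the original quadratic behavior, existing calls keep working while new calls can ask for a different degree. Either result can be passed to lines in the same way as hold2.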