Temperature and precipitation are among the basic measurements made at most observation stations. We will do some exploratory analysis of these variables for North Carolina.

```
load('NorthCarolina.Rdata')
load('Raleigh.Rdata')
```

```
library(fields)
# map the station maximum temperatures for the first day
quilt.plot(lonLatTemp, tmax[,1])
US(add=TRUE, col='black', lwd=5)
title( paste('Maximum temperature on', time[1]) )
```

```
# map the station precipitation for the first day
quilt.plot(lonLatPrcp, prcp[,1])
US(add=TRUE, col='black', lwd=5)
title( paste('Precipitation on', time[1]) )
```

Let's look at the time series of precipitation and mean temperature at a station near Raleigh during June, July, and August from 1970 to 2011.

```
# daily maximum temperature at the Raleigh station
plot(raleigh$time, raleigh$tmax,
     ylab='degrees C', xlab='time', type="h")
```

```
# daily precipitation, converted from mm to inches
plot(raleigh$time, raleigh$prcp/25.4,
     ylab='inches', xlab='time', type="h")
```

To make this easier we will omit the missing values from each of the data sets. For precipitation we will also drop the days with no rain and convert from mm to inches.

```
YM<- na.omit( raleigh$tmean)
YP<- na.omit( raleigh$prcp)
YP<- YP/25.4        # convert to inches
ind<- YP > 0        # keep only the days with rain
YP<- YP[ind]
summary(YM)
```

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.10 23.85 25.65 25.38 27.20 34.05
```

`summary(YP)`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01181 0.09842 0.27953 0.47947 0.61024 5.37008
```

`set.panel(1,2)`

`## plot window will lay out plots in a 1 by 2 matrix`

```
hist(YM,nclass=30)
hist(YP,nclass = 30)
```

Choosing the right number of classes makes the shape of each distribution clearer. Notice also how the quantiles reflect the different shapes: the temperatures are roughly symmetric, while the precipitation amounts are strongly right-skewed.
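As a quick experiment (the class counts below are illustrative choices, not from the original analysis), redrawing the precipitation histogram with coarse and fine binning shows how much the apparent shape depends on the number of classes:

```
# compare coarse and fine binning for the precipitation amounts
set.panel(1,2)
hist(YP, nclass=10, main="10 classes")
hist(YP, nclass=50, main="50 classes")
```

With too few classes the sharp peak near zero is smoothed away; with too many, sampling noise starts to dominate.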

It is useful to represent the distribution in the histogram with a continuous curve. These curves typically have one or more parameters that are adjusted to match the data. Here we experiment with adding a normal density to the temperatures. The normal is controlled by the location of its peak (the mean) and the width of the peak (the standard deviation).

```
### some examples of normal densities
xgrid<- seq( 10, 40, length.out=200)
y1<- dnorm( xgrid, mean= 25, sd = 2 )
y2<- dnorm( xgrid, mean= 25, sd = 4 )
y3<- dnorm( xgrid, mean= 18, sd = 2 )
plot( xgrid, y1, col="orange2", type="l")
lines( xgrid, y2, col="blue2")
lines( xgrid, y3, col="black")
# the theoretical 95% quantiles for these distributions
look<- c( qnorm( .95, mean= 25, sd=2),
          qnorm( .95, mean= 25, sd=4),
          qnorm( .95, mean= 18, sd=2) )
print( look)
```

`## [1] 28.28971 31.57941 21.28971`

`xline( look) # add vertical lines at these quantiles`

```
xgrid<- seq( 10, 40, length.out=200)
y<- dnorm( xgrid, mean= 25, sd = 2 )   # normal chosen by eye
hist( YM, nclass=30, prob = TRUE, ylim = c(0, .2))
lines( xgrid, y, col="green4")
```

```
xgrid<- seq( 0, 6, length.out=200)
y<- dexp( xgrid, rate = 1.5 )          # exponential chosen by eye
hist( YP, nclass=30, prob = TRUE)
lines( xgrid, y, col="green4")
```

A theoretical distribution implies quantiles, and it makes sense to compare those values to the quantiles of the data.

```
q95Normal<- qnorm( .95, mean= 20, sd=2)
# compare to
quantile( YM, .95)
```

```
## 95%
## 29.45
```

`# in this case not too good. `

The idea is to compare all the quantiles of the data to the theoretical quantiles of the distribution. A perfect match puts the points on a line with intercept 0 and slope 1.

```
N<- length( YM)
pSequence<- (0:(N-1) + .5)/N     # probabilities for the N data quantiles
qNormal<- qnorm( pSequence, mean= 25, sd=2)
plot( qNormal, sort( YM), xlab="Theoretical", ylab="Data",
      col="green2")
abline( 0, 1, col="red")
# use the mean and SD estimated from the data
qNormal2<- qnorm( pSequence, mean= mean(YM), sd= sd(YM))
points( qNormal2, sort( YM), col="orange2" )
abline( 0, 1, col="red")
```
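As an aside, base R automates this comparison for the normal case: `qqnorm()` plots the sorted data against standard normal quantiles and `qqline()` adds a reference line through the quartiles. This is an alternative cross-check, not part of the analysis above:

```
# built-in normal Q-Q plot for the mean temperatures
qqnorm(YM)
qqline(YM, col="red")
```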

```
N<- length( YP)
pSequence<- (0:(N-1) + .5)/N
qExp<- qexp( pSequence, rate=1.5)
plot( qExp, sort( YP), xlab="Theoretical", ylab="Data",
      col="green2")
abline( 0, 1, col="red")
# exponential with the rate estimated from the data
qExp2<- qexp( pSequence, rate= 1/mean( YP))
points( qExp2, sort( YP), col="orange2")
```

Next, we draw maps of station-level quantiles for these variables.

```
# 95% quantile of mean temperature at each station
tmean.95<- apply( tmean, 1, quantile, 0.95, na.rm=TRUE)
quilt.plot( lonLatTemp, tmean.95,
            main='95% quantile of mean temperature',
            cex.main=2, zlim=c(20,32))
US(add=TRUE, col='black', lwd=5)
```
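For reference, here is why `1/mean(YP)` maximizes the exponential likelihood (a standard derivation, added for context). For independent amounts y_1, ..., y_n with rate parameter `rate`, the log likelihood matches the expression used in the loop below:

```
l(rate)  = n*log(rate) - rate*sum(y)      # exponential log likelihood
dl/drate = n/rate - sum(y) = 0            # set the derivative to zero
rate.hat = n/sum(y) = 1/mean(y)           # the maximum likelihood estimate
```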

```
# finding the MLE for the exponential
1/mean( YP)
```

`## [1] 2.085629`

```
rate<- seq( 1.5, 2.5, length.out=100)
logLike<- rep( NA, 100)
n<- length( YP)
for( k in 1:100){
  logLike[k]<- -sum(YP)*rate[k] + n*log( rate[k])
}
plot( rate, logLike, type="l")
# rates within qchisq(.95, df=1)/2 of the maximum give an
# approximate 95% confidence interval
cut95<- qchisq(.95, df=1)/2
ind<- max(logLike) - logLike < cut95
yline( max(logLike) - cut95)
points( rate[ind], logLike[ind], col="red")
xline( min( rate[ind]))
xline( max( rate[ind]))
```