# APPM2720 Quiz 1

You should work on this quiz on your own and not get help from others. You are, however, encouraged to use the web and any other reference materials and resources to complete these questions. All the bulleted parts are worth the same number of points.

To make it easy to work on this quiz, all data sets are included in the APPM2720 Quiz directory. They are also in the dataWorkshop package and have been posted in the Week3 class folder.

You can submit your answers in any one of these ways:

• Write your answers in the body of an email and attach the plot.
• Use the markdown format in RStudio along with the knitr button to create an HTML file that you email to me.
• Add your text answers as comments in an R script, then use the notebook button to convert your R script to an HTML file.

(1) Load the data set BoulderJuneTemperature (in BT.rda) and create a working data set: BT <- BoulderJuneTemperature$Temp

• Show how to convert the temperatures in BT from Fahrenheit to Centigrade and create a new data set, BTCentigrade.
• What is the relationship between the mean of the data in BT and the mean of the new data set? What about the maximum values of each data set?
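
For the loading step in this question, here is a minimal sketch of how .rda files behave. It uses a small made-up data frame rather than the quiz data: the names toyData, rdaFile, and toyTemp and the temporary file are purely illustrative. The point is that load() restores objects under the names they were saved with and returns those names.

```r
# Hypothetical stand-in for a data set stored in an .rda file.
toyData <- data.frame(Temp = c(61.2, 63.5, 59.8))

# Save it to a temporary .rda file, then remove it from the workspace.
rdaFile <- tempfile(fileext = ".rda")
save(toyData, file = rdaFile)
rm(toyData)

# load() recreates the object under its saved name and
# (invisibly) returns a character vector of the names it restored.
loadedNames <- load(rdaFile)
print(loadedNames)

# The $ operator extracts one column as a plain numeric vector.
toyTemp <- toyData$Temp
print(toyTemp)
```

If you are unsure what a file contains, printing the value returned by load(), or calling ls() afterwards, shows which objects were created.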

(2) Load the data set BoulderTemperature (in BoulderTemperature.rda). This is a data frame of monthly mean temperatures from 1897 through 2014.

• How many columns are in this data set? How many rows?
• Explain how to create a subset of these data that contains just the first 10 years but includes all the months.
• Explain how to create a subset of these data that contains just months 6, 7, and 8 (i.e., the summer months) but includes all the years. For reference, call this new data set ST.
• Explain how to find the median summer temperature for each year using the apply function.
• Recall that to extract the "year" variable you can use:

yearLabel <- row.names(BoulderTemperature)
Year <- as.numeric(yearLabel)

Make a plot of these summer values over time and comment on whether you see a trend in temperatures (perhaps due to global warming). Be sure to label and title your plot.
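
As a refresher on indexing and apply (not a solution to the parts above), the sketch below uses a small made-up matrix; toyMat and its values are hypothetical. Square brackets take [rows, columns], and the second argument of apply selects the margin: 1 applies the function to each row, 2 to each column.

```r
# Hypothetical 3 x 4 matrix, filled column by column.
# Row 1 is 10 11 12 13, row 2 is 20 21 22 23, row 3 is 30 31 32 33.
toyMat <- matrix(c(10, 20, 30,
                   11, 21, 31,
                   12, 22, 32,
                   13, 23, 33),
                 nrow = 3, ncol = 4)

# Leaving an index blank keeps everything along that dimension.
firstTwoRows <- toyMat[1:2, ]   # rows 1-2, all columns
lastTwoCols  <- toyMat[, 3:4]   # all rows, columns 3-4

# MARGIN = 1: one result per row.
rowMax <- apply(toyMat, 1, max)

# MARGIN = 2: one result per column.
colMin <- apply(toyMat, 2, min)

print(rowMax)
print(colMin)
```

The same pattern works on a data frame whose columns are all numeric, because apply converts it to a matrix first.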

(3) In R, to find the missing values (NAs) in a data set, you can use the is.na function to create a data set of TRUE/FALSE values, where TRUE means the original value was missing. For example:

test <- c(1, 3, 4.5, NA, 10)
ind <- is.na(test)
sum(ind)

This gives the number of missing values in the data set test.
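
Missing values also affect summary functions. As a small illustration on the same made-up vector (the variable names other than test are illustrative), mean and median return NA when any value is missing unless na.rm = TRUE is supplied.

```r
test <- c(1, 3, 4.5, NA, 10)

# With an NA present, the default result is NA.
meanWithNA <- mean(test)

# na.rm = TRUE drops the missing values before computing.
meanNoNA   <- mean(test, na.rm = TRUE)
medianNoNA <- median(test, na.rm = TRUE)

# which() turns the TRUE/FALSE vector into the positions of the NAs.
naPositions <- which(is.na(test))

print(meanWithNA)
print(meanNoNA)
print(medianNoNA)
print(naPositions)
```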

• Create a simple R function that takes a data set as its input and returns the number of missing values.
• Using your function and the R apply function on the BoulderTemperature data, report the number of missing values for each month.