Load the Rmpi library, which provides high-level communication among different processes. This package uses a worker (slave) and supervisor (master) model where one R session is the supervisor and controls the workers. The big advantage of this model is that the parallel tasks are controlled in R and use R code; little knowledge of MPI is needed. The workers are also independent R sessions, so algorithms can be developed on a small platform, such as a laptop or a single processor, and then the same code can be distributed across many processors.

Load the Rmpi package

library( Rmpi)

Besides the supervisor session, create two workers. This approach has been used successfully on the NCAR Yellowstone supercomputer with 512 cores; however, only 2 workers are used here so the example runs on a MacBook Air.

mpi.spawn.Rslaves( nslaves=2)
##  2 slaves are spawned successfully. 0 failed.
## master (rank 0, comm 1) of size 3 is running on: cisl-bow 
## slave1 (rank 1, comm 1) of size 3 is running on: cisl-bow 
## slave2 (rank 2, comm 1) of size 3 is running on: cisl-bow
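
As an optional check, the total number of processes in the communicator, the supervisor plus the workers, can be queried with mpi.comm.size. The value should be 3, matching the "size 3" reported above.

mpi.comm.size( 1)  # 3 = 1 supervisor + 2 workers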

Here is an example of completing 5 tasks across the 2 workers. Each task generates a sample of normally distributed numbers, with a different sample size for each task. The output is a list, with each component being one random sample.

look<- mpi.applyLB(c(2,3,4,4,5), rnorm) 
print( look)
## [[1]]
## [1] -0.3456049  1.3145128
## 
## [[2]]
## [1]  0.5989564 -0.3799018 -0.9420225
## 
## [[3]]
## [1] -1.8101786  0.5942250  0.1431744 -0.7230462
## 
## [[4]]
## [1] -0.3079891 -0.6506467  1.1256552 -0.1485625
## 
## [[5]]
## [1]  1.6190253  0.4475148  1.8990462 -0.6893079  1.2254623
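
For comparison, mpi.applyLB is essentially a load-balanced, parallel version of lapply, so the same computation can be run serially with:

lookSerial<- lapply( c(2,3,4,4,5), rnorm)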

Here is a function where the “input” is just an index, i.e. the task ID. For more complex tasks this is usually the best way to assign work.

doTask<- function( I, n ){
  # generate a sample of size n and return it along with the task ID
  y<- rnorm( n)
  return( list( y=y, taskID=I) )
}

Generate 5 samples of size 4. Here the sample size is conveniently supplied as an argument that will be passed along to doTask. Note that first one has to broadcast this function to the 2 workers.

mpi.bcast.Robj2slave( "doTask")
mpi.applyLB( 1:5, FUN = doTask, n=4 )
## [[1]]
## [[1]]$y
## [1] -0.8187894 -0.1032386  1.0805146  0.7390726
## 
## [[1]]$taskID
## [1] 1
## 
## 
## [[2]]
## [[2]]$y
## [1] -0.33826409  0.52991937 -0.02125094  1.32042095
## 
## [[2]]$taskID
## [1] 2
## 
## 
## [[3]]
## [[3]]$y
## [1]  0.45243889  0.03033372  1.21438556 -1.02300447
## 
## [[3]]$taskID
## [1] 3
## 
## 
## [[4]]
## [[4]]$y
## [1]  2.2088228 -0.1180190 -0.1419504 -1.1958058
## 
## [[4]]$taskID
## [1] 4
## 
## 
## [[5]]
## [[5]]$y
## [1]  0.88751475 -1.28970415 -0.02697971  0.36815803
## 
## [[5]]$taskID
## [1] 5
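
Note that above the result was printed rather than saved. In practice one would assign it (out is just an illustrative name) and then, for example, check that every task reported back:

out<- mpi.applyLB( 1:5, FUN = doTask, n=4 )
sapply( out, function( z) z$taskID)  # should return 1 2 3 4 5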

OK, now it's time for the real application: out-of-sample prediction based on fitting 90% of the data and predicting the withheld 10%. The idea is to try many random 90/10 partitions; 20 are used in this case.

Set up the rainfall example

library( fields)
## Loading required package: spam
## Loading required package: grid
## Spam version 1.4-0 (2016-08-29) is loaded.
## Type 'help( Spam)' or 'demo( spam)' for a short introduction 
## and overview of this package.
## Help for individual functions is also obtained by adding the
## suffix '.spam' to the function name, e.g. 'help( chol.spam)'.
## 
## Attaching package: 'spam'
## The following objects are masked from 'package:base':
## 
##     backsolve, forwardsolve
## Loading required package: maps
data( NorthAmericanRainfall)
loc<- cbind(NorthAmericanRainfall$longitude,
           NorthAmericanRainfall$latitude)
ppt<- NorthAmericanRainfall$precip/254 #inches of rainfall June-August
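
As a quick check of the data size (this is where the value 1720, hardwired below, comes from):

length( ppt)  # 1720 locations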

A more complicated doTask function. It samples the 10% of the data to omit, fits a thin plate spline (Tps) to the remaining data, predicts at the omitted locations, and finds the difference between the predicted and observed values (called the residual). Note that the numbers have been hardwired to make this more readable, but for general use 1720 should be N <- length( ppt) and 172 should be round( .1*N).

doTask2<- function( I){
  # randomly withhold 10% (172) of the 1720 locations
  out<- sample( 1:1720, 172)
  # fit to the remaining 90% of the data
  fit<- Tps( loc[-out,], ppt[-out])
  # predict at the withheld locations and form the residuals
  yHat<- predict( fit, x= loc[out,])
  resid<- ppt[out] - yHat
  return( list( resid = resid, x=loc[out,], taskID=I) )
}
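
Following the note above, here is a sketch of a generalized version that derives the sizes from the data instead of hardwiring them (doTask2General is just an illustrative name):

doTask2General<- function( I){
  # withhold a random 10% of the locations
  N<- length( ppt)
  out<- sample( 1:N, round( .1*N))
  fit<- Tps( loc[-out,], ppt[-out])
  yHat<- predict( fit, x= loc[out,])
  resid<- ppt[out] - yHat
  return( list( resid = resid, x=loc[out,], taskID=I) )
}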

Now run this just once to check that it works and to get a timing for a single sampling. Note that for this case the task ID could be any value.

system.time(
check<- doTask2(1)
)
##    user  system elapsed 
##   7.056   0.163   7.259

This example requires broadcasting more objects to the workers: the function itself and the two data objects. Also note that the fields library is loaded on all workers.

mpi.bcast.Robj2slave( "doTask2")
mpi.bcast.cmd(library(fields))
mpi.bcast.Robj2slave( loc)
mpi.bcast.Robj2slave( ppt)

system.time(
look<- mpi.applyLB( 1:20, FUN = doTask2 )
)
##    user  system elapsed 
##  10.251  55.865  90.501

For reference, the elapsed time of about 90 seconds with 2 workers compares to roughly 145 seconds (20 × 7.3) if the 20 fits were run serially.

Often the most coding effort goes into combining the results from the list. Here we accumulate the residuals along with their locations. The squared residual is an unbiased, but highly variable, estimate of the prediction error.

bigX<- NULL
bigR<- NULL
for( k in 1:20){
  hold<- look[[k]]
  bigX<- rbind( bigX, hold$x )
  bigR<- c( bigR, hold$resid)
}
stats( bigR^2)
##                        [,1]
## N              3.440000e+03
## mean           1.559187e+00
## Std.Dev.       6.764092e+00
## min            1.871329e-08
## Q1             5.071181e-02
## median         2.580142e-01
## Q3             9.496865e-01
## max            2.111015e+02
## missing values 0.000000e+00
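
A more compact way to combine the results, equivalent to the loop above:

bigX<- do.call( rbind, lapply( look, function( z) z$x))
bigR<- unlist( lapply( look, function( z) z$resid))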

Examine how the squared residuals vary over the data domain. Very large values have been cut off by the zlim argument. This information could be used more strategically, but this is a simple summary.

quilt.plot( bigX, (bigR^2), nx=100, ny=50, zlim=c(0,5))

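Finally, when the parallel work is complete, it is good practice to shut down the workers and exit MPI cleanly. (This is the standard Rmpi cleanup, not part of the timings above; note that mpi.quit also ends the R session.)

mpi.close.Rslaves()
mpi.quit()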