# Data Assimilation Research Testbed -- DART
# Copyright 2004-2007, Data Assimilation Research Section
# University Corporation for Atmospheric Research
# Licensed under the GPL -- www.gpl.org/licenses/gpl.html
#
# <next few lines under version control, do not edit>
# $URL: http://subversion.ucar.edu/DAReS/DART/trunk/doc/mpi/README $
# $Id: README 2878 2007-04-17 22:55:40Z nancy $
# $Revision: 2878 $
# $Date: 2007-04-17 16:55:40 -0600 (Tue, 17 Apr 2007) $


Greetings.  This README file contains information about the programs
in this directory, and troubleshooting help if you are trying to get
the MPI option in DART to compile and run.


INTRODUCTION:

Starting with the Jamaica release of DART, the filter program can use MPI
to do the data assimilation step in parallel on a parallel machine.  The
model advances can also be done in parallel.  This is controlled by the
'async' setting in the &filter_nml namelist section of the input.nml file.
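As a rough sketch (the other &filter_nml entries are omitted here; see the
filter documentation for the full list of settings and async values), the
relevant fragment of input.nml looks something like this:

```fortran
! fragment of input.nml -- most &filter_nml entries omitted
&filter_nml
   async = 4,    ! 4 = advance the model in parallel under MPI
/
```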

The default settings compile DART without MPI, but if you want to enable
that option you need to have a working MPI library and run-time system.

Generally this means simply compiling with mpif90 instead of the underlying
Fortran 90 compiler, but it can also be a bit more complicated if there
are multiple Fortran compilers or if MPI is not installed in a standard
directory.

This directory contains some small test programs which use both MPI and
the netCDF libraries.  It may be simpler to debug any build problems here,
and if you need to submit a problem report to your system admin people
these single executables are much simpler than the entire DART build tree.

Be sure that the file $DART/mkmf/mkmf.template has been updated for your
particular system and compiler.  We supply a default file, but generally
you should find the mkmf.template.compiler.system file closest to your
setup, copy it over mkmf.template, and then update the location of the
netCDF libraries (since they are often not installed in a location searched
by default).  If your system uses the 'module' commands to select
different versions, make sure you have consistent versions of the Fortran
compiler, the netCDF libraries, and the MPI libraries.
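For example (the template filename, netCDF path, and module names below
are only placeholders; substitute whatever matches your site):

```shell
# illustrative setup only -- filenames, paths, and module names
# vary from system to system
cd $DART/mkmf
cp mkmf.template.intel.linux mkmf.template   # pick the closest template
# edit mkmf.template so NETCDF points at your installation, e.g.
#   NETCDF = /usr/local/netcdf
# on systems with environment modules, load consistent versions:
module load intel netcdf mpich
```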

There are two places where you may need to customize files in the 
$DART/mpi_utilities directory.  First, some systems come with only an 
include file defining the MPI parameters; others come with only a
Fortran 90 module.
Second, some systems require an interface block to declare the use of
the system() function; others will not compile if it is specified.
There are large block comments at the top of the test file in this
directory - ftest_mpi.f90.  If you can get it compiled here, then you
know what changes (if any) you have to make in the mpi_utilities_mod.f90
and null_mpi_utilities_mod.f90 files in the $DART/mpi_utilities directory.



HOW TO VERIFY AN INSTALLATION:

This directory contains a small set of test programs which use the MPI
(Message Passing Interface) communications library.  They may compile the
first time with no problem, but especially on Linux clusters there can be
an almost infinite number of permutations of batch queue systems,
compilers, and MPI libraries.

Examples of batch queue systems:  PBS, LSF, LoadLeveler
Examples of compiler vendors: Intel (ifort), PGI (pgf90), Absoft (f95)
Examples of MPI libraries:  mpich, LAM, OpenMPI, MPT


Make sure that the closest mkmf.template.compiler.system file has 
been copied over to $DART/mkmf/mkmf.template.

Type:  make
to compile the programs.

Type:  make check
to run the programs interactively and look for errors.  Note that on
some larger systems it is prohibited to run MPI programs on the login
nodes; they can only be submitted to a batch system.  If the f90 and nc
programs run but the mpi program fails, you still might be ok.  Move
on to 'make batch' before declaring an emergency.

Edit the Makefile and select the proper line (bsub, qsub, or neither).  Then
type:  make batch
to submit the test programs to the batch queue.  This will almost certainly
not work without modifications to the 'runme' script if you have LSF or PBS.
(There are #BSUB (LSF) and #PBS directives which select the specific queue,
specify the max runtime, the accounting charge code, etc., all of which are
very system specific.)
Once you have a working script here you can transfer your changes to the 
model-specific files under $DART/models/your-model/work.
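As a sketch of what such a script looks like (the queue names, task
counts, and charge code below are placeholders; your site's values will
differ, and the launch command on the last line is also site specific):

```shell
#!/bin/sh
# --- LSF directives; submit with: bsub < runme ---
#BSUB -J ftest_mpi          # job name
#BSUB -q regular            # queue name (site specific)
#BSUB -n 4                  # number of MPI tasks
#BSUB -W 0:10               # wall-clock limit, hh:mm
#BSUB -P ACCT0001           # accounting charge code (site specific)

# --- PBS directives; submit with: qsub runme ---
#PBS -N ftest_mpi
#PBS -l nodes=2:ppn=2       # resource syntax varies with PBS flavor
#PBS -l walltime=00:10:00

mpirun ./ftest_mpi          # launch command is also site specific
```

A scheduler treats the other system's directives as ordinary comments, so
one script can carry both sets.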


WHAT YOU GET:

ftest_mpi.f90 is a Fortran 90 program which calls a few basic MPI library
functions.  If it compiles and runs interactively, you have cleared one of
the two big hurdles in running with MPI.  If you can submit this executable
to the batch queue and have it run, you are done.  Go have a beverage of
your choice.  After that, you can start to do actual science instead of
system-wrangling.

ftest_nc.f90 is a (non-mpi) Fortran 90 program which uses the netCDF libraries.
It can be used to test that the netcdf libs are installed and that you
have the proper setting for the NETCDF variable in your mkmf.template.

'make check' will try to build and run both of these programs interactively.
'make batch' will submit them to the batch system for execution.  If you
have problems, keep reading below for more help in diagnosing exactly where
things are going wrong.


TROUBLESHOOTING:

If the ftest_mpi.f90 program does not compile, here are a few things to
check.  You must be able to compile and run this simple program before
anything else is going to work.

1. Include file vs module

Some MPI installations supply a header file (a .h or .inc file) which
defines the parameters for the MPI library.  Others supply a Fortran 90
module which contains the parameters and subroutine interfaces.  Use one
or the other, not both.  The code contains a commented-out 'use' statement
and by default expects to use the include file.
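In outline, the two choices look like the sketch below (this only mirrors
the idea; the comments at the top of ftest_mpi.f90 are the authoritative
version for your copy of the code):

```fortran
program mpi_setup_sketch
! Choice 1: the Fortran 90 module, if your MPI provides one:
use mpi
implicit none
! Choice 2: the include file; comment out the 'use mpi' line
! above and uncomment the next line instead:
!include "mpif.h"
integer :: ierror
call MPI_Init(ierror)
call MPI_Finalize(ierror)
end program mpi_setup_sketch
```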

2. Interface block for system() vs not

While this isn't strictly an MPI issue, the system() function is used
by some of the DART code and is called from the mpi module, so if you
need to comment this block in or out, you are editing the same files.
If you get an error trying to link your program and the message seems
related to 'undefined external _system_' (or some close permutation of
that message), go into BOTH mpi_utilities_mod.f90 and 
null_mpi_utilities_mod.f90 and comment in (or out) the interface block
near the comment which has 'BUILD TIP 2'.
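A sketch of what such an interface block looks like (the comment near
'BUILD TIP 2' in the source files has the exact text to enable or
disable):

```fortran
! interface block declaring the non-standard system() function;
! some compilers require it, others refuse to compile with it
interface
   function system(string)
      character(len=*), intent(in) :: string
      integer                      :: system
   end function system
end interface
```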

3. Compiler wrappers

Most MPI installations include compiler "wrapper" programs which you call
instead of the actual compiler.  They add any needed compiler flags and they
add the MPI libraries to the link lines.  But they are usually built for one
particular compiler, so if your system has multiple Fortran compilers
available you will need to find the right set of MPI wrappers.  Generally it
is called 'mpif90' for the Fortran 90 compiler.  Try to go this route if at
all possible.  This might mean adding a new directory to your shell search
path, or loading a new module with the 'module' command.
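A couple of commands can help confirm which wrapper, and which underlying
compiler, you are actually getting (the exact flag depends on the MPI
distribution):

```shell
which mpif90        # confirm which wrapper is first in your search path
mpif90 -show        # MPICH-family wrappers: print the underlying command
mpif90 --showme     # OpenMPI wrappers: same idea, different flag
```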

Once you have a compiled executable, you have to run it.  This may mean
dealing with the batch system.

4. Batch systems

Most clusters have some form of batch control.  You login to one node on the
cluster, but to execute a compute job you must run a command which adds the
job to a list of waiting jobs.  Especially for MPI jobs which expect to use
multiple processors at the same time, a batch control system ensures that
each job is started on the right number of processors and does not conflict
with other running jobs.

The batch control system knows how many nodes are available for jobs, whether
some queues have higher or lower priority, the maximum time a job can run,
the maximum number of processors a job can request, and it schedules the use
of the nodes based on the jobs in the execution queues.  The two most common
batch systems currently are PBS and LSF.  They are complicated, but
don't despair.  This directory comes with a script which has settings for
the most commonly required options.  If they do not work on your system
the simplest way to proceed is to find a colleague with a working script
and copy it; check your local support web pages or support people; or for
the independent-minded, google for examples out on the web.  Queue names
tend to differ between systems, and many larger systems which charge
by the job require an account code specified on a #BSUB or #PBS line.
These values will have to come from a locally knowledgeable person.

If you have a small or exclusive-use cluster you may not have a batch
system.  In that case you can probably simply start your job with the
'mpirun' command.  But even in this case, you may need to supply some
system-specific options, like a 'machine file' which says what node names
are part of this cluster and are available to run your job.
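For instance (the node names and processor count here are made up, and
the machine-file option name varies between MPI distributions; check
your mpirun man page):

```shell
# run 4 copies of ftest_mpi on the nodes listed in 'machines'
cat > machines <<EOF
node1
node2
EOF
mpirun -np 4 -machinefile machines ./ftest_mpi
```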


OTHER THINGS:

A few other programs are included in this directory to help diagnose
non-working setups.  To compile and run everything:  make everything
It will echo messages as things pass or fail.

ftest_f90.f90 is a simple Fortran 90 program without MPI.
It confirms you have a working F90 compiler.  Try: make ftest_f90 
to compile only this program.

ftest_nc.f90 is a non-MPI Fortran 90 program which opens and writes 
a netCDF file (ftestdata.nc).  If your netCDF library is not installed
where the NETCDF variable points in your mkmf.template file, you can use
this program to debug netcdf problems.  Try: make ftest_nc (to compile), then:
./ftest_nc to run.  It should create a small netcdf file called ftestdata.nc
which can be dumped with "ncdump ftestdata.nc".

ftest_stop.f90 and ftest_go.f90 are MPI test programs for the async=4
MPI model advance option.  If you are using async = 4 for filter and
are having problems, try running this pair.  In particular, pay attention
to the hostname printed by the script and compare it to the one in the
Process 0 message from ftest_stop.  If they differ, you will need a patch to
your runme script.  Email us for more help.  Try:  make async4
to run this test.  You may have to edit the Makefile to select the
proper batch system.


ctest.c is a C language program.  The Fortran executables have no dependency
on C, but if there is any question about whether there is a working C
compiler on the system, try:  make ctest.

ctest_mpi.c is a C language program which uses the C versions of the MPI
library calls.  DART does not have any C code in it, but it is possible
that the MPI libraries were compiled without the Fortran interfaces.
If this routine compiles and runs but the Fortran ones do not, that 
might be a useful clue.  Try: make ctest_mpi (to compile), then:
make run_c (to execute).  You may need to select the proper batch system
submit command in the Makefile (bsub, qsub, or neither).

ctest_nc.c is a non-MPI C language program which uses the C versions of 
the netCDF library calls.  Again, DART has no C code in it, but it is
possible that the netCDF libraries were compiled without the Fortran
interfaces.  If this routine compiles and runs but the Fortran ones do
not, that might be a useful clue.  Try: make ctest_nc (to compile), then:
./ctest_nc to run.  It should create a small netcdf file called ctestdata.nc
which can be dumped with "ncdump ctestdata.nc".

Any questions, email me at:  nancy@ucar.edu


Good luck -
nancy collins

