Contact: | Nancy Collins |
Reviewers: | |
Revision: | $Revision: 1.1 $ |
Release Name: | $Name: $ |
Change Date: | $Date: 2006/09/15 22:13:01 $ |
Change history: | see CVS log |
This module provides subroutines which access the MPI (Message Passing Interface) parallel communications library. To compile without using MPI, substitute null_mpi_utilities_mod.f90 for this file.
types_mod
utilities_mod
time_manager_mod
mpi_mod
No namelist interfaces are currently defined for this module, but at some point in the future an optional namelist interface &mpi_utilities_nml may be supported. It would be read from file input.nml.
Initializes the MPI library, creates a private communicator, stores the total number of tasks and the local task number for later use, and registers this module. On some implementations of MPI (in particular some variants of MPICH) it is best to initialize MPI before any I/O is done from any of the parallel tasks, so this routine should be called as close to the process startup as possible.
It is not an error to try to initialize the MPI library more than once. It is still necessary to call this routine even if the application itself has already initialized the MPI library. This routine creates a private communicator so internal communications are shielded from any other communication taking place outside the DART libraries.
It is an error to call any of the other routines in this file before calling this routine.
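A minimal startup and shutdown sketch, assuming the initialization routine is named initialize_mpi_utilities() and the module itself is named mpi_utilities_mod (neither name is spelled out in this section):

program mpi_startup_example
   ! hypothetical skeleton; the module and routine names are assumptions
   use mpi_utilities_mod, only : initialize_mpi_utilities, finalize_mpi_utilities

   ! call as close to process startup as possible, before any I/O
   call initialize_mpi_utilities()

   ! ... the rest of the application runs here ...

   ! call only when the process is ready to exit
   call finalize_mpi_utilities()

end program mpi_startup_example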
logical, intent(in), optional :: callfinalize
Frees the local communicator, and shuts down the MPI library unless callfinalize is specified and is .FALSE.. On some hardware platforms it is problematic to try to call print or write from the parallel tasks after finalize has been executed, so this should only be called immediately before the process is ready to exit. If the application itself is using MPI, the callfinalize argument can be used to defer closing the MPI library until the application does it itself.
It is an error to call any of the other routines in this file after calling this routine.
callfinalize | If false, do not call the MPI_Finalize() routine. |
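For example, if the application initializes and finalizes the MPI library itself, a sketch of deferring the shutdown might be:

! the application will call MPI_Finalize() itself later, so only
! release the private communicator here (illustrative fragment)
call finalize_mpi_utilities(callfinalize = .false.)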
integer :: task_count
Returns the total number of MPI tasks this job was started with. Note that MPI task numbers start at 0, but this is a count. So a 4-task job would return 4 here, but the actual task numbers will be from 0 to 3.
task_count | Total number of MPI tasks in this job. |
integer :: my_task_id
Returns the MPI task number. This is one of the routines in which all tasks make the same function call but each receives a different return value. The returned value can be useful in creating unique filenames or otherwise distinguishing resources which are not shared amongst tasks. MPI task numbers start at 0, so the valid task id numbers for a 4-task job are 0 to 3.
my_task_id | My unique MPI task id number. |
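As a sketch of the unique-filename use mentioned above (the helper name and file naming scheme are illustrative only):

subroutine open_per_task_logfile(iunit)
   ! hypothetical helper: give each task its own log file
   use mpi_utilities_mod, only : my_task_id
   integer, intent(out) :: iunit
   character(len=64)    :: fname

   ! task ids run from 0 to task_count()-1, so they are safe to
   ! embed directly in a filename
   write(fname, '(A,I4.4)') 'dart_log.', my_task_id()

   iunit = 10 + my_task_id()
   open(unit=iunit, file=trim(fname), status='replace')

end subroutine open_per_task_logfile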
Synchronize tasks. This call does not return until all tasks have called this routine. This ensures all tasks have reached the same place in the code before proceeding.
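A minimal illustration (the scenario is hypothetical):

! each task writes its own piece of the output; make sure everyone
! is finished before task 0 starts to collect the files
call task_sync()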
integer,                intent(in)           :: dest_id
real(r8), dimension(:), intent(in)           :: srcarray
type(time_type),        intent(in), optional :: time
Use the MPI library to send a copy of an array of data from one task to another task. The sending task makes this call; the receiving task must make a corresponding call to receive_from().
If time is specified, it is also sent to the receiving task. The receiving call must match this sending call regarding this argument; if time is specified here it must also be specified in the receive; if not given here it cannot be given in the receive.
The current implementation uses MPI_Ssend() which does a synchronous send. That means this routine will not return until the receiving task has called the receive routine to accept the data. This may be subject to change; MPI has several other non-blocking options for send and receive.
dest_id | The MPI task id of the receiver. |
srcarray | The data to be copied to the receiver. |
time | If specified, send the time as well. |
The send and receive subroutines must be used with care. These calls must be used in pairs; the sending task and the receiving task must make corresponding calls or the tasks will hang. Calling them with different array sizes will result in either a run-time error or a core dump. The optional time argument must either be given in both calls or in neither or one of the tasks will hang. (Sense a trend here?)
integer,                intent(in)            :: src_id
real(r8), dimension(:), intent(out)           :: destarray
type(time_type),        intent(out), optional :: time
Use the MPI library to receive a copy of an array of data from another task. The receiving task makes this call; the sending task must make a corresponding call to send_to(). Unpaired calls to these routines will result in the tasks hanging.
If time is specified, it is also received from the sending task. The sending call must match this receiving call regarding this argument; if time is specified here it must also be specified in the send; if not given here it cannot be given in the send.
The current implementation uses MPI_Recv() which does a synchronous receive. That means this routine will not return until the data has arrived in this task. This may be subject to change; MPI has several other non-blocking options for send and receive.
src_id | The MPI task id of the sender. |
destarray | The location where the data from the sender is to be placed. |
time | If specified, receive the time as well. |
See the notes section of send_to().
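A sketch of a correctly paired exchange between tasks 0 and 1, assuming the module names mpi_utilities_mod and types_mod and an initialization routine named initialize_mpi_utilities():

program sendrecv_example
   use types_mod,         only : r8
   use mpi_utilities_mod, only : initialize_mpi_utilities, finalize_mpi_utilities, &
                                 my_task_id, send_to, receive_from

   real(r8) :: vals(100)

   call initialize_mpi_utilities()

   ! both tasks execute this same code; the if test selects the role
   if (my_task_id() == 0) then
      vals = 1.0_r8
      ! sender: blocks until task 1 has posted the matching receive
      call send_to(1, vals)
   else if (my_task_id() == 1) then
      ! receiver: blocks until the data from task 0 arrives
      call receive_from(0, vals)
   endif

   call finalize_mpi_utilities()

end program sendrecv_example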
integer, intent(in) :: exit_code
A replacement for calling the Fortran intrinsic exit. This routine calls MPI_Abort() to kill all MPI tasks associated with this job. This ensures one task does not exit silently and leave the rest hanging. This is not the same as calling finalize_mpi_utilities() which waits for the other tasks to finish, flushes all messages, closes log files cleanly, etc. This call immediately and abruptly halts all tasks associated with this job.
Depending on the MPI implementation and job control system, the exit code may or may not be passed back to the calling job script.
exit_code | A numeric exit code. |
It would generally be helpful to write out some kind of message before calling this routine, to indicate where in the code it is dying. This routine is called by the error handler in the utilities module after writing out the pending error message.
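A sketch of that pattern (the helper name, message text, and exit code are illustrative):

subroutine read_or_die(filename)
   ! hypothetical helper: identify where we are dying, then take
   ! the whole job down rather than leave other tasks hanging
   use mpi_utilities_mod, only : exit_all
   character(len=*), intent(in) :: filename
   integer :: io_error

   open(unit=20, file=filename, status='old', iostat=io_error)
   if (io_error /= 0) then
      write(*, *) 'read_or_die: cannot open ', trim(filename), ', aborting all tasks'
      call exit_all(-1)
   endif

end subroutine read_or_die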
Currently unimplemented; for now, transposes are done with a series of calls to send_to() and receive_from().
real(r8), dimension(:), intent(inout) :: array
integer,                intent(in)    :: root
All tasks must make this call together, but the behavior in each task differs depending on whether it is the root or not. On the task whose ID equals root, the contents of the array are sent to all other tasks. On every task whose ID is not equal to root, the array is the location into which the data is received. Thus array is effectively intent(in) on the root task and intent(out) on all other tasks.
When this routine returns, all tasks will have the contents of the root array in their own arrays.
array | Array containing data to send to all other tasks, or the location in which to receive data. |
root | Task ID which will be the data source. All others are destinations. |
This is another of the routines which must be called by all tasks. The MPI call used here is synchronous, so all tasks block here until everyone has called this routine.
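A sketch with task 0 as the data source (module and initialization routine names assumed as in the earlier examples; the array contents are illustrative):

program broadcast_example
   use types_mod,         only : r8
   use mpi_utilities_mod, only : initialize_mpi_utilities, finalize_mpi_utilities, &
                                 my_task_id, array_broadcast

   real(r8) :: state(50)

   call initialize_mpi_utilities()

   ! only the root's contents matter on entry; on every other task
   ! the array is just the landing place for the incoming data
   if (my_task_id() == 0) state = 42.0_r8

   ! every task makes the identical call
   call array_broadcast(state, 0)

   ! on return, all tasks hold task 0's values in state

   call finalize_mpi_utilities()

end program broadcast_example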
real(r8), dimension(:), intent(in)  :: srcarray
integer,                intent(in)  :: root
real(r8), dimension(:), intent(out) :: dstarray
integer,                intent(out) :: dstcount
integer,                intent(in)  :: how
integer, dimension(:),  intent(out) :: which
Currently unimplemented. Could be used to distribute proper subsets of an array across all tasks in a job.
srcarray | Entire data array to be used as a data source. |
root | Task ID with source array. |
dstarray | Destination array where subset of data is to be placed. |
dstcount | Count of how many items are in the dstarray. |
how | Select different algorithms for doing the distribution. |
which | Integer index array of which values were assigned to this task. |
logical :: iam_task0
Returns .TRUE. if called from the task with MPI task id 0. Returns .FALSE. in all other tasks. It is frequently the case that some code should execute only on a single task. This allows one to easily write a block surrounded by if (iam_task0()) then ... .
iam_task0 | Convenience function to easily test and execute code blocks on task 0 only. |
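For example (a fragment, assuming iam_task0 has been use-associated from this module):

! only one copy of this message appears, no matter how many tasks
! the job was started with
if (iam_task0()) then
   write(*, *) 'startup complete, beginning assimilation'
endif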
integer,                intent(in)    :: from
real(r8), dimension(:), intent(inout) :: array1
real(r8), dimension(:), intent(inout) :: array2
Cover routine for array_broadcast(). This call must be matched with the companion call broadcast_recv(). This routine should only be called on the task which is the root of the broadcast; it will be the data source. All other tasks must call broadcast_recv(). This routine sends 2 data arrays because this is a common code pattern in the DART filter code. This routine ensures that from is the same as the current task ID.
In reality the data arrays here are intent(in) only but this routine will be calling array_broadcast() internally and so must be intent(inout) to match.
from | Current task ID; the root task for the data broadcast. |
array1 | First data array to be broadcast. |
array2 | Second data array to be broadcast. |
This is another of the routines which must be called consistently; only one task makes this call and all other tasks call the companion broadcast_recv routine. The MPI call used here is synchronous, so all tasks block until everyone has called one of these two routines.
integer,                intent(in)    :: from
real(r8), dimension(:), intent(inout) :: array1
real(r8), dimension(:), intent(inout) :: array2
Cover routine for array_broadcast(). This call must be matched with the companion call broadcast_send(). This routine must be called on all tasks which are not the root of the broadcast; the array arguments specify the location in which to receive data from the root. (The root task should call broadcast_send().) This routine receives 2 data arrays because this is a common code pattern in the DART filter code. This routine ensures that from is not the same as the current task ID.
In reality the data arrays here are intent(out) only but this routine will be calling array_broadcast() internally and so must be intent(inout) to match.
from | The task ID for the data broadcast source. |
array1 | First array location to receive data into. |
array2 | Second array location to receive data into. |
This is another of the routines which must be called consistently; all tasks but one make this call and exactly one other task calls the companion broadcast_send routine. The MPI call used here is synchronous, so all tasks block until everyone has called one of these two routines.
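A sketch of the matched calling pattern with task 0 as the root (module and initialization routine names assumed as in the earlier examples; the array names and contents are illustrative):

program paired_broadcast_example
   use types_mod,         only : r8
   use mpi_utilities_mod, only : initialize_mpi_utilities, finalize_mpi_utilities, &
                                 my_task_id, broadcast_send, broadcast_recv

   integer, parameter :: root = 0
   real(r8) :: ens_mean(100), ens_spread(100)

   call initialize_mpi_utilities()

   ! every task runs this same block; exactly one takes the send branch
   if (my_task_id() == root) then
      ens_mean   = 0.0_r8
      ens_spread = 1.0_r8
      call broadcast_send(root, ens_mean, ens_spread)
   else
      call broadcast_recv(root, ens_mean, ens_spread)
   endif

   call finalize_mpi_utilities()

end program paired_broadcast_example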
integer, intent(in)  :: addend
integer, intent(out) :: sum
All tasks call this routine, each with their own different addend. The returned value in sum is the total of the values summed across all tasks, and is the same for each task.
addend | Single input value per task to be summed up. |
sum | The sum of the addend values from all tasks. |
This is another of those calls which must be made from each task, and the calls block until all tasks have made them.
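A sketch (module and initialization routine names assumed as in the earlier examples):

program sum_example
   use mpi_utilities_mod, only : initialize_mpi_utilities, finalize_mpi_utilities, &
                                 sum_across_tasks

   integer :: my_count, total_count

   call initialize_mpi_utilities()

   ! each task contributes its own value, e.g. a local observation count
   my_count = 10

   call sum_across_tasks(my_count, total_count)
   ! total_count now holds the same grand total on every task

   call finalize_mpi_utilities()

end program sum_example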
We adhere to the F90 standard of starting a namelist with an ampersand '&' and terminating with a slash '/'.
This module currently has no namelist entries. One expected addition is a list of task numbers which should print all informational messages; the current default is that informational messages are printed only from task 0 and suppressed on all other tasks. (Warnings and errors print in all cases.)
This namelist would be read from a file called input.nml.
Depending on the implementation of MPI, the library routines are either defined in an include file (mpif.h) or in a proper Fortran 90 module (use mpi). If it is available, the module is preferred; it allows better argument checking and support for optional arguments in the MPI library calls.
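Inside the source file the two forms look like this; only one is active at a time:

! preferred: the Fortran 90 module, when the MPI library provides one
use mpi

! fallback: the include file, for libraries without an F90 module
! include 'mpif.h'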
If MPI returns an error, the DART error handler is called with the numeric error code it received from MPI. See any of the MPI references for an up-to-date list of error codes.