MODULE location_mod (threed_cartesian)

version information for this file:
$Id: location_mod.html 11771 2017-06-27 16:19:58Z thoar@ucar.edu $


Overview

The DART framework needs to be able to compute distances between locations, to pass location information to and from the model interface code (model_mod.f90), and to be able to read and write location information to files. DART isolates all this location information into separate modules so that the main algorithms can operate with the same code independent of whether the model uses latitude/longitude/height, 1D unit cartesian coordinates, cylindrical coordinates, etc. DART provides about half a dozen possible coordinate systems, and others can be added. The most common one for geophysical models is the threed_sphere version. This document describes an alternative 3D cartesian coordinate system.

Note that only one location module can be compiled into any single DART executable, and most earth observational data is generated in [latitude, longitude, vertical pressure or height] coordinates - the threed_sphere option. The cartesian and 3D sphere locations cannot be mixed or used together.

This location module provides a representation of a physical location in an [X, Y, Z] cartesian coordinate space. A type that abstracts the location is provided along with operators to set, get, read, write, and compute distances between locations. This is a member of a class of similar location modules that provide the same abstraction for different representations of physical space.

Location-independent code

All types of location modules define the same module name location_mod. Therefore, the DART framework and any user code should include a Fortran 90 use statement of location_mod. The selection of which location module will be compiled into the program is controlled by which source file name is specified in the path_names_xxx file, which is used by the mkmf_xxx scripts.

All types of location modules define the same Fortran 90 derived type location_type. Programs that need to pass location information to subroutines but do not need to interpret the contents can declare, receive, and pass this derived type around in their code independent of which location module is specified at compile time. Model and location-independent utilities should be written in this way. However, as soon as user code needs to access the contents of the location type, it becomes dependent on the exact location module it is compiled with.

Usage of distance routines

Although the distance subroutine names include the string 'obs', there is nothing specific to observations in these routines. They compute distances between any set of locations. The most frequent use of these routines in the filter code is to compute the distance between a single observation and items in the state vector, and also between a single observation and other nearby observations. However, any source for locations is supported.

In simpler location modules (like the oned version) there is no need for anything other than a brute-force search between the base location and all available state vector locations. However, for large models, including those that use this threed_cartesian locations code, the brute-force search time is prohibitive. The location code therefore pre-processes all locations into a set of bins and then only needs to search the lists of locations in nearby bins when looking for locations that are within a specified distance.

The expected calling sequence of the get_close routines is as follows:


call get_close_maxdist_init()  ! is called before get_close_obs_init()
call get_close_obs_init()

call get_close_obs()           ! called many, many times

call get_close_obs_destroy()

In the threed_cartesian implementation the first routine initializes some data structures, the second one bins up the list of locations, and then the third one is called multiple times to find all locations within a given radius of some reference location, and to optionally compute the exact separation distance from the reference location. The last routine deallocates the space. See the documentation below for the specific details for each routine.
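
As a sketch of how this sequence looks with arguments filled in, the module below wraps the four calls in a single routine. It assumes a DART build that uses this threed_cartesian location_mod. The module name close_demo_mod, the routine name find_close, and all of its arguments are hypothetical; the argument lists of the get_close routines follow the interface descriptions later in this document.

```fortran
! Sketch only: requires a DART build with the threed_cartesian location_mod.
! close_demo_mod and find_close are hypothetical names.
module close_demo_mod

use types_mod,    only : r8
use location_mod, only : location_type, get_close_type,             &
                         get_close_maxdist_init, get_close_obs_init, &
                         get_close_obs, get_close_obs_destroy

implicit none
private
public :: find_close

contains

subroutine find_close(base_loc, base_type, locs, kinds, cutoff, &
                      num_close, close_ind, dist)

type(location_type), intent(in)  :: base_loc     ! reference location
integer,             intent(in)  :: base_type    ! specific type of base_loc
type(location_type), intent(in)  :: locs(:)      ! locations to search
integer,             intent(in)  :: kinds(:)     ! generic kind of each entry
real(r8),            intent(in)  :: cutoff       ! threshold distance
integer,             intent(out) :: num_close
integer,             intent(out) :: close_ind(:) ! needs room for size(locs)
real(r8),            intent(out) :: dist(:)      ! needs room for size(locs)

type(get_close_type) :: gc

! 1. set the threshold distance, 2. bin the list of locations
call get_close_maxdist_init(gc, cutoff)
call get_close_obs_init(gc, size(locs), locs)

! 3. search; close_ind(1:num_close) and dist(1:num_close) are filled in
call get_close_obs(gc, base_loc, base_type, locs, kinds, &
                   num_close, close_ind, dist)

! 4. release the storage held by gc
call get_close_obs_destroy(gc)

end subroutine find_close

end module close_demo_mod
```

In normal use the two initialization calls are made once per list of locations and get_close_obs() is then called for many different base locations; they are collapsed into one routine here only to keep the sketch short.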

All 4 of these routines must be present in every location module but in most other versions all but get_close_obs() are stubs. In this threed_cartesian version of the locations module all are fully implemented.

Interaction with model_mod.f90 code

The filter and other DART programs could call the get_close routines directly, but typically do not. They declare them (in a use statement) to be in the model_mod module, and all model interface modules are required to supply them. However in many cases the model_mod only needs to contain another use statement declaring them to come from the location_mod module. Thus they 'pass through' the model_mod but the user does not need to provide a subroutine or any code for them.

However, if the model interface code wants to intercept and alter the default behavior of the get_close routines, it is able to. Typically the model_mod still calls the location_mod routines and then adjusts the results before passing them back to the calling code. To do that, the model_mod must be able to call the routines in the location_mod which have the same names as the subroutines it is providing. To allow the compiler to distinguish which routine is to be called where, we use the Fortran 90 feature which allows a module routine to be renamed in the use statement. For example, a common case is for the model_mod to want to supply additions to the get_close_obs() routine only. At the top of the model_mod code it would declare:


use location_mod, only : location_get_close_obs => get_close_obs,    &
                         get_close_maxdist_init, get_close_obs_init, &
                         get_close_obs_destroy

That makes calls to the maxdist_init, init, and destroy routines simply pass through to the code in the location_mod, but the model_mod must supply a get_close_obs() subroutine. When it wants to call the code in the location_mod it calls location_get_close_obs().

One use pattern is for the model_mod to call the location get_close_obs() routine without the dist argument. This returns a list of any potentially close locations without computing the exact distance from the base location. At this point the list of locations is a copy and the model_mod routine is free to alter the list in any way it chooses: for example, it can change the locations to make certain types of locations appear closer or further away from the base location. Then typically the model_mod code loops over the list calling the get_dist() routine to get the actual distances to be returned to the calling code.
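
The use pattern just described might look like the following fragment of a model_mod. This is a sketch under assumptions, not code from any DART model interface: only the get_close pieces are shown (a real model_mod must supply its other required interfaces as well), and the place where a model would adjust the result is marked with a comment rather than implemented.

```fortran
! Fragment of a hypothetical model_mod; only the get_close pieces are shown.
module model_mod

use types_mod,    only : r8
use location_mod, only : location_type, get_close_type, get_dist,   &
                         location_get_close_obs => get_close_obs,    &
                         get_close_maxdist_init, get_close_obs_init, &
                         get_close_obs_destroy

implicit none
private

! The three renamed-free routines pass straight through to location_mod.
public :: get_close_obs, get_close_maxdist_init, &
          get_close_obs_init, get_close_obs_destroy

contains

subroutine get_close_obs(gc, base_obs_loc, base_obs_type, obs, obs_kind, &
                         num_close, close_ind, dist)

type(get_close_type), intent(in)  :: gc
type(location_type),  intent(in)  :: base_obs_loc
integer,              intent(in)  :: base_obs_type
type(location_type),  intent(in)  :: obs(:)
integer,              intent(in)  :: obs_kind(:)
integer,              intent(out) :: num_close
integer,              intent(out) :: close_ind(:)
real(r8), optional,   intent(out) :: dist(:)

integer :: k, j

! Step 1: candidate list only. Omitting dist skips the exact distances.
call location_get_close_obs(gc, base_obs_loc, base_obs_type, obs, obs_kind, &
                            num_close, close_ind)

if (.not. present(dist)) return

! Step 2: compute the distances that are returned to the calling code.
do k = 1, num_close
   j = close_ind(k)
   dist(k) = get_dist(base_obs_loc, obs(j), base_obs_type, obs_kind(j))
   ! A model-specific adjustment of dist(k) would go here (hypothetical).
enddo

end subroutine get_close_obs

end module model_mod
```

Because the first call omits dist, the candidate list may include locations that are farther away than the threshold distance; the distances computed in the loop let the calling code tell them apart.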

Horizontal Distance Only

Computing distances in the horizontal only is not supported by the threed_cartesian module.

Precomputation for Run-time Search Efficiency

For search efficiency all locations are pre-binned. For the non-octree option, the total list of locations is divided up into nx by ny by nz boxes and the index numbers of all items (both state vector entries and observations) are stored in the appropriate box. To locate all points close to a given location, only the locations listed in the boxes within the search radius must be checked. This speeds up the computations, for example, when localization controls which state vector items are impacted by any given observation. The search radius is the localization distance and only those state vector items in boxes closer than the radius to the observation location are processed.
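
The pre-binning idea can be illustrated with a small stand-alone program. This is a sketch and is not taken from the DART source: the five points, the 2 x 2 x 1 box counts, and the helper box_index are made up for the example, and the count / start / loc_box arrays simply mirror the component names of get_close_type described later in this document.

```fortran
program box_demo
! Illustrative only: bins 5 hypothetical points into nx*ny*nz boxes using a
! count / start / loc_box layout. Not taken from the DART source.
implicit none

integer, parameter :: r8 = selected_real_kind(12)
integer, parameter :: nx = 2, ny = 2, nz = 1, nloc = 5

real(r8) :: x(nloc), y(nloc), z(nloc)
real(r8) :: bot(3), width(3)
integer  :: cnt(nx,ny,nz), start(nx,ny,nz), filled(nx,ny,nz)
integer  :: loc_box(nloc), bx(nloc), by(nloc), bz(nloc)
integer  :: i, ix, iy, iz, next

x = (/ 0.0_r8, 1.0_r8, 9.0_r8, 8.0_r8, 2.0_r8 /)
y = (/ 0.0_r8, 1.0_r8, 9.0_r8, 1.0_r8, 8.0_r8 /)
z = 0.0_r8

! Box widths from the extent of the data in each dimension
bot   = (/ minval(x), minval(y), minval(z) /)
width = (/ (maxval(x) - bot(1)) / nx, &
           (maxval(y) - bot(2)) / ny, &
           (maxval(z) - bot(3)) / nz /)

! Pass 1: find the box of each location and count locations per box
cnt = 0
do i = 1, nloc
   bx(i) = box_index(x(i), bot(1), width(1), nx)
   by(i) = box_index(y(i), bot(2), width(2), ny)
   bz(i) = box_index(z(i), bot(3), width(3), nz)
   cnt(bx(i), by(i), bz(i)) = cnt(bx(i), by(i), bz(i)) + 1
enddo

! Pass 2: starting offset of each box inside loc_box (running sum of counts)
next = 1
do iz = 1, nz
   do iy = 1, ny
      do ix = 1, nx
         start(ix, iy, iz) = next
         next = next + cnt(ix, iy, iz)
      enddo
   enddo
enddo

! Pass 3: store each location index in its box's slice of loc_box
filled = 0
do i = 1, nloc
   loc_box(start(bx(i), by(i), bz(i)) + filled(bx(i), by(i), bz(i))) = i
   filled(bx(i), by(i), bz(i)) = filled(bx(i), by(i), bz(i)) + 1
enddo

! Print the location indices held by each box
do iz = 1, nz
   do iy = 1, ny
      do ix = 1, nx
         print *, 'box', ix, iy, iz, ':', &
                  loc_box(start(ix,iy,iz) : start(ix,iy,iz) + cnt(ix,iy,iz) - 1)
      enddo
   enddo
enddo

contains

! Map a coordinate to a box number in 1..n; a zero-width dimension uses box 1
integer function box_index(v, lo, w, n)
real(r8), intent(in) :: v, lo, w
integer,  intent(in) :: n
if (w <= 0.0_r8) then
   box_index = 1
else
   box_index = min(n, int((v - lo) / w) + 1)
endif
end function box_index

end program box_demo
```

For these five points, box (1,1,1) holds locations 1 and 2 and each of the other three boxes holds a single location. A search around a base location then only has to look at the slices of loc_box that belong to boxes within the search radius.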

The default values have given good performance on many of our existing model runs, but for tuning purposes the box counts have been added to the namelist to allow adjustment. By default the code prints some summary information about how full the average box is, how many are empty, and how many items were in the box with the largest count. The namelist value output_box_info can be set to .true. to get even more information about the box statistics. The best performance will be obtained somewhere between two extremes; the worst extreme is all the points are located in just a few boxes. This degenerates into a (slow) linear search through the index list. The other extreme is a large number of empty or sparsely filled boxes. The overhead of creating, managing, and searching a long list of boxes will impact performance. The best performance lies somewhere in the middle, where each box contains a reasonable number of values, more or less evenly distributed across boxes. The absolute numbers for best performance will certainly vary from case to case.


NAMELIST

This namelist is read from the file input.nml. Namelists start with an ampersand '&' and terminate with a slash '/'. Character strings that contain a '/' must be enclosed in quotes to prevent them from prematurely terminating the namelist.

&location_nml
   nx                  = 10
   ny                  = 10
   nz                  = 10
   x_is_periodic       = .false.
   min_x_for_periodic  = -888888.0
   max_x_for_periodic  = -888888.0
   y_is_periodic       = .false.
   min_y_for_periodic  = -888888.0
   max_y_for_periodic  = -888888.0
   z_is_periodic       = .false.
   min_z_for_periodic  = -888888.0
   max_z_for_periodic  = -888888.0
   compare_to_correct  = .false.
   output_box_info     = .false.
   print_box_level     = 0
   debug               = 0
  /


Items in this namelist either control the way in which distances are computed and/or influence the code performance.

Item                                         Type      Description
nx, ny, nz                                   integer   The number of boxes in each dimension to use to speed the searches. This is not the number of gridcells.
x_is_periodic, y_is_periodic, z_is_periodic  logical   If .true., the domain wraps in that coordinate.
min_x_for_periodic, max_x_for_periodic       real(r8)  The minimum and maximum values that are considered to be identical locations if x_is_periodic = .true.
min_y_for_periodic, max_y_for_periodic       real(r8)  The minimum and maximum values that are considered to be identical locations if y_is_periodic = .true.
min_z_for_periodic, max_z_for_periodic       real(r8)  The minimum and maximum values that are considered to be identical locations if z_is_periodic = .true.
compare_to_correct                           logical   If true, do an exhaustive search for the closest point. Only useful for debugging because the performance cost is prohibitive.
output_box_info                              logical   Print out debugging info.
print_box_level                              integer   If output_box_info is true, how detailed the output should be.
debug                                        integer   The higher the number, the more verbose the run-time output. 0 (zero) is the minimum run-time output.
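
As an illustration of how these items combine, the fragment below sets up a domain that is periodic in x and asks for extra box statistics. All values are hypothetical (a domain running from 0 to 1000 in x, in whatever units the model coordinates use) and are not recommended settings; items that are omitted keep their default values.

```fortran
&location_nml
   nx                  = 20
   ny                  = 20
   nz                  = 10
   x_is_periodic       = .true.
   min_x_for_periodic  = 0.0
   max_x_for_periodic  = 1000.0
   output_box_info     = .true.
   print_box_level     = 1
  /
```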



OTHER MODULES USED

types_mod
utilities_mod
random_seq_mod

PUBLIC INTERFACES

use location_mod, only : location_type, &
                         get_close_type, &
                         get_location, &
                         set_location, &
                         write_location, &
                         read_location, &
                         interactive_location, &
                         set_location_missing, &
                         query_location, &
                         get_close_maxdist_init, &
                         get_close_obs_init, &
                         get_close_obs, &
                         get_close_obs_destroy, &
                         get_dist, &
                         LocationDims, &
                         LocationName, &
                         LocationLName, &
                         horiz_dist_only, &
                         vert_is_undef, &
                         vert_is_surface, &
                         vert_is_pressure, &
                         vert_is_scale_height, &
                         vert_is_level, &
                         vert_is_height, &
                         operator(==), &
                         operator(/=)

Namelist interface &location_nml must be read from file input.nml.

A note about documentation style. Optional arguments are enclosed in brackets [like this].


type location_type
   private
   real(r8) :: x, y, z
end type location_type

Provides an abstract representation of physical location in a 3D cartesian space.

Component Description
x, y, z location in each dimension


type get_close_type
   private
   integer, pointer  :: loc_box(:)           ! (nloc); List of loc indices in boxes
   integer, pointer  :: count(:, :, :)       ! (nx, ny, nz); # of locs in each box
   integer, pointer  :: start(:, :, :)       ! (nx, ny, nz); Start of list of locs in this box
   real(r8)          :: bot_x, top_x         ! extents in x, y, z
   real(r8)          :: bot_y, top_y
   real(r8)          :: bot_z, top_z
   real(r8)          :: x_width, y_width, z_width    ! widths of boxes in x,y,z
   real(r8)          :: nboxes_x, nboxes_y, nboxes_z ! based on maxdist how far to search
end type get_close_type

Provides a structure for doing efficient computation of close locations.



var = get_location(loc)
real(r8), dimension(3)          :: get_location
type(location_type), intent(in) :: loc

Extracts the x, y, z values from a location type and returns them in a 3-element real array.

get_location The x,y,z values
loc A location type


var = set_location(x, y, z)
var = set_location(lon, lat, height, radius)
type(location_type) :: set_location
real(r8), intent(in)    :: x
real(r8), intent(in)    :: y
real(r8), intent(in)    :: z
or
type(location_type) :: set_location
real(r8), intent(in)    :: lon
real(r8), intent(in)    :: lat
real(r8), intent(in)    :: height
real(r8), intent(in)    :: radius

Returns a location type built from the input [x,y,z]. Alternatively, the input can be specified as a location on a sphere: a longitude and latitude, a height above the surface, and the radius of the sphere.

set_location A location type
x, y, z Coordinates along each axis
lon, lat Longitude, Latitude in degrees
height Vertical location in same units as radius (e.g. meters)
radius The radius of the sphere in same units as height (e.g. meters)
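
A minimal usage sketch of both forms follows, assuming a DART build with this location module. The routine name location_demo and all numeric values are hypothetical; the radius shown is an approximate Earth radius in meters, chosen only as an example.

```fortran
! Sketch only: location_demo and the values used are hypothetical.
subroutine location_demo()

use types_mod,    only : r8
use location_mod, only : location_type, set_location, get_location

implicit none

type(location_type) :: loc_xyz, loc_sph
real(r8)            :: xyz(3)

! Form 1: cartesian coordinates given directly
loc_xyz = set_location(100.0_r8, 250.0_r8, 30.0_r8)
xyz     = get_location(loc_xyz)      ! xyz holds the x, y, z values

! Form 2: longitude and latitude in degrees, then height and radius
! in the same units (meters here)
loc_sph = set_location(45.0_r8, 10.0_r8, 0.0_r8, 6371000.0_r8)
xyz     = get_location(loc_sph)      ! x, y, z of the converted location

end subroutine location_demo
```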


call write_location(locfile, loc [, fform, charstring])
integer,               intent(in)       ::  locfile 
type(location_type),   intent(in)       ::  loc 
character(len=*), optional, intent(in)  ::  fform 
character(len=*), optional, intent(out) ::  charstring 

Given an integer IO channel of an open file and a location, writes the location to this file. The fform argument controls whether write is "FORMATTED" or "UNFORMATTED" with default being formatted. If the final charstring argument is specified, the formatted location information is written to the character string only, and the locfile argument is ignored.

locfile the unit number of an open file.
loc location type to be written.
fform Format specifier ("FORMATTED" or "UNFORMATTED"). Default is "FORMATTED" if not specified.
charstring Character buffer where formatted location string is written if present, and no output is written to the file unit.
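
For example, the charstring form can be used to format a location for a log message. This is a sketch that assumes a DART build with this location module; the routine name print_location and the buffer length of 256 are arbitrary choices, and the unit number 0 is only a placeholder because locfile is ignored when charstring is present.

```fortran
! Sketch only: print_location is a hypothetical helper.
subroutine print_location(loc)

use location_mod, only : location_type, write_location

implicit none

type(location_type), intent(in) :: loc
character(len=256)              :: buf

! With charstring present nothing is written to the unit argument
call write_location(0, loc, charstring=buf)
print *, 'location: ', trim(buf)

end subroutine print_location
```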


var = read_location(locfile [, fform])
type(location_type)                    :: read_location
integer, intent(in)                    :: locfile
character(len=*), optional, intent(in) :: fform

Reads a location_type from a file open on channel locfile using format fform (default is formatted).

read_location Returned location type read from file
locfile Integer channel opened to a file to be read
fform Optional format specifier ("FORMATTED" or "UNFORMATTED"). Default "FORMATTED".


call interactive_location(location [, set_to_default])
type(location_type), intent(out) :: location
logical, optional, intent(in)    :: set_to_default

Uses standard input to define a location type. If set_to_default is true, a location with all elements set to 0 is returned instead.

location Location created from standard input
set_to_default If true, sets all elements of location type to 0


var = query_location(loc [, attr])
real(r8)                               :: query_location
type(location_type), intent(in)        :: loc
character(len=*), optional, intent(in) :: attr

Returns the value of x, y, or z, depending on the attribute specification. If attr is not present, x is returned.

query_location Returns x, y, or z.
loc A location type
attr Selects 'X', 'Y', 'Z'. If not specified, 'X' is returned.


var = set_location_missing()
type(location_type) :: set_location_missing

Returns a location with all elements set to missing values defined in types module.

set_location_missing A location with all elements set to missing values


call get_close_maxdist_init(gc, maxdist [, maxdist_list])
type(get_close_type), intent(inout) :: gc
real(r8), intent(in)                :: maxdist
real(r8), intent(in), optional      :: maxdist_list(:)

Sets the threshold distance. maxdist is in the same units as the x, y, z coordinates. Any location closer than this is deemed to be close. This routine must be called first, before the other get_close routines. It allocates space, so get_close_obs_destroy must be called when completely done with getting distances between locations.

If the optional maxdist_list argument is not specified, maxdist applies to all locations. If it is specified, it must be a list whose length is exactly the number of specific observation types defined in the obs_kind_mod.f90 file. This length can be queried with the get_num_types_of_obs() function. The list allows a different maximum distance to be set per base type when get_close_obs() is called.

gc Data for efficiently finding close locations.
maxdist Anything closer than this distance is a close location.
maxdist_list If specified, must be a list of real values. The length of the list must be exactly the same as the number of observation types defined in the obs_kind_mod.f90 file. (See get_num_types_of_obs() to get the count of obs types.) The values in this list are used for the obs types as the close distance instead of the maxdist argument.
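
A sketch of the per-type form is shown below. It assumes a DART build in which get_num_types_of_obs() is provided by obs_kind_mod, as described above. The routine name init_with_per_type_cutoff, the argument wide_type (the index of some observation type of interest), and the factor of 2 are all hypothetical.

```fortran
! Sketch only: the routine name, wide_type, and the factor 2 are hypothetical.
subroutine init_with_per_type_cutoff(gc, cutoff, wide_type)

use types_mod,    only : r8
use obs_kind_mod, only : get_num_types_of_obs
use location_mod, only : get_close_type, get_close_maxdist_init

implicit none

type(get_close_type), intent(inout) :: gc
real(r8),             intent(in)    :: cutoff
integer,              intent(in)    :: wide_type

real(r8), allocatable :: maxdist_list(:)

! One entry per specific observation type
allocate(maxdist_list(get_num_types_of_obs()))

! Same threshold for every type, except a larger one for wide_type
maxdist_list            = cutoff
maxdist_list(wide_type) = 2.0_r8 * cutoff

call get_close_maxdist_init(gc, cutoff, maxdist_list)

end subroutine init_with_per_type_cutoff
```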


call get_close_obs_init(gc, num, obs)
type(get_close_type),             intent(inout) :: gc
integer,                          intent(in)    :: num
type(location_type), dimension(:), intent(in)   :: obs

Initialize storage for efficient identification of locations close to a given location. Allocates storage for keeping track of which 'box' each location in the list is in. Must be called after get_close_maxdist_init, and the list of locations here must be the same as the list of locations passed into get_close_obs(). If the list changes, get_close_obs_destroy() must be called, and both the initialization routines must be called again. It allocates space so it is necessary to call get_close_obs_destroy when completely done with getting distances between locations.

gc Structure that contains data to efficiently find locations close to a given location.
num The number of locations in the list.
obs The locations of each element in the list.


call get_close_obs(gc, base_obs_loc, base_obs_type, obs, obs_kind, num_close, close_ind [, dist])
type(get_close_type),              intent(in)  :: gc
type(location_type),               intent(in)  :: base_obs_loc
integer,                           intent(in)  :: base_obs_type
type(location_type), dimension(:), intent(in)  :: obs
integer,             dimension(:), intent(in)  :: obs_kind
integer,                           intent(out) :: num_close
integer,             dimension(:), intent(out) :: close_ind
real(r8), optional,  dimension(:), intent(out) :: dist

Given a single location and a list of other locations, returns the indices of all the locations close to the single one along with the number of these and the distances for the close ones. The list of locations passed in via the obs argument must be identical to the list of obs passed into the most recent call to get_close_obs_init(). If the list of locations of interest changes get_close_obs_destroy() must be called and then the two initialization routines must be called before using get_close_obs() again.

Note that the base location is passed with the specific type associated with that location. The list of potential close locations is matched with a list of generic kinds. This is because in the current usage in the DART system the base location is always associated with an actual observation, which has both a specific type and generic kind. The list of potentially close locations is used both for other observation locations but also for state variable locations which only have a generic kind.

If called without the optional dist argument, all locations that are potentially close are returned, which is likely a superset of the locations that are within the threshold distance specified in the get_close_maxdist_init() call.

gc Structure to allow efficient identification of locations close to a given location.
base_obs_loc Single given location.
base_obs_type Specific type of the single location.
obs List of locations from which close ones are to be found.
obs_kind Generic kind associated with locations in obs list.
num_close Number of locations close to the given location.
close_ind Indices of those locations that are close.
dist Distance between given location and the close ones identified in close_ind.


call get_close_obs_destroy(gc)
type(get_close_type), intent(inout) :: gc

Releases memory associated with the gc derived type. Must be called whenever the list of locations changes, and then get_close_maxdist_init and get_close_obs_init must be called again with the new locations list.

gc Data for efficiently finding close locations.


var = get_dist(loc1, loc2 [, type1, kind2])
real(r8)                        :: get_dist
type(location_type), intent(in) :: loc1
type(location_type), intent(in) :: loc2
integer, optional,   intent(in) :: type1
integer, optional,   intent(in) :: kind2

Returns the distance between two locations.

The type and kind arguments are not used by the default location code, but are available to any user-supplied distance routines which want to do specialized calculations based on the types/kinds associated with each of the two locations.

loc1 First of two locations to compute distance between.
loc2 Second of two locations to compute distance between.
type1 DART specific type associated with location 1.
kind2 DART generic kind associated with location 2.
var distance between loc1 and loc2.
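
To make the effect of the periodic namelist options on a distance concrete, the stand-alone program below computes a straight-line distance with an optional wrap in x. It is an illustration under assumptions and is not taken from the module source: the function cart_dist, its arguments, and the rule of taking the shorter way around the periodic axis are assumptions made for the example.

```fortran
program dist_demo
! Illustrative only: cart_dist is a hypothetical helper, not a DART routine.
implicit none

integer, parameter :: r8 = selected_real_kind(12)
real(r8) :: a(3), b(3)

a = (/ 1.0_r8, 0.0_r8, 0.0_r8 /)
b = (/ 9.0_r8, 0.0_r8, 0.0_r8 /)

! x runs from 0 to 10 in this example
print *, 'non-periodic: ', cart_dist(a, b, .false., 0.0_r8, 10.0_r8)
print *, 'x periodic  : ', cart_dist(a, b, .true.,  0.0_r8, 10.0_r8)

contains

function cart_dist(p, q, x_periodic, xmin, xmax) result(d)
real(r8), intent(in) :: p(3), q(3)
logical,  intent(in) :: x_periodic
real(r8), intent(in) :: xmin, xmax
real(r8) :: d, diff(3), span

diff = abs(p - q)

if (x_periodic) then
   span = xmax - xmin
   ! take the shorter way around the periodic x axis
   if (diff(1) > 0.5_r8 * span) diff(1) = span - diff(1)
endif

d = sqrt(sum(diff**2))
end function cart_dist

end program dist_demo
```

With the wrap disabled the two points are 8 units apart; with x treated as periodic on [0, 10] the separation drops to 2, because x = 0 and x = 10 are then the same place.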


var = vert_is_undef(loc)
logical                         :: vert_is_undef
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_undef Always returns .false.
loc A location type


var = vert_is_surface(loc)
logical                         :: vert_is_surface
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_surface Always returns .false.
loc A location type


var = vert_is_pressure(loc)
logical                         :: vert_is_pressure
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_pressure Always returns .false.
loc A location type


var = vert_is_scale_height(loc)
logical                         :: vert_is_scale_height
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_scale_height Always returns .false.
loc A location type


var = vert_is_level(loc)
logical                         :: vert_is_level
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_level Always returns .false.
loc A location type


var = vert_is_height(loc)
logical                         :: vert_is_height
type(location_type), intent(in) :: loc

Always returns .false.

vert_is_height Always returns .false.
loc A location type


var = has_vertical_localization()
logical :: has_vertical_localization

Always returns .false.

This routine should perhaps be renamed to something like 'using_vertical_for_distance' or something similar. The current use for it is in the localization code inside filter, but that doesn't make this a representative function name. And at least in current usage, returning the opposite setting of the namelist item makes the code read more direct (fewer double negatives).



loc1 == loc2
type(location_type), intent(in) :: loc1, loc2

Returns true if the two location types have identical values, else false.



loc1 /= loc2
type(location_type), intent(in) :: loc1, loc2

Returns true if the two location types do NOT have identical values, else false.



integer, parameter :: LocationDims = 3

This is a constant. Contains the number of real values in a location type. Useful for output routines that must deal transparently with many different location modules.



character(len=129), parameter :: LocationName = "loc3Dcartesian"

This is a constant. A parameter to identify this location module in output metadata.



character(len=129), parameter :: LocationLName = &
                   "threed cartesian locations: x, y, z"

This is a constant. A parameter set to "threed cartesian locations: x, y, z" used to identify this location module in output long name metadata.



FILES

filename purpose
input.nml to read the location_mod namelist

REFERENCES

  1. none

ERROR CODES and CONDITIONS

Routine            Message                                       Comment
nc_write_location  Various NetCDF-f90 interface error messages   From one of the NetCDF calls in nc_write_location

KNOWN BUGS

The octree code works fine to store values, but the search for all points within a given radius of a base point is not supported. So for this module the 3D box option (use_octree = .false.) should be used.


FUTURE PLANS

Want to fix octree code, and make it easier to detect when bad combinations of tuning parameters are being used.

See the note under 'has_vertical_localization()' about a better name for this routine.

The functions of 'get_close_maxdist_init()' and 'get_close_obs_init()' appear to be able to be combined into a single init routine. This impacts all model_mods, however, since they can intercept these routines. Doing this will be a non-backwards compatible change.

The use of 'obs' in all these routine names should probably be changed to 'loc' since there is no particular dependence that they be observations. They may need to have an associated DART kind, but these routines are used for DART state vector entries so it's misleading to always call them 'obs'.


PRIVATE COMPONENTS

N/A


Terms of Use

DART software - Copyright UCAR. This open source software is provided by UCAR, "as is", without charge, subject to all terms of use at http://www.image.ucar.edu/DAReS/DART/DART_download

Contact: DART core group
Revision: $Revision: 11771 $
Source: $URL: https://svn-dares-dart.cgd.ucar.edu/DART/releases/Manhattan/assimilation_code/location/threed_cartesian/location_mod.html $
Change Date: $Date: 2017-06-27 10:19:58 -0600 (Tue, 27 Jun 2017) $
Change history:  try "svn log" or "svn diff"