DART PROGRAM obs_seq_verify

version information for this file:
$Id: obs_seq_verify.html 6380 2013-08-05 23:47:11Z nancy $



[Figure: verify schematic]

obs_seq_verify reorders the observations from a forecast run of DART into a structure that is amenable to the evaluation of the forecast. In broad terms, the verification locations and times identified in obsdef_mask.nc and the observations from the forecast run (usually called obs_seq.forecast.YYYYMMDDHH) are put into a netCDF variable that looks like this:

[Figure: verify variable]

obs_seq_verify can read a series of observation sequence files. Each file must contain the entire forecast from a single analysis time, and the name of each file must reflect that analysis time. Use obs_sequence_tool to concatenate multiple files into a single observation sequence file if necessary. Only the forecast values of the individual ensemble members (the "prior ensemble member NNNN" copies) are used; the ensemble mean and spread copies are ignored. As a special case, the "prior ensemble mean" copy is used if and only if there are no individual ensemble members present (i.e. input.nml &filter_nml:num_output_obs_members == 0).
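
Because each file name has to carry the analysis time, a script that prepares the input files can recover that time from the name. The following is a minimal sketch, assuming the name ends in a ".YYYYMMDDHH" stamp as in obs_seq.forecast.YYYYMMDDHH. The helper analysis_time_from_name is hypothetical and is not part of DART; it does not claim to mirror how obs_seq_verify itself reads the name.

```python
from datetime import datetime


def analysis_time_from_name(filename):
    """Parse the trailing YYYYMMDDHH stamp of an observation sequence file name.

    Hypothetical helper for illustration only; assumes the stamp follows
    the last '.' in the name.
    """
    stamp = filename.rsplit(".", 1)[-1]
    return datetime.strptime(stamp, "%Y%m%d%H")


analysis = analysis_time_from_name("obs_seq.forecast.2008060818")
print(analysis.isoformat())
```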

Dimension       Explanation
analysisT       This is the netCDF UNLIMITED dimension, so it is easy to 'grow' this dimension. This corresponds to the number of forecasts one would like to compare.
stations        The unique horizontal locations in the verification network.
levels          The vertical level at each location.
copy            This dimension designates the quantity of interest: the observation, the forecast value, or the observation error variance. These quantities are the ones required to calculate the evaluation statistics.
nmembers        Each ensemble member contributes a forecast value.
forecast_lead   This dimension relates to the amount of time between the start of the forecast and the verification.
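
To make the role of the copy and nmembers dimensions concrete, here is a minimal sketch of statistics one might compute for a single station, level, and forecast lead. The choice of statistics (ensemble-mean error, ensemble spread, and total spread) is an assumption made for illustration; obs_seq_verify itself only arranges the data. The function name ensemble_stats and the input values are hypothetical and do not come from a real run.

```python
import math


def ensemble_stats(observation, members, obs_error_variance):
    """Compute simple verification statistics for one station and lead.

    observation        -> the 'observation' copy
    members            -> the 'prior' copy, one value per ensemble member
    obs_error_variance -> the 'observation error variance' copy
    """
    n = len(members)
    mean = sum(members) / n
    # Sample variance of the ensemble about its mean.
    variance = sum((m - mean) ** 2 for m in members) / (n - 1)
    return {
        "error": mean - observation,
        "spread": math.sqrt(variance),
        "total_spread": math.sqrt(variance + obs_error_variance),
    }


# Illustrative values only.
stats = ensemble_stats(observation=2.0, members=[1.0, 2.0, 3.0],
                       obs_error_variance=1.0)
print(stats)
```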

The USAGE section has more on the actual use of obs_seq_verify.



This namelist is read from the file input.nml. Namelists start with an ampersand '&' and terminate with a slash '/'. Character strings that contain a '/' must be enclosed in quotes to prevent them from prematurely terminating the namelist.

&obs_seq_verify_nml
   obs_sequence_list = '',
   obs_sequence_name = 'obs_seq.forecast.YYYYMMDDHH',
   station_template  = 'obsdef_mask.nc',
   netcdf_out        = 'forecast.nc',
   obtype_string     = 'METAR_U_10_METER_WIND',
   verbose           = .true.,
/

You can specify either obs_sequence_name or obs_sequence_list, but not both. One of them must be an empty string, i.e. ''.
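
The either/or rule can be checked before running the program. This is a minimal sketch in Python, not the Fortran logic used by obs_seq_verify. The function select_input is hypothetical, and treating the case where both strings are empty as an error is an assumption.

```python
def select_input(obs_sequence_name, obs_sequence_list):
    """Return which of the two namelist settings is in use.

    Hypothetical pre-flight check; blank-only strings count as empty.
    """
    name = obs_sequence_name.strip()
    listfile = obs_sequence_list.strip()
    if name and listfile:
        raise ValueError('specify "obs_sequence_name" or "obs_sequence_list"')
    if not name and not listfile:
        # Assumption: at least one input source is needed.
        raise ValueError("both obs_sequence_name and obs_sequence_list are empty")
    return ("name", name) if name else ("list", listfile)


print(select_input("obs_seq.forecast.2008060818", ""))
```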

Item               Type                Description
obs_sequence_name  character(len=129)  Name of the observation sequence file(s). This may be a relative or absolute filename. If the filename contains a '/', everything to the right of the '/' is treated as the filename and everything to the left as a directory structure. The directory structure is then queried to see if it can be incremented to handle a sequence of observation files. By default, obs_seq_verify keeps looking for additional files to include until the files are exhausted or a file is found that contains observations beyond the timeframe of interest. For example, 'obsdir_001/obs_seq.forecast' will cause obs_seq_verify to look for 'obsdir_002/obs_seq.forecast', and so on. If this is set, obs_sequence_list must be set to ''.
obs_sequence_list  character(len=129)  Name of an ASCII text file that contains a list of one or more observation sequence files, one per line. If this is specified, obs_sequence_name must be set to ''. The file can be created by any method, including sending the output of the 'ls' command to a file, a text editor, or another program.
station_template   character(len=129)  The name of the netCDF file created by obs_seq_coverage that contains the verification network description.
netcdf_out         character(len=129)  The base portion of the filename of the file that will contain the forecast quantities. Since each observation type of interest is processed with a separate run of obs_seq_verify, the observation type string is used to create a unique output filename.
calendar           character(len=129)  The type of the calendar used to interpret the dates.
obtype_string      character(len=32)   The observation type string that will be verified. The character string must match one of the standard DART observation types. This will be the name of the variable in the netCDF file, and will also be used to make a unique netCDF file name.
verbose            logical             Print extra run-time information.
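
Two of the naming conventions in the table can be sketched in a few lines. This is an illustration under stated assumptions, not DART's implementation. The helpers next_obs_seq_name and output_name are hypothetical. Splitting at the last '/' and keeping the width of the numeric suffix are assumptions. The output name pattern follows the METAR_U_10_METER_WIND_forecast.nc file that appears in the example run.

```python
import re


def next_obs_seq_name(path):
    """Return the path with the trailing directory number incremented.

    Returns None when there is no directory part or no numeric suffix.
    """
    directory, sep, filename = path.rpartition("/")
    match = re.fullmatch(r"(.*?)(\d+)", directory)
    if not sep or match is None:
        return None
    prefix, digits = match.groups()
    # Keep the zero-padded width of the original number.
    return f"{prefix}{int(digits) + 1:0{len(digits)}d}/{filename}"


def output_name(obtype_string, netcdf_out):
    """Prefix the base output name with the observation type string."""
    return f"{obtype_string}_{netcdf_out}"


print(next_obs_seq_name("obsdir_001/obs_seq.forecast"))
print(output_name("METAR_U_10_METER_WIND", "forecast.nc"))
```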







obs_seq_verify is built in .../DART/models/your_model/work, in the same way as the other DART components.

Once the forecast has completed, each observation type may be extracted from the observation sequence file and stuffed into the appropriate verification structure. Each observation type must be processed serially at this time, and each results in a separate output netCDF file. Essentially, obs_seq_verify sorts an unstructured, unordered set of observations into a predetermined configuration.
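
The sorting step can be pictured as placing each observation into a slot indexed by station and forecast lead. The sketch below is a simplified Python illustration, not the Fortran algorithm in obs_seq_verify. The function sort_observations, the tuple layout, the exact-match rule on times, and all values are assumptions made for the example.

```python
def sort_observations(observations, stations, analysis_time, forecast_leads):
    """Arrange unordered (station, time, value) tuples into a fixed grid.

    Times and leads are in seconds. Slots with no matching observation
    stay None; observations outside the network or lead times are skipped.
    """
    slots = [[None] * len(forecast_leads) for _ in stations]
    station_index = {name: i for i, name in enumerate(stations)}
    lead_index = {lead: j for j, lead in enumerate(forecast_leads)}
    for station, obs_time, value in observations:
        i = station_index.get(station)
        j = lead_index.get(obs_time - analysis_time)
        if i is not None and j is not None:
            slots[i][j] = value
    return slots


# Illustrative, unordered input; "C" is not in the network and
# time 100 does not match any lead.
obs = [("B", 43200, 3.0), ("A", 0, 1.0), ("C", 0, 9.0), ("A", 100, 5.0)]
grid = sort_observations(obs, ["A", "B"], 0, [0, 21600, 43200])
print(grid)
```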

Example: a single 48-hour forecast that is evaluated every 6 hours.

[Figure: Example 1]

In this example, the obsdef_mask.nc file was created by running obs_seq_coverage with the namelist from its "single 48-hour forecast evaluated every 6 hours" example. obs_selection then used the obsdef_mask.txt file to mask the input observation sequence files, and the result was run through filter with the observations marked as evaluate_only, producing a file called obs_seq.forecast.2008060818.

For reference, the namelists for both obs_seq_coverage and obs_seq_verify are provided below.

&obs_seq_coverage_nml
   obs_sequence_name  = '',
   obs_sequence_list  = 'obs_file_list.txt',
   obs_of_interest    = 'METAR_U_10_METER_WIND',
   textfile_out       = 'obsdef_mask.txt',
   netcdf_out         = 'obsdef_mask.nc',
   calendar           = 'Gregorian',
   first_analysis     =  2008, 6, 8, 18, 0, 0 ,
   last_analysis      =  2008, 6, 8, 18, 0, 0 ,
   forecast_length_days          = 2,
   forecast_length_seconds       = 0,
   verification_interval_seconds = 21600,
   temporal_coverage_percent     = 100.0,
   lonlim1            =    0.0,
   lonlim2            =  360.0,
   latlim1            =  -90.0,
   latlim2            =   90.0,
   verbose            = .true.
/

&obs_seq_verify_nml
   obs_sequence_name = 'obs_seq.forecast.2008060818',
   obs_sequence_list = '',
   station_template  = 'obsdef_mask.nc',
   netcdf_out        = 'forecast.nc',
   obtype_string     = 'METAR_U_10_METER_WIND',
   verbose           = .true.
/

The pertinent information from the obsdef_mask.nc file is summarized (from ncdump -v experiment_times,analysis,forecast_lead obsdef_mask.nc) as follows:

verification_times = 148812.75, 148813, 148813.25, 148813.5, 148813.75,
                                148814, 148814.25, 148814.5, 148814.75 ;

analysis           = 148812.75 ;

forecast_lead      = 0, 21600, 43200, 64800, 86400, 108000, 129600, 151200, 172800 ;
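
These values follow directly from the obs_seq_coverage namelist settings. The sketch below recomputes them in Python as a cross-check. It assumes the time axis is "days since 1601-1-1" on the Gregorian calendar, as reported for analysisT in the output file of this example. The variable names are illustrative.

```python
from datetime import datetime, timedelta

epoch = datetime(1601, 1, 1)            # origin of the time axis
analysis = datetime(2008, 6, 8, 18)     # first_analysis = last_analysis
forecast_length = 2 * 86400 + 0         # forecast_length_days, _seconds
interval = 21600                        # verification_interval_seconds

# Forecast leads in seconds, from 0 to the full forecast length.
forecast_lead = list(range(0, forecast_length + 1, interval))

# Verification times expressed in days since the epoch.
verification_times = [
    (analysis + timedelta(seconds=lead) - epoch) / timedelta(days=1)
    for lead in forecast_lead
]

print(len(forecast_lead), "forecast leads:", forecast_lead)
print("verification times:", verification_times)
```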

There is one analysis time, 9 forecast leads and 9 verification times. The analysis time is the same as the first verification time. The run-time output of obs_seq_verify and a dump of the resulting netCDF file follow:

[thoar@mirage2 work]$ ./obs_seq_verify |& tee my.verify.log
 Starting program obs_seq_verify
 Initializing the utilities module.
 Trying to log to unit           10
 Trying to open file dart_log.out

 Starting ... at YYYY MM DD HH MM SS =
                 2011  3  1 10  2 54
 Program obs_seq_verify

 set_nml_output Echo NML values to log file only
 Trying to open namelist log dart_log.nml

 -------------- ASSIMILATE_THESE_OBS_TYPES --------------
 -------------- EVALUATE_THESE_OBS_TYPES --------------

 find_ensemble_size:  opening obs_seq.forecast.2008060818
 location_mod: Ignoring vertical when computing distances; horizontal only
 find_ensemble_size: There are   50 ensemble members.

 fill_stations:  There are          221 stations of interest,
 fill_stations: ...  and              9 times    of interest.
 InitNetCDF:  METAR_U_10_METER_WIND_forecast.nc is fortran unit            5

 obs_seq_verify:  opening obs_seq.forecast.2008060818
 analysis            1 date is 2008 Jun 08 18:00:00

 index    6 is prior ensemble member      1
 index    8 is prior ensemble member      2
 index   10 is prior ensemble member      3
 index  100 is prior ensemble member     48
 index  102 is prior ensemble member     49
 index  104 is prior ensemble member     50

 QC index           1  NCEP QC index
 QC index           2  DART quality control

 Processing obs        10000  of        84691
 Processing obs        20000  of        84691
 Processing obs        30000  of        84691
 Processing obs        40000  of        84691
 Processing obs        50000  of        84691
 Processing obs        60000  of        84691
 Processing obs        70000  of        84691
 Processing obs        80000  of        84691

 METAR_U_10_METER_WIND dimlen            1  is            9
 METAR_U_10_METER_WIND dimlen            2  is           50
 METAR_U_10_METER_WIND dimlen            3  is            3
 METAR_U_10_METER_WIND dimlen            4  is            1
 METAR_U_10_METER_WIND dimlen            5  is          221
 METAR_U_10_METER_WIND dimlen            6  is            1

 obs_seq_verify:  doneDONEdoneDONE does not exist. Finishing up.

 Finished ... at YYYY MM DD HH MM SS =
                 2011  3  1 10  3  7

[thoar@mirage2 work]$ ncdump -h METAR_U_10_METER_WIND_forecast.nc
netcdf METAR_U_10_METER_WIND_forecast {
dimensions:
        analysisT = UNLIMITED ; // (1 currently)
        copy = 3 ;
        stations = 221 ;
        levels = 1 ;
        nmembers = 50 ;
        forecast_lead = 9 ;
        linelen = 129 ;
        nlines = 446 ;
        stringlength = 64 ;
        location = 3 ;
variables:
        double analysisT(analysisT) ;
                analysisT:long_name = "time of analysis" ;
                analysisT:units = "days since 1601-1-1" ;
                analysisT:calendar = "Gregorian" ;
                analysisT:missing_value = 0. ;
                analysisT:_FillValue = 0. ;
        int copy(copy) ;
                copy:long_name = "observation copy" ;
                copy:note1 = "1 == observation" ;
                copy:note2 = "2 == prior" ;
                copy:note3 = "3 == observation error variance" ;
                copy:explanation = "see CopyMetaData variable" ;
        int stations(stations) ;
                stations:long_name = "station index" ;
        int original_qc(analysisT, stations, forecast_lead) ;
                original_qc:long_name = "original QC value" ;
                original_qc:missing_value = -888888 ;
                original_qc:_FillValue = -888888 ;
        int dart_qc(analysisT, stations, forecast_lead) ;
                dart_qc:long_name = "DART QC value" ;
                dart_qc:explanation1 = "1 == prior evaluated only" ;
                dart_qc:explanation2 = "4 == forward operator failed" ;
                dart_qc:missing_value = -888888 ;
                dart_qc:_FillValue = -888888 ;
        double levels(levels) ;
                levels:long_name = "vertical level of observation" ;
        int nmembers(nmembers) ;
                nmembers:long_name = "ensemble member" ;
        int forecast_lead(forecast_lead) ;
                forecast_lead:long_name = "forecast lead time" ;
                forecast_lead:units = "seconds" ;
        double location(stations, location) ;
                location:description = "location coordinates" ;
                location:location_type = "loc3Dsphere" ;
                location:long_name = "threed sphere locations: lon, lat, vertical" ;
                location:storage_order = "Lon Lat Vertical" ;
                location:units = "degrees degrees which_vert" ;
        int which_vert(stations) ;
                which_vert:long_name = "vertical coordinate system code" ;
                which_vert:VERTISUNDEF = -2 ;
                which_vert:VERTISSURFACE = -1 ;
                which_vert:VERTISLEVEL = 1 ;
                which_vert:VERTISPRESSURE = 2 ;
                which_vert:VERTISHEIGHT = 3 ;
                which_vert:VERTISSCALEHEIGHT = 4 ;
        char namelist(nlines, linelen) ;
                namelist:long_name = "input.nml contents" ;
        char CopyMetaData(copy, stringlength) ;
                CopyMetaData:long_name = "copy quantity names" ;
        double METAR_U_10_METER_WIND(analysisT, stations, levels, copy, nmembers, forecast_lead) ;
                METAR_U_10_METER_WIND:long_name = "forecast variable quantities" ;
                METAR_U_10_METER_WIND:missing_value = -888888. ;
                METAR_U_10_METER_WIND:_FillValue = -888888. ;

// global attributes:
                :creation_date = "YYYY MM DD HH MM SS = 2011 03 01 10 03 00" ;
                :source = "$URL: https://svn-dares-dart.cgd.ucar.edu/DART/releases/Lanai/obs_sequence/obs_seq_verify.html $" ;
                :revision = "$Revision: 6380 $" ;
                :revdate = "$Date: 2013-08-05 17:47:11 -0600 (Mon, 05 Aug 2013) $" ;
                :obs_seq_file_001 = "obs_seq.forecast.2008060818" ;
}
[thoar@mirage2 work]$
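
The dimlen values in the run-time output appear in the opposite order from the dimensions of METAR_U_10_METER_WIND in the ncdump listing. That is expected: ncdump lists dimensions slowest-varying first (C order), while the Fortran interface reports them fastest-varying first. A short Python sketch of the correspondence, using the dimension lengths from this example:

```python
# Dimensions of METAR_U_10_METER_WIND in the order ncdump lists them.
ncdump_dims = [
    ("analysisT", 1),
    ("stations", 221),
    ("levels", 1),
    ("copy", 3),
    ("nmembers", 50),
    ("forecast_lead", 9),
]

# The Fortran view is the same list reversed.
fortran_dims = list(reversed(ncdump_dims))

for position, (name, length) in enumerate(fortran_dims, start=1):
    print(f"dimlen {position} is {length:4d}  ({name})")
```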






Routine Message Comment
obs_seq_verify 'namelist: obtype_string (xxxx) is unknown. change input.nml' The requested observation type does not match any supported observation type. If it is spelled correctly, perhaps you need to rerun preprocess to build the appropriate obs_def_mod.mod and obs_kind_mod.mod.
obs_seq_verify 'specify "obs_sequence_name" or "obs_sequence_list"' One of these namelist variables MUST be an empty string.
obs_seq_verify 'xxxxxx is not a known observation type.' One of the obs_of_interest namelist entries specifies an observation type that is not supported. Perhaps you need to rerun preprocess with support for the observation, or perhaps it is spelled incorrectly. All DART observation types are strictly uppercase.
obs_seq_verify 'need at least 1 qc and 1 observation copy' An observation sequence does not have all the metadata necessary. Cannot use "obs_seq.in"-class sequences.
obs_seq_verify 'num_copies ##### does not match #####' ALL observation sequences must contain the same 'copy' information. At some point it may be possible to mix "obs_seq.out"-class sequences with "obs_seq.final"-class sequences, but this seems like it can wait.
obs_seq_verify 'No location had at least ### reporting times.' The input selection criteria did not result in any locations that had observations at all of the required verification times.
set_required_times 'namelist: forecast length is not a multiple of the verification interval' The namelist settings for forecast_length_[days,seconds] and verification_interval_seconds do not make sense. Refer to the forecast time diagram.
set_required_times 'namelist: last analysis time is not a multiple of the verification interval' The namelist settings for first_analysis and last_analysis are not separated by a multiple of verification_interval_seconds. Refer to the forecast time diagram.
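
The two set_required_times conditions are simple divisibility checks and can be tested before a run. The following Python sketch is an illustration of those checks, not the Fortran routine itself. The function check_required_times is hypothetical, and returning the number of forecast leads is a convenience added for the example.

```python
from datetime import datetime


def check_required_times(forecast_length_days, forecast_length_seconds,
                         verification_interval_seconds,
                         first_analysis, last_analysis):
    """Raise ValueError if the time settings are inconsistent.

    Returns the number of forecast leads (including lead 0) otherwise.
    """
    length = forecast_length_days * 86400 + forecast_length_seconds
    if length % verification_interval_seconds != 0:
        raise ValueError("forecast length is not a multiple of "
                         "the verification interval")
    gap = int((last_analysis - first_analysis).total_seconds())
    if gap % verification_interval_seconds != 0:
        raise ValueError("last analysis time is not a multiple of "
                         "the verification interval")
    return length // verification_interval_seconds + 1


t0 = datetime(2008, 6, 8, 18)
nleads = check_required_times(2, 0, 21600, t0, t0)
print(nleads)
```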





Relax the restriction requiring 100.0% temporal coverage.
Sensibly require that we only verify against observations that DART can compute, i.e. the prior forward operator must complete successfully. Almost all cases of the operator failing are extrapolation issues: the observation is outside the domain.

Note that no attempt is made at checking the QC value of the candidate observations. One of the common problems is that the region definition does not mesh particularly well with the model domain and the DART forward operator fails because it would have to extrapolate (which is not allowed). Without checking the QC value, this can mean there are a lot of 'false positives'; observations that seemingly could be used to validate, but are actually just outside the model domain. I'm working on that ....


Terms of Use

DART software - Copyright 2004 - 2013 UCAR.
This open source software is provided by UCAR, "as is",
without charge, subject to all terms of use at

Contact: Tim Hoar
Revision: $Revision: 6380 $
Source: $URL: https://svn-dares-dart.cgd.ucar.edu/DART/releases/Lanai/obs_sequence/obs_seq_verify.html $
Change Date: $Date: 2013-08-05 17:47:11 -0600 (Mon, 05 Aug 2013) $
Change history:  try "svn log" or "svn diff"