DART Classic Documentation


Requirements to install and run DART


DART is intended to be highly portable among Unix/Linux operating systems. At this point we have no plans to port DART to Windows machines.

Minimally, you will need:

  1. a Fortran90 compiler,
  2. the netCDF libraries built with the F90 interface,
  3. perl (just about any version),
  4. an environment that understands csh or tcsh, and
  5. the old unix standby ... make


If you want to use the DART diagnostic scripts, you will need Matlab® along with the mexnc and snctools toolboxes (appropriate for your version of Matlab®, but no later than r4024). There was a fundamental change in snctools with revision 4028 that essentially breaks some key components of the DART diagnostic routines. Consequently, we recommend that you use version 4024 of the snctools and mexnc toolboxes. See the section on Configuring Matlab® to read netCDF files. Even though Matlab® has native netCDF support starting with the R2008b release, you will still need these third-party toolboxes.



Requirements: a Fortran90 compiler


The DART software is written in standard Fortran 90, with no compiler-specific extensions. It has been compiled and run with several versions of each of the following: GNU Fortran Compiler ("gfortran") (free), Intel Fortran Compiler for Linux and OS X, IBM XL Fortran Compiler, Portland Group Fortran Compiler, Lahey Fortran Compiler, and Pathscale Fortran Compiler. Since recompiling the code is a necessity to experiment with different models, there are no binaries to distribute.


Requirements: the netCDF library


DART uses the netCDF self-describing data format for the results of assimilation experiments. These files have the extension .nc and can be read by a number of standard data analysis tools. DART also makes use of the F90 interface to the library, which is available through the netcdf.mod and typesizes.mod modules. IMPORTANT: different compilers create these modules with different "case" filenames, and sometimes they are not both installed into the expected directory. It is required that both modules be present. The normal place would be in the netcdf/include directory, as opposed to the netcdf/lib directory.

If the netCDF library does not exist on your system, you must build it (as well as the F90 interface modules). The library and instructions for building the library or installing from an RPM may be found at the netCDF home page: http://www.unidata.ucar.edu/software/netcdf/

NOTE: The location of the netCDF library, libnetcdf.a, and the locations of both netcdf.mod and typesizes.mod will be needed later. Depending on the version of netCDF and the build options selected, the fortran interface routines may be in a separate library named libnetcdff.a (note the 2 F's). In this case both libraries are required to build executables.
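
If you are not sure where these files live on your system, a quick search can locate them. This is just a sketch - the search paths below are assumptions and will vary by installation:

find /usr /usr/local -name netcdf.mod 2>/dev/null
find /usr /usr/local -name typesizes.mod 2>/dev/null
find /usr /usr/local -name 'libnetcdf*.a' 2>/dev/null

If the last command turns up libnetcdff.a, remember that the mkmf template (discussed below) will need to link both libraries.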






Installing DART: Download the distribution.


The DART source code is distributed through a Subversion server. Subversion (the client-side app is 'svn') allows you to compare your code tree with one on a remote server and selectively update individual files or groups of files - without losing any local modifications. I have a brief summary of the svn commands I use most posted at: http://www.image.ucar.edu/~thoar/svn_primer.html

The DART download site is: http://www.image.ucar.edu/DAReS/DART/DART_download.

svn has adopted the strategy that "disk is cheap". In addition to downloading the code, it downloads an additional copy of the code to store locally (in hidden .svn directories) as well as some administration files. This allows svn to perform some commands even when the repository is not available. It does double the size of the code tree ... so the download is something like 480MB -- pretty big. BUT - all future updates are (usually) just the differences, so they happen very quickly.
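
For reference, the checkout command has this general shape; the actual repository URL is listed on the download site, so the one below is only a placeholder:

svn checkout <repository_URL_from_the_download_site> my_path_to/DART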

If you follow the instructions on the download site, you should wind up with a directory named my_path_to/DART, which we call $DARTHOME. Compiling the code in this tree (as is usually the case) will require considerably more space.

If you cannot use svn, just let me know and I will create a tar file for you. svn is so superior that a tar file should be considered a last resort.


Installing DART: document conventions


All filenames look like this -- (typewriter font, green).
Program names look like this -- (italicized font, green).
user input looks like this -- (bold, magenta).

commands to be typed at the command line are contained in an indented gray box.

And the contents of a file are enclosed in a box with a border:

&hypothetical_nml
  obs_seq_in_file_name = "obs_seq.in",
  obs_seq_out_file_name = "obs_seq.out",
  init_time_days = 0,
  init_time_seconds = 0,
  output_interval = 1
&end

Installing DART


The entire installation process is summarized in the following steps:

  1. Determine which F90 compiler is available.
  2. Determine the location of (or build) the netCDF library.
  3. Download the DART software into the expected source tree.
  4. Modify certain DART files to reflect the available F90 compiler and location of the appropriate libraries.
  5. Build the executables.

If you can compile and run ONE of the low-order models, you should be able to compile and run ANY of the low-order models. For this reason, we can focus on the Lorenz '63 model. Consequently, the only directories with files to be modified to check the installation are usually: DART/mkmf and DART/models/lorenz_63/work.

We have tried to make the code as portable as possible, but we do not have access to all compilers on all platforms, so there are no guarantees. We are interested in your experience building the system, so please send us a note at dart @ ucar .edu


Customizing the build scripts -- Overview.


DART executable programs are constructed using two tools: mkmf and make.

mkmf requires two separate input files. The first is a 'template' file which specifies details of the commands required for a specific Fortran90 compiler and may also contain pointers to directories containing pre-compiled utilities required by the DART system. This template file will need to be modified to reflect your system. The second is a 'path_names' file; these files are supplied by DART and can be used without modification. An mkmf command is executed which uses the 'path_names' file and the mkmf template file to produce a Makefile which is subsequently used by the standard make utility.


Building and Customizing the 'mkmf.template' file


A series of templates for different compilers/architectures exists in the DART/mkmf/ directory and have names with extensions that identify the compiler, the architecture, or both. This is how you inform the build process of the specifics of your system. Our intent is that you copy one that is similar to your system into DART/mkmf/mkmf.template and customize it.
For the discussion that follows, knowledge of the contents of one of these templates (e.g., DART/mkmf/mkmf.template.intel.linux) is needed. Note that only the LAST lines are shown here; the head of the file is just a big comment (worth reading, btw).


...
MPIFC = mpif90
MPILD = mpif90
FC = ifort
LD = ifort
NETCDF = /usr/local
INCS = -I${NETCDF}/include
LIBS = -L${NETCDF}/lib -lnetcdf
FFLAGS = -O2 $(INCS)
LDFLAGS = $(FFLAGS) $(LIBS)

variable   value
FC         the Fortran compiler
LD         the name of the loader; typically the same as the Fortran compiler
NETCDF     the location of your netCDF installation containing netcdf.mod and typesizes.mod. Note that the value of the NETCDF variable will be used by the FFLAGS, LIBS, and LDFLAGS variables.
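
As a concrete illustration, a hypothetical customization for a system with gfortran and netCDF installed under /opt/netcdf might end up looking like this (a sketch only - your paths and flags will differ, and -lnetcdff is only needed if your netCDF build created the separate Fortran library mentioned earlier):

MPIFC = mpif90
MPILD = mpif90
FC = gfortran
LD = gfortran
NETCDF = /opt/netcdf
INCS = -I${NETCDF}/include
LIBS = -L${NETCDF}/lib -lnetcdff -lnetcdf
FFLAGS = -O2 $(INCS)
LDFLAGS = $(FFLAGS) $(LIBS)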

All DART programs are compiled the same way. Each model directory has a directory called work that has the components to build the executables. This is an example of how to build two programs for the lorenz_63 model: preprocess and obs_diag. It is the same for every other DART program.


cd DART/models/lorenz_63/work
./mkmf_preprocess
make
./preprocess
./mkmf_obs_diag
make

Building the Lorenz_63 DART project.


Currently, DART executables are built in a work subdirectory under the directory containing code for the given model. There are nine mkmf_xxxxxx files for the following programs:

program                    purpose
preprocess                 creates custom source code for just the observations of interest
create_obs_sequence        specify a (set of) observation characteristics taken by a particular (set of) instruments
create_fixed_network_seq   specify the temporal attributes of the observation sets
perfect_model_obs          spinup, generate "true state" for synthetic observation experiments, ...
filter                     perform experiments
obs_diag                   creates observation-space diagnostic files to be explored by the Matlab® scripts
obs_sequence_tool          manipulates observation sequence files. It is not generally needed (particularly for low-order models) but can be used to combine observation sequences or convert from ASCII to binary or vice versa. Since this is a specialty routine, we will not cover its use in this document.
restart_file_tool          useful for manipulating the time associated with the state contained in the restart file, converting from ASCII to binary, etc. Since this is not generally required for an introductory discussion, we're going to ignore this one, too.
wakeup_filter              only needed for MPI applications. We're starting at the beginning here, so we're going to ignore this one, too.

quickbuild.csh is a script that will build every executable in the directory. There is an optional argument that will additionally build the mpi-enabled versions, which is not the intent of this set of instructions.



cd DART/models/lorenz_63/work
./quickbuild.csh -nompi

The result (hopefully) is that eight executables now reside in your work directory. The most common problem is that the netCDF libraries and include files (particularly typesizes.mod) are not found. Find them, edit the DART/mkmf/mkmf.template to point to their location, recreate the Makefile, and try again. The next most common problem is from the gfortran compiler complaining about "undefined reference to `system_'" which is covered in the Platform-specific notes section.
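
The fix-and-retry cycle is usually as simple as the following (quickbuild.csh re-runs the mkmf_* scripts, so the edited template gets picked up; the editor is your choice):

vi DART/mkmf/mkmf.template      # point NETCDF at your installation
cd DART/models/lorenz_63/work
./quickbuild.csh -nompi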


Checking the build -- running something.


This section is not intended to provide any details of why we are doing what we are doing - this is sort of a 'black-box' test. The DART/models/lorenz_63/work directory is distributed with input files ready to run a simple experiment: use 20 ensemble members to assimilate observations 'every 6 hours' for 50 days. Simply run the programs perfect_model_obs and filter to generate results that can be compared against known results.

The initial conditions files and observations sequences are in ASCII, so there is no portability issue, but there may be some roundoff error in the conversion from ASCII to machine binary. With such a highly nonlinear model, small differences in the initial conditions will result in a different model trajectory. Your results should start out looking VERY SIMILAR and may diverge with time.


./perfect_model_obs
./filter

There should now be the following output files:

from executable "perfect_model_obs":
True_State.nc       a netCDF file containing the model trajectory ... the 'truth'
obs_seq.out         the observations (harvested as the true model was advanced) that were assimilated
perfect_restart     the final state of the model, in ASCII - the (true) model state at the 'end' of the experiment

from executable "filter":
Prior_Diag.nc       a netCDF file of the ensemble model states just before assimilation
Posterior_Diag.nc   a netCDF file of the ensemble model states just after assimilation
obs_seq.final       the observations that were assimilated as well as the ensemble mean estimates of the 'observations' - for comparison
filter_restart      the model states of the ensemble members at the 'end' of the experiment

from both:
dart_log.out        the run-time log of the experiment (this grows with each execution)

Note that Prior_Diag.nc and Posterior_Diag.nc contain values of the ensemble mean, ensemble spread, the individual ensemble members, and the inflation mean and standard deviation. The simplest way to check the results is with the Matlab® scripts distributed with DART.
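
Even without Matlab®, the ncdump utility that comes with netCDF gives a quick look at what a diagnostic file contains. For example:

ncdump -v CopyMetaData Prior_Diag.nc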

The DART/tutorial documents are an excellent way to kick the tires on DART and learn about ensemble data assimilation. If you've been able to build the Lorenz 63 model, you have correctly configured your mkmf.template and you can run anything in the tutorial.






Configuring Matlab® to read netCDF files.


Find your version of Matlab® (type 'ver' at the Matlab® prompt) and visit http://mexcdf.sourceforge.net/downloads to get the right combination of mexnc and snctools for your version of Matlab®. Follow their installation instructions. You can test if the install went well by trying to read a variable from any netCDF file (it doesn't have to be one created by DART -- see 'help nc_varget', for example).
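
As a quick smoke test, something like the following at the Matlab® prompt should return data rather than an error (the filename is just an example - any netCDF file with a 'time' variable will do):

>> x = nc_varget('True_State.nc', 'time');
>> whos x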

There was a fundamental change in snctools with revision 4028 that essentially breaks some key components of the DART diagnostic routines. Consequently, we recommend that you use version 4024 of the snctools and mexnc toolboxes. Here is how we download and install that particular version:


cd [wherever_you_install_toolboxes]
svn co -r 4024 svn://svn.code.sf.net/p/mexcdf/svn/mexnc/trunk  mexnc/4024 
svn co -r 4024 svn://svn.code.sf.net/p/mexcdf/svn/snctools/trunk  snctools/4024

Be sure your MATLABPATH is set such that you have access to nc_varget. This generally means you will have to do something like the following at the Matlab® prompt :


>> addpath('wherever_you_install_toolboxes/snctools/4024','-BEGIN')
>> addpath('wherever_you_install_toolboxes/mexnc/4024',   '-BEGIN')

It's very convenient to put these in your ~/matlab/startup.m so they get run every time Matlab® starts up. There is a very similar process for adding support for the DART diagnostic functions - read the section "Configuring Matlab® to work with DART".






Are the results correct? (requires Matlab® with netCDF support)


As noted in the section on checking the build, the ASCII-to-binary conversion can introduce roundoff error, and with such a highly nonlinear model, small differences in the initial conditions will result in a different model trajectory. Your results should start out looking VERY SIMILAR to ours and may diverge with time.

The simplest way to determine if the installation is successful is to run some of the functions we have available in DART/matlab/. Usually, we launch Matlab® from the DART/models/lorenz_63/work directory and use the Matlab® addpath command to make the DART/matlab/ functions available. In this case, we know the true state of the model that is consistent with the observations. The following Matlab® scripts compare the ensemble members with the truth and can calculate an error.


<unix_prompt> cd DART/models/lorenz_63/work
<unix_prompt> matlab
... (lots of startup messages I'm skipping)...
>> addpath ../../../matlab
>> plot_total_err

Input name of True State file; <cr> for True_State.nc
True_State.nc
Input name of prior or posterior diagnostics file;
<cr> for Prior_Diag.nc
Prior_Diag.nc
Comparing True_State.nc and
          Prior_Diag.nc

pinfo = 

                 model: 'Lorenz_63'
               def_var: 'state'
        num_state_vars: 3
       num_ens_members: 22
    time_series_length: 200
         min_state_var: 1
         max_state_var: 3
           min_ens_mem: 1
           max_ens_mem: 22
        def_state_vars: [1 2 3]
            truth_file: 'True_State.nc'
            diagn_file: 'Prior_Diag.nc'
            truth_time: [1 200]
            diagn_time: [1 200]

true state is copy   1
ensemble mean is copy   1
ensemble spread is copy   2
>> plot_ens_time_series


From the plot_ens_time_series graphic, you can see the individual green ensemble members getting more constrained as time evolves. If your figures look similar, that's pretty much what you're looking for, and you should feel confident that everything is working.






'Perfect Model' observation experiments, also known as Observing System Simulation Experiments (OSSEs)



Once a model is compatible with the DART facility, all of the functionality of DART is available. This includes 'perfect model' experiments (also called Observing System Simulation Experiments - OSSEs). Essentially, the model is run forward from a known state and, at predefined times, an observation forward operator is applied to the model state to harvest synthetic observations. This model trajectory is known as the 'true state'. The synthetic observations are then used in an assimilation experiment. The assimilation performance can then be evaluated precisely because the true state (of the model) is known. Since the same forward operator is used to harvest the synthetic observations as well as during the assimilation, the 'representativeness' error of the assimilation system is not an issue.

There are a set of Matlab® functions to help explore the assimilation performance in state-space as well as in observation-space. An OSSE is explored in depth in our Lorenz '96 example.


Perfect Model Experiment Overview


There are four fundamental steps to running an OSSE from within DART:

  1. Create a blueprint of what, where, and when you want observations. Essentially, define the metadata of the observations without actually specifying the observation values. The default filename for the blueprint is obs_seq.in . For simple cases, this is just running create_obs_sequence and create_fixed_network_seq; more in-depth solutions are presented below.
  2. Harvest the synthetic observations from the true model state by running perfect_model_obs to advance the model from a known initial condition and apply the forward observation operator based on the observation 'blueprint'. The observation will have noise added to it based on a draw from a random normal distribution with the variance specified in the observation blueprint. The noise-free 'truth' and the noisy 'observation' are recorded in the output observation sequence file. The entire time-history of the true state of the model is recorded in True_State.nc . The default filename for the 'observations' is obs_seq.out .
  3. Assimilate the synthetic observations with filter in the usual way. The prior/forecast states are preserved in Prior_Diag.nc and the posterior/analysis states are preserved in Posterior_Diag.nc . The default filename for the file with the observations and (optionally) the ensemble estimates of the observations is obs_seq.final .
  4. Check to make sure the assimilation was effective! Ensemble DA is not a black box! YOU must check to make sure you are making effective use of the information in the observations!


1. Defining the observation metadata - the 'blueprint'.


There are lots of ways to define an observation sequence that DART can use as input for a perfect model experiment. If you have observations in DART format already, you can simply use them. If you have observations in one of the formats already supported by the DART converters (check DART/observations/observations.html), convert them to a DART observation sequence. You may need to use the obs_sequence_tool to combine multiple observation sequence files into observation sequence files for the perfect model experiment. Any existing observation values and quality control information will be ignored by perfect_model_obs - only the time and location information are used. In fact, any and all existing observation and QC values will be removed.
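
For reference, combining existing files is controlled by the obs_sequence_tool namelist. A minimal sketch might look like the following - the filenames are hypothetical, and you should consult the obs_sequence_tool documentation for the exact namelist in your version:

&obs_sequence_tool_nml
   filename_seq = 'obs_seq.one', 'obs_seq.two',    the input observation sequence files
   filename_out = 'obs_seq.combined'               the merged blueprint
  /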

GENERAL COMMENT ABOUT THE INTERPLAY BETWEEN THE MODEL STOP/START FREQUENCY AND THE IMPACT ON THE OBSERVATION FREQUENCY: There is usually a very real difference between the dynamical timestep of the model and when it is safe to stop and restart the model. The assimilation window is (usually) required to be a multiple of the safe stop/start frequency. For example, an atmospheric model may have a dynamical timestep of a few seconds, but may be constrained such that it is only possible to stop/restart every hour. In this case, the assimilation window is a multiple of 3600 seconds. Trying to get observations at a finer timescale is not possible; we only have access to the model state when the model stops.

If you do not have an input observation sequence, it is simple to create one.

  1. Run create_obs_sequence to generate the blueprint for the types of observations and observation error variances for whatever locations are desired.
  2. Run create_fixed_network_seq to define the temporal distribution of the desired observations.

Both create_obs_sequence and create_fixed_network_seq interactively prompt you for the information they require. This can be quite tedious if you want a spatially dense set of observations. People have been known to actually write programs to generate the input to create_obs_sequence and simply pipe or redirect the information into the program. There are several examples of these in the models/bgrid_solo directory: column_rand.f90, id_set_def_stdin.f90, ps_id_stdin.f90, and ps_rand_local.f90 . Be advised that some observation types have different input requirements, so a 'one size fits all' program is a waste of time.
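
For example, if the answers to the prompts are captured in a text file (the filename here is hypothetical), the whole interactive session can be replayed non-interactively:

./create_obs_sequence < blueprint_answers.txt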


NOTE: only the observation kinds listed in the input.nml &obs_kind_nml:assimilate_these_obs_types and evaluate_these_obs_types will be available to the create_obs_sequence program.


DEVELOPERS TIP: You can specify 'identity' observations as input to perfect_model_obs. Identity observations are the model values AT the exact gridcell location; there is no interpolation at all - just a straight table-lookup. This can be useful as you develop your model interfaces; you can test many of the routines and scripts without having a working model_interpolate().


More information about creating observation sequence files for OSSE's is available in the observation sequence discussion section.


2. Generating the true state and harvesting the observation values - perfect_model_obs


perfect_model_obs reads the blueprint and an initial state and applies the appropriate forward observation operator for each and every observation in the current 'assimilation window'. If necessary, the model is advanced until the next set of observations is desired. When it has run out of observations or reached the stop time defined by the namelist control, the program stops and writes out a restart file, a diagnostic file, the observation sequence file, and a log file. This is fundamentally a single deterministic forecast for 'as long as it takes' to harvest all the observations.


perfect_restart (ASCII or binary)
   The DART model state at the end of the forecast. If the forecast needs to be lengthened, use this as the input. The format of the file is controlled by input.nml &assim_model_nml:write_binary_restart_files. The first record is the valid time of the model; the rest is the model state at that time.

True_State.nc (netCDF)
   The DART model state at every assimilation timestep. This file has but one 'copy' - the truth. Dump the copy metadata and the time with:
   ncdump -v time,CopyMetaData True_State.nc

obs_seq.out (ASCII or binary; a DART-specific linked list)
   This file has the observations - the result of the forward observation operator. This observation sequence file has two 'copies' of the observation: the noisy 'copy' and the noise-free 'copy'. The noisy copy is designated as the 'observation'; the noise-free copy is the truth. The observation-space diagnostic program obs_diag has special options for using the true copy instead of the observation copy. See obs_diag.html for details.

dart_log.out (ASCII)
   The run-time output of perfect_model_obs.

Each model may define the assimilation window differently, but conceptually, all the observations plus or minus half the assimilation window are considered to be simultaneous and a single model state provides the basis for all those observations. For example: if the blueprint requires temperature observations every 30 seconds, the initial model time is noon (12:00) and the assimilation window is 1 hour; all the observations from 11:30 to 12:30 will use the same state as input for the forward observation operator. The fact that you have a blueprint for observations every 30 seconds means a lot of those observations may have the same value (if they are in the same location).


perfect_model_obs uses the input.nml for its control. A subset of the namelists and variables of particular interest for perfect_model_obs are summarized here. Each namelist is fully described by the corresponding module document.


&perfect_model_obs_nml  <--- link to the full namelist description!
   ...
   start_from_restart    = .true.            usually, but not always
   output_restart        = .true.            sure, why not
   init_time_days        = -1                negative means use the time in ...
   init_time_seconds     = -1                the 'restart_in_file_name' file
   first_obs_days        = -1                negative means start at the first time in ...
   first_obs_seconds     = -1                the 'obs_seq_in_file_name' file.
   last_obs_days         = -1                negative means to stop with the last ...
   last_obs_seconds      = -1                observation in the file.
   restart_in_file_name  = "perfect_ics"
   restart_out_file_name = "perfect_restart"
   obs_seq_in_file_name  = "obs_seq.in"
   obs_seq_out_file_name = "obs_seq.out"
   output_interval       = 1
   async                 = 0                 totally depends on the model
   adv_ens_command       = "./advance_ens.csh"       depends on the model
  /

&obs_sequence_nml
   write_binary_obs_sequence = .false.       .false. will create ASCII - easy to check.
  /

&obs_kind_nml
   ...
   assimilate_these_obs_types = 'RADIOSONDE_TEMPERATURE',
   ...                                       list all the synthetic observation
   ...                                       types you want
  /

&assim_model_nml
   ...
   write_binary_restart_files = .true.       your choice
  /

&model_nml
   ...
   time_step_days = 0,                       some models call this 'assimilation_period_days'
   time_step_seconds = 3600                  some models call this 'assimilation_period_seconds'
                                             use whatever value you want
  /

&utilities_nml
   ...
   termlevel   = 1                           your choice
   logfilename = 'dart_log.out'              your choice
  /

Executing perfect_model_obs


Since perfect_model_obs generally requires advancing the model, and the model may use MPI or require special ancillary files or forcing files or ..., it is not possible to provide a single example that will cover all possibilities. The subroutine-callable models (i.e. the low-order models) can run perfect_model_obs very simply:


./perfect_model_obs


3. Performing the assimilation experiment - filter


This step is done with the program filter, which also uses input.nml for input and run-time control. A successful assimilation will depend on many things: an appropriate initial ensemble, monitoring and perhaps correcting the ensemble spread, localization, etc. It is simply not possible to design a one-size-fits-all system that will work for all cases. It is critically important to analyze the results of the assimilation and explore ways of making the assimilation more effective. The DART tutorial and the DART_LAB exercises are an invaluable resource to learn and understand how to determine the effectiveness of, and improve upon, an assimilation experiment. The concepts learned with the low-order models are directly applicable to the most complicated models.

It is important to remember that if filter 'terminates normally', it does not necessarily mean the assimilation was effective!

filter produces two state-space output diagnostic files (Prior_Diag.nc and Posterior_Diag.nc) which contain values of the ensemble mean, ensemble spread, perhaps the inflation values, and (optionally) ensemble members for the duration of the experiment. filter also creates an observation sequence file that contains the input observation information as well as the prior and posterior ensemble mean estimates of that observation, the prior and posterior ensemble spread for that observation, and (optionally), the actual prior and posterior ensemble estimates of that observation. Rather than replicate the observation metadata for each of these, the single metadata is shared for all these 'copies' of the observation. See An overview of the observation sequence for more detail. filter also produces a run-time log file that can greatly aid in determining what went wrong if the program terminates abnormally.

A very short description of some of the most important namelist variables is presented here. Basically, I am only discussing the settings necessary to get filter to run. I can guarantee these settings WILL NOT generate the BEST assimilation. Again, see the module documentation for a full description of each namelist.


&filter_nml  <--- link to the full namelist description!
   async                    = 0
   adv_ens_command          = "./advance_model.csh"
   ens_size                 = 40                 something ≥ 20, please
   start_from_restart       = .false.            .false. requires reading available input files
   output_restart           = .true.
   obs_sequence_in_name     = "obs_seq.out"      whatever you called the output from perfect_model_obs
   obs_sequence_out_name    = "obs_seq.final"
   restart_in_file_name     = "filter_ics"       the file (or base file name) of your ensemble
   restart_out_file_name    = "filter_restart"
   init_time_days           = -1                 the time in the restart file is correct
   init_time_seconds        = -1
   first_obs_days           = -1                 same interpretation as with perfect_model_obs
   first_obs_seconds        = -1
   last_obs_days            = -1                 same interpretation as with perfect_model_obs
   last_obs_seconds         = -1
   num_output_state_members = 10                 # of FULL DART model states to put in state-space output files
   num_output_obs_members   = 40                 # of ensemble member 'copies' of observation to save
   output_interval          = 1
   num_groups               = 1
   input_qc_threshold       =  4.0
   outlier_threshold        =  3.0               Observation rejection criterion!
   output_forward_op_errors = .false.
   output_timestamps        = .false.
   output_inflation         = .true.

   inf_flavor               = 0,                       0                  0 is 'do not inflate'
   inf_start_from_restart   = .false.,                 .false.
   inf_output_restart       = .false.,                 .false.
   inf_deterministic        = .true.,                  .true.
   inf_in_file_name         = 'not_initialized',       'not_initialized'
   inf_out_file_name        = 'not_initialized',       'not_initialized'
   inf_diag_file_name       = 'not_initialized',       'not_initialized'
   inf_initial              = 1.0,                     1.0
   inf_sd_initial           = 0.6,                     0.0
   inf_damping              = 0.9,                     0.0
   inf_lower_bound          = 1.0,                     1.0
   inf_upper_bound          = 1000000.0,               1000000.0
   inf_sd_lower_bound       = 0.6,                     0.0
  /

&ensemble_manager_nml
   single_restart_file_in  = .false.       .false. means each ensemble member is in a separate file
   single_restart_file_out = .false.
   perturbation_amplitude  = 0.2           not used if 'single_restart_file_in' is .false.
  /

&assim_tools_nml
   filter_kind             = 1             1 is EAKF, 2 is EnKF ...
   cutoff                  = 0.2           this is your localization - units depend on type of 'location_mod'
  /

&obs_kind_nml
   assimilate_these_obs_types = 'RAW_STATE_VARIABLE'    Again, use a list ...
  /

&model_nml
   assimilation_period_days    = 0                      the assimilation interval is up to you
   assimilation_period_seconds = 3600
  /


Since num_output_state_members is nonzero, the state vector is output at every time for which there are observations; Prior_Diag.nc and Posterior_Diag.nc then contain values for that many ensemble members at each assimilation time. Once the namelist is set, execute filter to integrate the ensemble forward, with the final ensemble state written to filter_restart. Copy the perfect_model_obs restart file perfect_restart (the 'true state') to perfect_ics, and the filter restart file filter_restart to filter_ics, so that future assimilation experiments can be initialized from these spun-up states.
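
Those copies are just (using the default filenames from the namelists above):

cp perfect_restart perfect_ics
cp filter_restart  filter_ics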


mpirun ./filter        -OR-

mpirun.lsf ./filter    -OR-

./filter               -OR-

however YOU run filter on your system!



4. ASSESS THE PERFORMANCE!


All the concepts of spread, rmse, rank histograms that were taught in the DART tutorial and in DART_LAB should be applied now. Try the techniques described in the Did my experiment work? section. The 'big three' state-space diagnostics are repeated here because they are so important. The first two require the True_State.nc .


plot_bins.m                  plots the rank histograms for a set of state variables. This requires you to have all or most of the ensemble members available in the Prior_Diag.nc or Posterior_Diag.nc files.
plot_total_err.m             plots the evolution of the error (un-normalized) and ensemble spread of all state variables.
plot_ens_mean_time_series.m  plots the evolution of a set of state variables - just the ensemble mean (and Truth, if available). plot_ens_time_series.m is actually a better choice if you can afford to write all/most of the ensemble members to the Prior_Diag.nc and Posterior_Diag.nc files.
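
These are invoked the same way plot_total_err was in the earlier transcript - from the work directory:

>> addpath ../../../matlab
>> plot_bins
>> plot_ens_time_series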

DON'T FORGET ABOUT THE OBSERVATION-SPACE DIAGNOSTICS!







The Data Assimilation Research Testbed - DART Tutorial


DART comes with an extensive set of tutorial materials, working models of several different levels of complexity, and data to be assimilated. It has been used in several multi-day workshops and can be used as the basis to teach a section on Data Assimilation. Download the DART software distribution and look in the tutorial subdirectory for the pdf and framemaker source for each of the 22 tutorial sections. The most recent versions of the tutorial are always provided below.

Browsing the tutorial is worth the effort.
Taking the tutorial is FAR better!


  1. Section 1 [pdf] Filtering For a One Variable System.
  2. Section 2 [pdf] The DART Directory Tree.
  3. Section 3 [pdf] DART Runtime Control and Documentation.
  4. Section 4 [pdf] How should observations of a state variable impact an unobserved state variable? Multivariate assimilation.
  5. Section 5 [pdf] Comprehensive Filtering Theory: Non-Identity Observations and the Joint Phase Space.
  6. Section 6 [pdf] Other Updates for An Observed Variable.
  7. Section 7 [pdf] Some Additional Low-Order Models.
  8. Section 8 [pdf] Dealing with Sampling Error.
  9. Section 9 [pdf] More on Dealing with Error; Inflation.
  10. Section 10 [pdf] Regression and Non-linear Effects.
  11. Section 11 [pdf] Creating DART Executables.
  12. Section 12 [pdf] Adaptive Inflation.
  13. Section 13 [pdf] Hierarchical Group Filters and Localization.
  14. Section 14 [pdf] DART Observation Quality Control.
  15. Section 15 [pdf] DART Experiments: Control and Design.
  16. Section 16 [pdf] Diagnostic Output.
  17. Section 17 [pdf] Creating Observation Sequences.
  18. Section 18 [pdf] Lost in Phase Space: The Challenge of Not Knowing the Truth.
  19. Section 19 [pdf] DART-Compliant Models and Making Models Compliant.
  20. Section 20 [pdf] Model Parameter Estimation.
  21. Section 21 [pdf] Observation Types and Observing System Design.
  22. Section 22 [pdf] Parallel Algorithm Implementation.
  23. Carbon Tutorial [pdf] A Simple 1D Advection Model.

Please suggest ways for us to improve DART.
