The 'low-order' models supported in DART.
ikeda
The Ikeda model is a 2D chaotic map useful for visualizing data assimilation updating directly in state space. There are three parameters: a, b, and mu. The state is 2D, x = [X Y]. The equations are:
X(i+1) = 1 + mu * ( X(i) * cos( t ) - Y(i) * sin( t ) )
Y(i+1) = mu * ( X(i) * sin( t ) + Y(i) * cos( t ) ),
where
t = a - b / ( X(i)**2 + Y(i)**2 + 1 )
Note the system is time-discrete already, meaning there is no delta_t. The system stems from nonlinear optics (Ikeda 1979, Optics Communications). Interface written by Greg Lawson. Thanks Greg!
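As an illustration only, the map above can be iterated in a few lines of Python. This is a minimal sketch and not the DART model_mod code: the function names are hypothetical and the default parameter values (a = 0.4, b = 6.0, mu = 0.9) are assumptions chosen for the example, so check the model namelist for the values DART actually uses.

```python
import math


def ikeda_step(x, y, a=0.4, b=6.0, mu=0.9):
    """Advance the Ikeda map by one iteration.

    The default parameter values are illustrative assumptions.
    """
    t = a - b / (x * x + y * y + 1.0)
    x_new = 1.0 + mu * (x * math.cos(t) - y * math.sin(t))
    y_new = mu * (x * math.sin(t) + y * math.cos(t))
    return x_new, y_new


def ikeda_trajectory(x0, y0, n_steps, **params):
    """Return the list of states visited, starting with (x0, y0)."""
    states = [(x0, y0)]
    for _ in range(n_steps):
        states.append(ikeda_step(*states[-1], **params))
    return states
```

Because the map is already discrete in time, one call to ikeda_step is one model advance; no time step is involved.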
lorenz_63
This is the 3-variable model as described in:
Lorenz, E. N. 1963. Deterministic nonperiodic flow.
J. Atmos. Sci. 20, 130-141.
The system of equations is:
X' = -sigma*X + sigma*Y
Y' = -XZ + rX - Y
Z' = XY - bZ
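A minimal Python sketch of these equations follows. The parameter values sigma = 10, r = 28, b = 8/3 are the classic choices from the 1963 paper and are used here as assumed defaults. The fourth-order Runge-Kutta integrator is a choice made for this example and is not necessarily the time-stepping scheme DART uses; the function names are hypothetical.

```python
def lorenz63_tendency(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz 63 equations for state = (X, Y, Z)."""
    x, y, z = state
    return (-sigma * x + sigma * y,
            -x * z + r * x - y,
            x * y - b * z)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = f(state)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, 0.5 * dt))
    k3 = f(shifted(state, k2, 0.5 * dt))
    k4 = f(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))
```

A trajectory is obtained by calling rk4_step(lorenz63_tendency, state, dt) repeatedly with a small dt such as 0.01.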
lorenz_84
This model is based on:
Lorenz E. N., 1984: Irregularity: A fundamental property of the atmosphere.
Tellus, 36A, 98-110.
The system of equations is:
X' = -Y^2 - Z^2 - aX + aF
Y' = XY - bXZ - Y + G
Z' = bXY + XZ - Z
Where a, b, F, and G are the model parameters.
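The tendency can be written directly from the equations above. In this sketch the default values a = 0.25, b = 4, F = 8, G = 1.25 are commonly quoted settings for this model and are assumptions here, not values read from the DART namelist; the function name is hypothetical.

```python
def lorenz84_tendency(state, a=0.25, b=4.0, F=8.0, G=1.25):
    """Right-hand side of the Lorenz 84 equations for state = (X, Y, Z).

    The default parameter values are illustrative assumptions.
    """
    x, y, z = state
    dx = -y * y - z * z - a * x + a * F
    dy = x * y - b * x * z - y + G
    dz = b * x * y + x * z - z
    return dx, dy, dz
```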
9var
This model provides interesting off-attractor transients that behave something like gravity waves.
lorenz_96
This is the model we use to become familiar with new architectures, i.e.,
it is the one we use 'first'. It can be called as a subroutine or as a separate
executable. We can test this model both single-threaded and MPI-enabled.
Quoting from the Lorenz 1998 paper:
... the authors introduce a model consisting of 40 ordinary differential equations, with the dependent variables representing values of some atmospheric quantity at 40 sites spaced equally about a latitude circle. The equations contain quadratic, linear, and constant terms representing advection, dissipation, and external forcing. Numerical integration indicates that small errors (differences between solutions) tend to double in about 2 days. Localized errors tend to spread eastward as they grow, encircling the globe after about 14 days.
...
We have chosen a model with J variables, denoted by X1, ..., XJ; in most of our experiments we have let J = 40. The governing equations are:
dX_j/dt = (X_{j+1} - X_{j-2}) X_{j-1} - X_j + F    (1)
for j = 1, ..., J. To make Eq. (1) meaningful for all values of j we define X_{-1} = X_{J-1}, X_0 = X_J, and X_{J+1} = X_1, so that the variables form a cyclic chain, and may be looked at as values of some unspecified scalar meteorological quantity, perhaps vorticity or temperature, at J equally spaced sites extending around a latitude circle. Nothing will simulate the atmosphere's latitudinal or vertical extent.
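The cyclic boundary conditions in Eq. (1) map naturally onto modular indexing. The sketch below is illustrative and is not the DART implementation; the function name is hypothetical and the default forcing F = 8 is an assumed, commonly used value.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of Eq. (1) for a cyclic chain of J = len(x) variables.

    Python's negative indices supply the wrap-around for X_{j-1} and
    X_{j-2}; the modulo supplies it for X_{j+1}.
    """
    J = len(x)
    return [(x[(j + 1) % J] - x[j - 2]) * x[j - 1] - x[j] + forcing
            for j in range(J)]
```

A uniform state X_j = F is a fixed point of Eq. (1), which makes a convenient sanity check: the advection term vanishes and the dissipation exactly cancels the forcing.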
forced_lorenz_96
The forced_lorenz_96 model implements the standard L96 equations except that the forcing term, F, is added to the state vector and is assigned an independent value at each gridpoint. The result is a model that is twice as big as the standard L96 model. The forcing can be allowed to vary in time or can be held fixed so that the model looks like the standard L96 but with a state vector that includes the constant forcing term. An option is also included to add random noise to the forcing terms as part of the time tendency computation which can help in assimilation performance. If the random noise option is turned off (see namelist) the time tendency of the forcing terms is 0.
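To make the augmented state vector concrete, here is a minimal sketch in which the state holds J values of X followed by J values of F. The layout, the function name, and the noise_sd argument are assumptions for illustration; the real model is controlled through its namelist and may order or perturb the forcing differently.

```python
import random


def forced_lorenz96_tendency(state, noise_sd=0.0, rng=random):
    """Tendency for an augmented state [X_1..X_J, F_1..F_J].

    With noise_sd == 0 the forcing tendency is zero, so each F_j stays
    constant; otherwise Gaussian noise is used as the forcing tendency.
    """
    J = len(state) // 2
    x, f = state[:J], state[J:]
    dx = [(x[(j + 1) % J] - x[j - 2]) * x[j - 1] - x[j] + f[j]
          for j in range(J)]
    if noise_sd > 0.0:
        df = [rng.gauss(0.0, noise_sd) for _ in range(J)]
    else:
        df = [0.0] * J
    return dx + df
```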
lorenz_96_2scale
This is the Lorenz 96 2-scale model, documented in Lorenz (1995). It also has the option of the variant on the model from Smith (2001), which is invoked by setting local_y = .true. in the namelist. The time step, coupling, forcing, number of X variables, and the number of Ys per X are all specified in the namelist. Defaults are chosen depending on whether the Lorenz or Smith option is specified in the namelist. Lorenz is the default model. Interface written by Josh Hacker. Thanks Josh!
lorenz_04
The reference for these models is Lorenz, E.N., 2005: Designing
chaotic models. J. Atmos. Sci., 62, 1574-1587.
Model II is a single-scale model, similar to Lorenz 96, but with
spatial continuity in the waves. Model III is a two-scale
model. It is fundamentally different from the Lorenz 96 two-scale
model because of the spatial continuity and the fact that both scales
are projected onto a single variable of integration. The scale
separation is achieved by a spatial filter and is therefore not perfect
(i.e. there is leakage). The slow scale in model III is model II,
and thus model II is a deficient form of model III. The basic
equations are documented in Lorenz (2005) and also in the model_mod.f90
code. The user is free to choose model II or III with a Namelist
variable.
simple_advection
This model is on a periodic one-dimensional domain. A wind field is modeled using Burgers' equation with upstream semi-Lagrangian differencing. This diffusive numerical scheme is stable, and forcing is provided by adding random Gaussian noise to each wind grid variable independently at each timestep. An Eulerian option with centered-in-space differencing is also provided. The Eulerian differencing is both numerically unstable and subject to shock formation. However, it can sometimes be made stable in assimilation mode (see recent work by Majda and collaborators).
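The idea behind the upstream semi-Lagrangian option can be sketched as follows: for each grid point, trace the parcel back to its departure point using the local wind and interpolate the wind there. This is a simplified illustration with linear interpolation and no noise forcing, under the assumption of a uniform periodic grid; it is not the scheme as coded in DART, and the function name is hypothetical.

```python
import math


def semi_lagrangian_step(u, dt, dx):
    """One upstream semi-Lagrangian step for the wind on a periodic grid.

    u  : list of wind values at equally spaced grid points
    dt : time step
    dx : grid spacing
    """
    n = len(u)
    new_u = []
    for i in range(n):
        # Departure point, in grid-index units, of the parcel arriving at i.
        departure = i - u[i] * dt / dx
        left = math.floor(departure)
        weight = departure - left
        # Linear interpolation between the two neighbouring grid points,
        # with modulo arithmetic providing the periodic wrap-around.
        new_u.append((1.0 - weight) * u[left % n]
                     + weight * u[(left + 1) % n])
    return new_u
```

The interpolation is what makes the scheme diffusive: each new value is a weighted average of existing values, so extrema cannot grow.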
The 'high-order' models supported in DART.
In roughly the order they were supported by DART.
bgrid_solo
This is a dynamical core for B-grid dynamics using the Held-Suarez forcing. The resolution is configurable, and the entire model can be run as a subroutine. Status: supported.
pe2lyr
This model is a 2-layer, isentropic, primitive equation model on a sphere. Status: orphaned.
wrf
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. More people are using DART with WRF than any other model. Note: The actual WRF code is not distributed with DART. Status: supported.
cam
The Community Atmosphere Model (CAM) is the latest in a series of global atmosphere models developed at NCAR for the weather and climate research communities. CAM also serves as the atmospheric component of the Community Climate System Model (CCSM). Status: supported.
PBL_1d
The PBL model is a single column version of the WRF model. In this instance, the necessary portions of the WRF code are distributed with DART. Status: supported - but looking to be adopted.
MITgcm_annulus
The MITgcm annulus model as configured for this application within DART is a non-hydrostatic, rigid lid, C-grid, primitive equation model utilizing a cylindrical coordinate system. For detailed information about the MITgcm, see http://mitgcm.org Status: orphaned - and looking to be adopted.
rose
The rose model is for the stratosphere-mesosphere and was used by Tomoko Matsuo (now at CU-Boulder and NOAA) for research in the assimilation of observations of the Mesosphere Lower-Thermosphere (MLT). Note: the model code is not distributed with DART. Status: orphaned
MITgcm_ocean
The MIT ocean GCM version 'checkpoint59a' is the foundation of this implementation. It was modified by Ibrahim Hoteit (then of Scripps) to accommodate the interfaces needed by DART. Status: supported - but looking to be adopted.
am2
The FMS AM2 model is GFDL's atmosphere-only code using observed sea surface temperatures, time-varying radiative forcings (including volcanos) and time-varying land cover type. This version of AM2 (also called AM2.1) uses the finite-volume dynamical core (Lin 2004). Robert Pincus (CIRES/NOAA ESRL PSD1) and Patrick Hoffman (NOAA) wrote the DART interface and are currently using the system for research. Note: the model code is not distributed with DART. Status: supported
coamps
The DART interface was originally written and supported by Tim Whitcomb. The following model description is taken from the COAMPS overview web page:
The Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS) has been developed by the Marine Meteorology Division (MMD) of the Naval Research Laboratory (NRL). The atmospheric components of COAMPS, described below, are used operationally by the U.S. Navy for short-term numerical weather prediction for various regions around the world.
Note: the model code is not distributed with DART. Status: supported
POP
The Parallel Ocean Program (POP) comes in two variants. Los Alamos National Laboratory provides POP Version 2.0 which has been modified to run in the NCAR Community Climate System Model (CCSM) framework. As of November 2009, the CCSM-POP version is being run. The LANL-POP version is nearly supported - and some extensions useful for data assimilation in general have been proposed to LANL, who have agreed in principle to implement the changes. Fundamentally, the change is an additional restart option in which the first timestep after an assimilation is an Eulerian timestep (similar to a cold start). Note: the source code for POP is not distributed with DART. Status: actively being developed
Downloadable datasets for DART.
The code distribution was getting 'cluttered' with datasets,
boundary conditions, initial conditions, ... large files that were not
necessarily interesting to all people who downloaded the DART code.
Worse, subversion makes a local hidden copy of the original repository
contents, so the penalty for being large is doubled.
It just made sense to make all the large files available on
an 'as-needed' basis.
To keep the size of the DART distribution down we have a separate
www-site to provide some observation sequences, initial conditions,
and general datasets.
It is our intent to populate this site with some 'verification' results,
i.e. assimilations that were known to be 'good' and that should be fairly
reproducible - appropriate to test the DART installation.
Please be patient as I make time to populate this directory.
(yes, 'make', all my 'found' time is taken ...)
Observation sequences can be found at
http://www.image.ucar.edu/pub/DART/Obs_sets.
Verification experiments will be posted to
http://www.image.ucar.edu/pub/DART/VerificationData as soon as
I can get to it. These experiments will consist of initial conditions files
for testing different high-order models like CAM, WRF, POP ...
The low-order models are already distributed with verification data in
their work directories.
Useful bits for CAM can be found at
http://www.image.ucar.edu/pub/DART/CAM.
Useful bits for WRF can be found at
http://www.image.ucar.edu/pub/DART/WRF.
Creating initial conditions for DART
The idea is to generate an ensemble that has sufficient 'spread' to cover the range
of possible solutions. Insufficient spread can (and usually will) lead to poor
assimilations. Think 'filter divergence'.
Generating an ensemble of initial conditions can be done in lots of ways,
only a couple of which will be discussed here.
The first is to generate a single initial condition and let DART perturb it with
noise of a nature you specify to generate as many ensemble members as you like.
The second is to take some existing collection of model states and convert them to
DART initial conditions files and then use the restart_file_tool
to set the proper date in the files. The hard part is then coming up with the
original collection of model state(s).
Adding noise to a single model state
This method works well for some models, and fails miserably for others. As it stands, DART supplies a routine that can add Gaussian noise to every element of a state vector. This can cause some models to be numerically unstable. You can supply your own model_mod:pert_model_state() if you want a more sophisticated perturbation scheme.
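Conceptually, the default perturbation amounts to the following. This is a sketch of the idea only, not DART's routine: the function name, arguments, and seeding are hypothetical, and the noise here has the same standard deviation for every state element.

```python
import random


def perturb_state(state, ens_size, noise_sd, seed=0):
    """Build an ensemble by adding independent Gaussian noise to each
    element of a single state vector, once per ensemble member."""
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, noise_sd) for value in state]
            for _ in range(ens_size)]
```

For example, perturb_state([1.0, 2.0, 3.0], ens_size=20, noise_sd=0.01) returns 20 slightly different copies of the state. Noise that ignores the balances between variables is exactly what can make some models numerically unstable, which is the motivation for supplying a model-specific scheme.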
Using a collection of model states.
The important thing to remember is that the high-order models all come
with routines to convert a single model restart file (or the equivalent) to
a DART initial conditions file.
CAM has trans_pv_sv,
WRF has wrf_to_dart,
POP has pop_to_dart, etc. DART has the ability to read
a single file that contains initial conditions for all the ensemble members,
or a series of restart files - one for each ensemble member. Simply collect your
ensemble of restart files from your model and convert each of them to a DART
initial conditions file of the form filter_ics.####
where #### represents a 4 digit ensemble member counter.
That is, for a 50-member ensemble, they should be named:
filter_ics.0001 ... filter_ics.0050
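If you are scripting the conversion, the file names can be generated rather than typed. A small sketch, in which the helper name is hypothetical and only the filter_ics.#### pattern comes from the text above:

```python
def ensemble_file_names(ens_size, base="filter_ics"):
    """Return the names base.0001 ... base.NNNN, with a 4-digit
    zero-padded ensemble member counter starting at 1."""
    return [f"{base}.{member:04d}" for member in range(1, ens_size + 1)]
```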
Frequently, the initial ensemble of restart files is some climatological
collection. For CAM experiments, we usually start with N
different 'January 1' states ... from N different years.
The DART utility program restart_file_tool
is then run on each of these initial conditions files to set a consistent
date for all of the initial conditions.
Experience has shown that it takes less than a week of assimilating
4x/day to achieve a steady ensemble spread. WRF has its own method of
generating an initial ensemble. For that, it is best to contact
someone familiar with WRF/DART.
Initial conditions for the low-order models.
In general, there are 'restart files' for the low-order models that already exist as work/filter_ics. If you need more ensemble members than are supplied by these files, you can generate your own by adding noise to a single perfect_ics file. Simply specify
&filter_nml
   start_from_restart = .FALSE.,
   restart_in_file_name = "perfect_ics",
   ens_size = [whatever you want]
'perfect model' experiments or 'OSSE's.
All of the workshop and tutorial examples are 'perfect model' experiments.
The ability to compare against 'the truth' is great for exploring what does
and doesn't work during experimentation.
Every low-order model has a workshop_setup.csh that
compiles all the executables needed to run an OSSE, and then actually runs them.
The (empty) observation sequence files have been specified for what, where, and when
'observations' will be needed. This was done with create_obs_sequence
and create_fixed_network_seq. Run them yourself if you want to understand
exactly what it takes to create an observation sequence file devoid of the observation values.
The examples are very run-time-output verbose - great for understanding what is going on,
but just awful for performance. The run-time verbosity can be cut down when running larger models.
Some of the models have input values that are designed to produce poor (horrible, actually)
assimilations, and some perform quite nicely.
The DART Tutorial provides
instructions on how to modify the filter input and diagnose the results.
Use DART to run a 'perfect model' experiment.
Once a model is compatible with the DART facility, all of the
functionality of DART is available. This includes 'perfect model'
experiments (also called Observing System Simulation Experiments - OSSEs).
Essentially, the model is run forward from some state and, at predefined times,
the observation forward operator is applied to the model state to harvest
synthetic observations. This model trajectory is known as the 'true state'.
The synthetic observations are then used in an
assimilation experiment. The assimilation performance can then be evaluated
precisely because the true state (of the model) is known.
The basic steps to running an OSSE from within DART are:
- Run create_obs_sequence to generate the type of observation (and observation error) desired.
- Run create_fixed_network_seq to define the temporal distribution of the desired observations.
- Run perfect_model_obs to advance the model from a known initial condition - and harvest the 'observations' (with error) from the (known) true state of the model.
- Run filter to assimilate the 'observations'. Since the true model state is known, it is possible to evaluate the performance of the assimilation.
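The third step is the heart of a perfect model experiment, and its logic can be sketched in a few lines. The toy below is not DART code: it uses the Ikeda map as the model, an identity forward operator (every state variable is observed directly), and hypothetical function names and parameter values.

```python
import math
import random


def ikeda_step(state, a=0.4, b=6.0, mu=0.9):
    """One iteration of the Ikeda map (illustrative parameter values)."""
    x, y = state
    t = a - b / (x * x + y * y + 1.0)
    return (1.0 + mu * (x * math.cos(t) - y * math.sin(t)),
            mu * (x * math.sin(t) + y * math.cos(t)))


def harvest_synthetic_obs(initial_state, n_steps, obs_every,
                          obs_error_sd, seed=0):
    """Advance the model from a known state, recording the true
    trajectory and, every obs_every steps, noisy observations of it.

    Returns (truth, observations), where observations is a list of
    (step, observed_values) pairs.
    """
    rng = random.Random(seed)
    state = initial_state
    truth = [state]
    observations = []
    for step in range(1, n_steps + 1):
        state = ikeda_step(state)
        truth.append(state)
        if step % obs_every == 0:
            # Identity forward operator plus Gaussian observation error.
            obs = tuple(v + rng.gauss(0.0, obs_error_sd) for v in state)
            observations.append((step, obs))
    return truth, observations
```

An assimilation would then see only the observations list; the truth list is held back and used afterwards to score the result.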
An OSSE is explored in our Lorenz '96 example.
More information about creating observation sequence files for OSSEs
is available in
the observation sequence discussion section.
There is a set of Matlab® functions to help explore the assimilation
performance in state-space. The state-space functions are in the
DART/matlab directory.
Once you fire up Matlab® and have the netCDF support sorted out,
you will essentially follow the same procedure as that outlined in the
"Are the results correct?"
section. The most common functions are listed below. They each have a
help document available by issuing the help plot_bins command
at the Matlab® prompt (for example).
| plot_bins.m | plots the rank histograms for a set of state variables. |
| plot_total_err.m | plots the evolution of the error (un-normalized) and ensemble spread of all state variables. |
| plot_ens_mean_time_series.m | plots the evolution of a set of state variables - just the ensemble mean (and Truth, if available). |
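For readers unfamiliar with rank histograms, the quantity being binned is the rank of the true value within the sorted ensemble at each time. The sketch below illustrates the general technique only; it is not a translation of plot_bins.m, and details such as tie handling may differ. The function names are hypothetical.

```python
def rank_of_truth(ensemble_values, truth):
    """Number of ensemble members strictly below the truth.

    For N members the result lies between 0 and N inclusive.
    """
    return sum(1 for value in ensemble_values if value < truth)


def rank_histogram(ensembles, truths):
    """Count how often the truth falls in each of the N + 1 bins.

    ensembles : list of ensembles (one list of N values per time)
    truths    : list of true values, one per time
    """
    n_bins = len(ensembles[0]) + 1
    counts = [0] * n_bins
    for members, truth in zip(ensembles, truths):
        counts[rank_of_truth(members, truth)] += 1
    return counts
```

A roughly flat histogram suggests the truth is statistically indistinguishable from an ensemble member, while a U shape (truth often outside the ensemble) is a typical sign of insufficient spread.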
Configuring Matlab® to work for DART
Matlab® R2008b is the first version to have native netCDF support,
with its own syntax that would require a total rewrite of the DART interfaces
that would then be incompatible with older versions of Matlab®.
As of July 2009 DART uses the snctools interface functions
for netCDF - which rely solely on the
mexnc mex-file interface
and are available for 'all' versions of Matlab®.
The migration away from the inconsistent DART use of the
netcdf_toolbox and the CSIRO toolbox
matlab_netCDF_OPeNDAP (i.e. the 'getnc' function)
is virtually complete and greatly eases the installation of the
Matlab® netCDF support needed by DART.
Find your version of Matlab® (type 'ver' at the Matlab prompt) and visit
http://mexcdf.sourceforge.net/downloads
to get the right combination of mexnc and snctools.
The netcdf_toolbox subset of functions has
been deprecated by their developers, who are now supporting the snctools set of
functions. The netcdf_toolbox functions are still
distributed with snctools; you can install them if
you like, but they are not needed by DART.
You will need the 'normal' DART/matlab functions
available to Matlab, so be sure your MATLABPATH is set such that you
have access to get_copy_index as well as
nc_varget (which comes from snctools).
This generally means you will have to manipulate your MATLABPATH with
something like:
addpath('replace_this_with_the_real_path_to/DART/matlab')
addpath('replace_this_with_the_real_path_to/DART/diagnostics/matlab')
addpath('some_netcdf_install_dir/snctools')
addpath('some_netcdf_install_dir/mexnc','-BEGIN')
addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf')
addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf/nctype')
addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf/ncutility')
addpath('some_CSIRO_install_dir/matlab_netCDF_OPeNDAP')
which is precisely why I'm trying to shorten it. On my systems, I've bundled the first 4 commands into a script called ~/matlab/startup.m which is automatically run every time I start Matlab.