
    The 'low-order' models supported in DART.


    ikeda

    The Ikeda model is a 2D chaotic map useful for visualizing data assimilation updating directly in state space. There are three parameters: a, b, and mu. The state is 2D, x = [X Y]. The equations are:

    X(i+1) = 1 + mu * ( X(i) * cos( t ) - Y(i) * sin( t ) )
    Y(i+1) =     mu * ( X(i) * sin( t ) + Y(i) * cos( t ) ),
    

    where

    t = a - b / ( X(i)**2 + Y(i)**2 + 1 )
    

    Note the system is time-discrete already, meaning there is no delta_t. The system stems from nonlinear optics (Ikeda 1979, Optics Communications). Interface written by Greg Lawson. Thanks Greg!
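
    The map above can be iterated directly. The following is a minimal Python sketch, not the DART Fortran code. The parameter values (a = 0.4, b = 6.0, mu = 0.83) and the function names are assumptions chosen for illustration; the text above only names the parameters.

```python
import math

# Assumed parameter values for illustration only.
A, B, MU = 0.4, 6.0, 0.83


def ikeda_step(x, y, a=A, b=B, mu=MU):
    """Advance the Ikeda map by one iteration (there is no delta_t)."""
    t = a - b / (x**2 + y**2 + 1.0)
    x_new = 1.0 + mu * (x * math.cos(t) - y * math.sin(t))
    y_new = mu * (x * math.sin(t) + y * math.cos(t))
    return x_new, y_new


def ikeda_orbit(x0, y0, n_steps):
    """Return the list of states visited, starting with (x0, y0)."""
    orbit = [(x0, y0)]
    for _ in range(n_steps):
        orbit.append(ikeda_step(*orbit[-1]))
    return orbit
```

    Because the rotation preserves length and mu < 1 in this sketch, every iterate satisfies |x(i+1)| <= 1 + mu * |x(i)|, so an orbit started inside radius 1 / (1 - mu) stays inside it.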


    lorenz_63

    This is the 3-variable model as described in: Lorenz, E. N. 1963. Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130-141.
    The system of equations is:

    X' = -sigma*X + sigma*Y
    Y' = -XZ + rX - Y
    Z' =  XY -bZ
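
    As a hedged illustration, the tendencies above can be written as a small Python function. The default values sigma = 10, r = 28, b = 8/3 are the classic choices from Lorenz (1963) and are assumed here; the function name is hypothetical and this is not the DART implementation.

```python
import math


def lorenz_63_tendency(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Return (X', Y', Z') for the 3-variable Lorenz (1963) system."""
    x, y, z = state
    dx = -sigma * x + sigma * y
    dy = -x * z + r * x - y
    dz = x * y - b * z
    return (dx, dy, dz)


def lorenz_63_fixed_point(r=28.0, b=8.0 / 3.0):
    """One of the two non-trivial steady states, useful as a sanity check."""
    c = math.sqrt(b * (r - 1.0))
    return (c, c, r - 1.0)
```

    The tendency should vanish at the origin and at the steady state returned by lorenz_63_fixed_point, which gives a quick check that the signs were transcribed correctly.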
    

    lorenz_84

    This model is based on: Lorenz, E. N., 1984: Irregularity: A fundamental property of the atmosphere. Tellus 36A, 98-110.
    The system of equations is:

    X' = -Y^2 - Z^2  - aX  + aF
    Y' =  XY  - bXZ  - Y   + G
    Z' = bXY  +  XZ  - Z
    

    Where a, b, F, and G are the model parameters.
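
    A minimal Python sketch of these tendencies follows. The default parameter values (a = 0.25, b = 4, F = 8, G = 1.25) are assumptions for illustration, and the function name is hypothetical.

```python
def lorenz_84_tendency(state, a=0.25, b=4.0, F=8.0, G=1.25):
    """Return (X', Y', Z') for the Lorenz (1984) system."""
    x, y, z = state
    dx = -y**2 - z**2 - a * x + a * F
    dy = x * y - b * x * z - y + G
    dz = b * x * y + x * z - z
    return (dx, dy, dz)
```

    At the state (F, 0, 0) the X and Z tendencies vanish and the Y tendency reduces to G, which is a convenient check on the implementation.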


    9var

    This model provides interesting off-attractor transients that behave something like gravity waves.


    lorenz_96

    This is the model we use to become familiar with new architectures, i.e., it is the one we use 'first'. It can be called as a subroutine or as a separate executable. We can test this model both single-threaded and mpi-enabled.

    Quoting from the Lorenz 1998 paper:

    ... the authors introduce a model consisting of 40 ordinary differential equations, with the dependent variables representing values of some atmospheric quantity at 40 sites spaced equally about a latitude circle. The equations contain quadratic, linear, and constant terms representing advection, dissipation, and external forcing. Numerical integration indicates that small errors (differences between solutions) tend to double in about 2 days. Localized errors tend to spread eastward as they grow, encircling the globe after about 14 days.
    ...
    We have chosen a model with J variables, denoted by X_1, ..., X_J; in most of our experiments we have let J = 40. The governing equations are:
    dX_j/dt = (X_{j+1} - X_{j-2}) X_{j-1} - X_j + F         (1)
    
    for j = 1, ..., J. To make Eq. (1) meaningful for all values of j we define X_{-1} = X_{J-1}, X_0 = X_J, and X_{J+1} = X_1, so that the variables form a cyclic chain, and may be looked at as values of some unspecified scalar meteorological quantity, perhaps vorticity or temperature, at J equally spaced sites extending around a latitude circle. Nothing will simulate the atmosphere's latitudinal or vertical extent.
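
    The cyclic indexing in Eq. (1) is easy to get wrong, so here is a minimal Python sketch of the tendency. Indices are 0-based and wrapped with the modulo operator, which is equivalent to the cyclic definitions quoted above. The default forcing F = 8 is an assumed, commonly used value, and the function name is hypothetical.

```python
def lorenz_96_tendency(x, forcing=8.0):
    """Return dX_j/dt for every site j on the cyclic chain."""
    n = len(x)
    return [
        (x[(j + 1) % n] - x[(j - 2) % n]) * x[(j - 1) % n] - x[j] + forcing
        for j in range(n)
    ]
```

    A uniform state with X_j = F everywhere is a steady state: the advection term cancels and the dissipation balances the forcing.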


    forced_lorenz_96

    The forced_lorenz_96 model implements the standard L96 equations except that the forcing term, F, is added to the state vector and is assigned an independent value at each gridpoint. The result is a model that is twice as big as the standard L96 model. The forcing can be allowed to vary in time or can be held fixed so that the model looks like the standard L96 but with a state vector that includes the constant forcing term. An option is also included to add random noise to the forcing terms as part of the time tendency computation which can help in assimilation performance. If the random noise option is turned off (see namelist) the time tendency of the forcing terms is 0.
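
    The idea of carrying the forcing in the state vector can be sketched in Python as follows. The layout (all X values first, then one F per gridpoint), the noise_sd argument, and the function name are assumptions for illustration; the actual DART state ordering and namelist names may differ.

```python
import random


def forced_lorenz_96_tendency(state, noise_sd=0.0, rng=None):
    """Tendency for a state laid out as [X_1..X_J, F_1..F_J] (assumed layout)."""
    if len(state) % 2 != 0:
        raise ValueError("state must hold one forcing value per X variable")
    n = len(state) // 2
    x, f = state[:n], state[n:]
    # Standard L96 tendency, but with an independent forcing at each gridpoint.
    dx = [
        (x[(j + 1) % n] - x[(j - 2) % n]) * x[(j - 1) % n] - x[j] + f[j]
        for j in range(n)
    ]
    if noise_sd > 0.0:
        rng = rng or random.Random()
        # Random noise makes the forcing part of the state evolve in time.
        df = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    else:
        # With the noise turned off the forcing tendency is exactly zero.
        df = [0.0] * n
    return dx + df
```

    With noise_sd left at zero the forcing values never change, so the X half of the state behaves like the standard L96 model with that forcing.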


    lorenz_96_2scale

    This is the Lorenz 96 2-scale model, documented in Lorenz (1995). It also has the option of the variant on the model from Smith (2001), which is invoked by setting local_y = .true. in the namelist. The time step, coupling, forcing, number of X variables, and the number of Ys per X are all specified in the namelist. Defaults are chosen depending on whether the Lorenz or Smith option is specified in the namelist. Lorenz is the default model. Interface written by Josh Hacker. Thanks Josh!


    lorenz_04

    The reference for these models is Lorenz, E.N., 2005: Designing chaotic models. J. Atmos. Sci. 62, 1574-1587.
    Model II is a single-scale model, similar to Lorenz 96, but with spatial continuity in the waves. Model III is a two-scale model. It is fundamentally different from the Lorenz 96 two-scale model because of the spatial continuity and the fact that both scales are projected onto a single variable of integration. The scale separation is achieved by a spatial filter and is therefore not perfect (i.e. there is leakage). The slow scale in model III is model II, and thus model II is a deficient form of model III. The basic equations are documented in Lorenz (2005) and also in the model_mod.f90 code. The user is free to choose model II or III with a namelist variable.


    simple_advection

    This model is on a periodic one-dimensional domain. A wind field is modeled using Burgers' equation with an upstream semi-Lagrangian differencing. This diffusive numerical scheme is stable, and forcing is provided by adding random Gaussian noise to each wind grid variable independently at each timestep. An Eulerian option with centered-in-space differencing is also provided. The Eulerian differencing is both numerically unstable and subject to shock formation. However, it can sometimes be made stable in assimilation mode (see recent work by Majda and collaborators).




    The 'high-order' models supported in DART.

    The models are listed in roughly the order in which DART came to support them.


    bgrid_solo

    This is a dynamical core for B-grid dynamics using the Held-Suarez forcing. The resolution is configurable, and the entire model can be run as a subroutine. Status: supported.


    pe2lyr

    This model is a 2-layer, isentropic, primitive equation model on a sphere. Status: orphaned.


    wrf

    The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. More people are using DART with WRF than any other model. Note: The actual WRF code is not distributed with DART. Status: supported.


    cam

    The Community Atmosphere Model (CAM) is the latest in a series of global atmosphere models developed at NCAR for the weather and climate research communities. CAM also serves as the atmospheric component of the Community Climate System Model (CCSM). Status: supported.


    PBL_1d

    The PBL model is a single column version of the WRF model. In this instance, the necessary portions of the WRF code are distributed with DART. Status: supported - but looking to be adopted.


    MITgcm_annulus

    The MITgcm annulus model as configured for this application within DART is a non-hydrostatic, rigid lid, C-grid, primitive equation model utilizing a cylindrical coordinate system. For detailed information about the MITgcm, see http://mitgcm.org. Status: orphaned - and looking to be adopted.


    rose

    The rose model is for the stratosphere-mesosphere and was used by Tomoko Matsuo (now at CU-Boulder and NOAA) for research in the assimilation of observations of the Mesosphere Lower-Thermosphere (MLT). Note: the model code is not distributed with DART. Status: orphaned


    MITgcm_ocean

    The MIT ocean GCM version 'checkpoint59a' is the foundation of this implementation. It was modified by Ibrahim Hoteit (then of Scripps) to accommodate the interfaces needed by DART. Status: supported - but looking to be adopted.


    am2

    The FMS AM2 model is GFDL's atmosphere-only code using observed sea surface temperatures, time-varying radiative forcings (including volcanos) and time-varying land cover type. This version of AM2 (also called AM2.1) uses the finite-volume dynamical core (Lin 2004). Robert Pincus (CIRES/NOAA ESRL PSD1) and Patrick Hoffman (NOAA) wrote the DART interface and are currently using the system for research. Note: the model code is not distributed with DART. Status: supported


    coamps

    The DART interface was originally written and supported by Tim Whitcomb. The following model description is taken from the COAMPS overview web page:

    The Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS) has been developed by the Marine Meteorology Division (MMD) of the Naval Research Laboratory (NRL). The atmospheric components of COAMPS, described below, are used operationally by the U.S. Navy for short-term numerical weather prediction for various regions around the world.

    Note: the model code is not distributed with DART. Status: supported


    POP

    The Parallel Ocean Program (POP) comes in two variants. Los Alamos National Laboratory provides POP Version 2.0 which has been modified to run in the NCAR Community Climate System Model (CCSM) framework. As of November 2009, the CCSM-POP version is being run. The LANL-POP version is nearly supported - and some extensions useful for data assimilation in general have been proposed to LANL, who have agreed in principle to implement the changes. Fundamentally, the change is an additional restart option in which the first timestep after an assimilation is an Eulerian timestep (similar to a cold start). Note: the source code for POP is not distributed with DART. Status: actively being developed




    Downloadable datasets for DART.


    The code distribution was getting 'cluttered' with datasets, boundary conditions, initial conditions, ... large files that were not necessarily interesting to all people who downloaded the DART code. Worse, subversion makes a local hidden copy of the original repository contents, so the penalty for being large is doubled. It just made sense to make all the large files available on an 'as-needed' basis.

    To keep the size of the DART distribution down we have a separate www-site to provide some observation sequences, initial conditions, and general datasets. It is our intent to populate this site with some 'verification' results, i.e. assimilations that were known to be 'good' and that should be fairly reproducible - appropriate to test the DART installation.

    Please be patient as I make time to populate this directory. (yes, 'make', all my 'found' time is taken ...)
    Observation sequences can be found at http://www.image.ucar.edu/pub/DART/Obs_sets.

    Verification experiments will be posted to http://www.image.ucar.edu/pub/DART/VerificationData as soon as I can get to it. These experiments will consist of initial conditions files for testing different high-order models like CAM, WRF, POP ...
    The low-order models are already distributed with verification data in their work directories.

    Useful bits for CAM can be found at http://www.image.ucar.edu/pub/DART/CAM.
    Useful bits for WRF can be found at http://www.image.ucar.edu/pub/DART/WRF.




    Creating initial conditions for DART


    The idea is to generate an ensemble that has sufficient 'spread' to cover the range of possible solutions. Insufficient spread can (and usually will) lead to poor assimilations. Think 'filter divergence'.

    Generating an ensemble of initial conditions can be done in lots of ways, only a couple of which will be discussed here. The first is to generate a single initial condition and let DART perturb it with noise of a nature you specify to generate as many ensemble members as you like. The second is to take some existing collection of model states and convert them to DART initial conditions files and then use the restart_file_tool to set the proper date in the files. The hard part is then coming up with the original collection of model state(s).


    Adding noise to a single model state

    This method works well for some models, and fails miserably for others. As it stands, DART supplies a routine that can add Gaussian noise to every element of a state vector. This can cause some models to be numerically unstable. You can supply your own model_mod:pert_model_state() if you want a more sophisticated perturbation scheme.
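
    To make the idea concrete, here is a minimal Python sketch of building an ensemble by adding independent Gaussian noise to every element of one state vector. It only illustrates the concept; it is not DART's routine, and the function names, the noise_sd argument, and the seed handling are assumptions.

```python
import random
import statistics


def perturb_state(state, noise_sd, ens_size, seed=None):
    """Return ens_size noisy copies of state (one list per ensemble member)."""
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, noise_sd) for value in state]
        for _ in range(ens_size)
    ]


def ensemble_spread(ensemble):
    """Sample standard deviation across members, one value per state element."""
    return [statistics.stdev(values) for values in zip(*ensemble)]
```

    Checking ensemble_spread right after the perturbation is a cheap way to confirm that the noise amplitude gives the spread you intended before any model advance is attempted.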


    Using a collection of model states.

    The important thing to remember is that the high-order models all come with routines to convert a single model restart file (or the equivalent) to a DART initial conditions file. CAM has trans_pv_sv, WRF has wrf_to_dart, POP has pop_to_dart, etc. DART has the ability to read a single file that contains initial conditions for all the ensemble members, or a series of restart files - one for each ensemble member. Simply collect your ensemble of restart files from your model and convert each of them to a DART initial conditions file of the form filter_ics.#### where #### represents a 4 digit ensemble member counter. That is, for a 50-member ensemble, they should be named: filter_ics.0001  ... filter_ics.0050
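
    The naming convention can be generated programmatically when scripting the conversion of many restart files. The following Python sketch only builds the file names described above; the helper name is hypothetical and no DART converter is invoked.

```python
def filter_ics_names(ens_size, prefix="filter_ics"):
    """Return names of the form prefix.#### for members 1..ens_size."""
    return [f"{prefix}.{member:04d}" for member in range(1, ens_size + 1)]
```

    For a 50-member ensemble the list runs from filter_ics.0001 to filter_ics.0050, so each converted restart file can be paired with its target name in order.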

    Frequently, the initial ensemble of restart files is some climatological collection. For CAM experiments, we usually start with N different 'January 1' states ... from N different years. The DART utility program restart_file_tool is then run on each of these initial conditions files to set a consistent date for all of the initial conditions. Experience has shown that it takes less than a week of assimilating 4x/day to achieve a steady ensemble spread. WRF has its own method of generating an initial ensemble. For that, it is best to contact someone familiar with WRF/DART.


    Initial conditions for the low-order models.

    In general, there are 'restart files' for the low-order models that already exist as work/filter_ics. If you need more ensemble members than are supplied by these files, you can generate your own by adding noise to a single perfect_ics file. Simply specify

    &filter_nml
    start_from_restart   = .FALSE.,
    restart_in_file_name = "perfect_ics",
    ens_size             = [whatever you want]
    




    'perfect model' experiments or 'OSSE's.


    All of the workshop and tutorial examples are 'perfect model' experiments. The ability to compare against 'the truth' is great for exploring what does and doesn't work during experimentation.

    Every low-order model has a workshop_setup.csh that compiles all the executables needed to run an OSSE, and then actually runs them. The (empty) observation sequence files have been specified for what, where, and when 'observations' will be needed. This was done with create_obs_sequence and create_fixed_network_seq. Run them yourself if you want to understand exactly what it takes to create an observation sequence file devoid of the observation values. The examples are very run-time-output verbose - great for understanding what is going on, but just awful for performance. The run-time verbosity can be cut down when running larger models.

    Some of the models have input values that are designed to produce poor (horrible, actually) assimilations, and some perform quite nicely. The DART Tutorial provides instructions on how to modify the filter input and diagnose the results.


    Use DART to run a 'perfect model' experiment.


    Once a model is compatible with the DART facility, all of the functionality of DART is available. This includes 'perfect model' experiments (also called Observing System Simulation Experiments - OSSEs). Essentially, the model is run forward from some state and, at predefined times, the observation forward operator is applied to the model state to harvest synthetic observations. This model trajectory is known as the 'true state'. The synthetic observations are then used in an assimilation experiment. The assimilation performance can then be evaluated precisely because the true state (of the model) is known.

    The basic steps to running an OSSE from within DART are:

    1. Run create_obs_sequence to generate the type of observation (and observation error) desired.
    2. Run create_fixed_network_seq to define the temporal distribution of the desired observations.
    3. Run perfect_model_obs to advance the model from a known initial condition - and harvest the 'observations' (with error) from the (known) true state of the model.
    4. Run filter to assimilate the 'observations'. Since the true model state is known, it is possible to evaluate the performance of the assimilation.
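
    The logic behind steps 3 and 4 can be sketched conceptually in Python. This is not a wrapper around the DART programs: it advances a toy 'truth' (the Lorenz 63 equations with an assumed RK4 step and assumed classic parameters), uses an identity forward operator, and adds Gaussian observation error. All function names and arguments are hypothetical. No filter is run; the noisy observations are simply scored against the known truth to show why the evaluation is exact.

```python
import math
import random


def tendency(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Lorenz 63 tendencies, used here only as a stand-in model."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step (an assumed scheme)."""
    def nudged(slope, scale):
        return tuple(s + scale * k for s, k in zip(state, slope))

    k1 = tendency(state)
    k2 = tendency(nudged(k1, dt / 2.0))
    k3 = tendency(nudged(k2, dt / 2.0))
    k4 = tendency(nudged(k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def harvest_synthetic_obs(initial_state, n_steps, obs_every, obs_error_sd,
                          dt=0.01, seed=0):
    """Advance the truth and record noisy observations at predefined times."""
    rng = random.Random(seed)
    state = tuple(initial_state)
    truth, observations = [], []
    for step in range(1, n_steps + 1):
        state = rk4_step(state, dt)
        if step % obs_every == 0:
            truth.append(state)
            # Identity forward operator plus Gaussian observation error.
            observations.append(
                tuple(v + rng.gauss(0.0, obs_error_sd) for v in state)
            )
    return truth, observations


def rmse(estimates, truth):
    """Root-mean-square error over all times and state elements."""
    squares = [(e - t) ** 2
               for est, tru in zip(estimates, truth)
               for e, t in zip(est, tru)]
    return math.sqrt(sum(squares) / len(squares))
```

    In a real experiment the first argument to rmse would be the ensemble mean produced by the assimilation; here, scoring the raw observations should return a value close to obs_error_sd, which is the baseline a useful assimilation has to beat.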

    An OSSE is explored in our Lorenz '96 example.

    More information about creating observation sequence files for OSSE's is available in the observation sequence discussion section.

    There is a set of Matlab® functions to help explore the assimilation performance in state-space. The state-space functions are in the DART/matlab directory. Once you fire up Matlab® and have the netCDF support sorted out, you will essentially follow the same procedure as that outlined in the "Are the results correct?" section. The most common functions are listed below. Each has a help document that you can display by issuing the help command at the Matlab® prompt (for example, help plot_bins).


    plot_bins.m plots the rank histograms for a set of state variables.
    plot_total_err.m plots the evolution of the error (un-normalized) and ensemble spread of all state variables.
    plot_ens_mean_time_series.m    plots the evolution of a set of state variables - just the ensemble mean (and Truth, if available).
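
    For readers unfamiliar with rank histograms, the quantity being plotted can be sketched in a few lines of Python. This uses the standard definition (the rank of the truth among the sorted ensemble values) and is not a transcription of plot_bins.m; the function names are hypothetical and ties are ignored for simplicity.

```python
def rank_of_truth(ensemble_values, truth_value):
    """Number of members below the truth: 0 .. N for an N-member ensemble."""
    return sum(1 for value in ensemble_values if value < truth_value)


def rank_histogram(ensembles, truths):
    """Count how often the truth falls in each of the N + 1 rank bins."""
    counts = [0] * (len(ensembles[0]) + 1)
    for members, truth in zip(ensembles, truths):
        counts[rank_of_truth(members, truth)] += 1
    return counts
```

    A roughly flat histogram suggests the ensemble spread is consistent with the error, while counts piled into the outermost bins point to insufficient spread.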



    Configuring Matlab® to work for DART


    Matlab® R2008b is the first version to have native netCDF support, with its own syntax that would require a total rewrite of the DART interfaces that would then be incompatible with older versions of Matlab®.

    As of July 2009 DART uses the snctools interface functions for netCDF, which rely solely on the mexnc mex-file interface and are available for 'all' versions of Matlab®. The migration away from DART's inconsistent use of the netcdf_toolbox and the CSIRO toolbox matlab_netCDF_OPeNDAP (i.e. the 'getnc' function) is virtually complete and greatly eases the installation of the Matlab® netCDF support needed by DART.

    Find your version of Matlab® (type 'ver' at the Matlab® prompt) and visit http://mexcdf.sourceforge.net/downloads to get the right combination of mexnc and snctools. The netcdf_toolbox subset of functions has been deprecated by its developers, who now support the snctools set of functions instead. The netcdf_toolbox is still distributed with snctools, so you can install it if you like, but it is not needed by DART.

    You will need the 'normal' DART/matlab functions available to Matlab, so be sure your MATLABPATH is set such that you have access to get_copy_index as well as nc_varget (which comes from snctools). This generally means you will have to manipulate your MATLABPATH with something like:


    addpath('replace_this_with_the_real_path_to/DART/matlab')
    addpath('replace_this_with_the_real_path_to/DART/diagnostics/matlab')
    addpath('some_netcdf_install_dir/snctools')
    addpath('some_netcdf_install_dir/mexnc','-BEGIN')
    addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf')
    addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf/nctype')
    addpath('some_netcdf_install_dir/netcdf_toolbox/netcdf/ncutility')
    addpath('some_CSIRO_install_dir/matlab_netCDF_OPeNDAP')
    

    which is precisely why I'm trying to shorten it. On my systems, I've bundled the first 4 commands into a script called ~/matlab/startup.m, which is run automatically every time I start Matlab®.


