
DART-CAM OVERVIEW

The up-to-date overview will always be available at http://www.image.ucar.edu/DAReS/DART/cgd_cam.shtml

These scripts are executed using the async = 3 and do_parallel = 3 options in the namelist file input.nml. These options advance the ensemble of model states on a set of processors, then divide the globe into regions and assimilate each region on its own processor. The scripts are designed for use on a multi-processor machine and have been tested in the PBS and LSF batch submission environments on Linux clusters.

The strategy behind this set of scripts is as follows. The functionality of each script is restricted to one domain: a script is specific to exactly one of the machine where the experiment is run, the model used in the assimilation, the filter version, or the experiment being conducted using the choices for the previous three.



CALLING TREE

The calling tree for these scripts (and Fortran executables) is:

SCRIPT                                      DOMAIN specific  LOCATION
job.csh                                     experiment       experiment central directory where I/O and execution are organized.
   -> qsub filter.csh                       filter version   local disc on a compute node/processor, or a work directory in the central directory.
      -> filter executable
   -> qsub filter_server.csh                machine          central directory
      -> advance_model.csh                  model            each ens member on a separate *node* using local disc there, or a pre-existing work subdirectory of the central directory.
         -> trans_time executable
         -> trans_sv_pv executable
         -> run_pc.csh                      model            part of model package, not DART
            -> advance of forecast model
         -> trans_pv_sv executable
      -> assim_region.csh                   filter version   each region on a separate *processor* using local disc
         -> assim_region executable         filter version

FLOW CHART

The tasks of each script, and the communication between them, are summarized in a flow chart. Note the key in its lower right corner. The specific mechanisms for transferring control and moving files can be found in the scripts.


FILE CONTENTS

Contents of the files which appear in the script/data file flow charts.
FILE                        CONTENTS or PURPOSE
assim_model_state_ic#       the state vectors to be used as initial conditions for the next model advance. Contains the state vector time, the target time, and the state vector.
assim_model_state_ud#       the updated state vectors returned by the model advance. Contains the state vector time (was the target time) and the state vector for one ensemble member.
filter_assim_region__in#s   each contains the same region of all of the state vector ensemble members combined into one file, for the several regions to be used in the assimilation.
filter_assim_region_out#s   same as filter_assim_region__in#s, but after the corresponding observations have been assimilated.
filter_ic_old#s             the initial conditions to be used by the filter for the next assimilation of a single obs_seq.out file. There may be one of these, or one for each ensemble member, named filter_ic_old.####, where #### is a 4 digit number such as 0001.
filter_ic_new#s             same as filter_ic_old#s, except that it/they are produced at the end of the assimilation, for use by the next assimilation.
go_advance_model            the semaphore file from filter to tell the model to advance the state vectors.
go_assim_regions            the semaphore file from filter to tell assim_region to assimilate the observations (in the form of filter_assim_region__in#s).
go_end_filter               the semaphore file telling all the scripts that filter is done with the current obs_seq.out file.
input.nml                   the filter namelist file, containing the namelists for all the necessary modules of the filter.
model initial file          such as caminput.nc; provides information about the model which the filter needs, such as state vector size, etc.
namelists                   the forecast model may need namelist(s) to define its advance.
obs_seq.final               the innovations in observation space which result from the assimilation of all the chosen obs in obs_seq.out.
obs_seq.out                 the set of observations to be assimilated. How the observations are distributed in time defines when the model advances happen.
Posterior_Diag.nc           the state vector in model space after each assimilation defined by the obs times in obs_seq.out.
Prior_Diag.nc               the state vector in model space before each assimilation defined by the obs times in obs_seq.out. It results from the previous model advance.
rm_filter_temp              semaphore file telling filter.csh that the filter_ic_new#s have been successfully copied to the Central directory, so it is safe to remove its temporary directory and exit.
state shells                CAM has more fields in its initial files than we use in the DART state vector. It's useful to carry these fields along from advance to advance so that they don't need to spin up as much at the beginning of each advance. trans_sv_pv replaces the state vector fields in these "shells" with the contents of assim_model_state_ic and leaves the other fields alone.
True_State.nc               the state vector in model space resulting from an execution of perfect_model_obs. These are the model forecast values from which identity obs are derived.
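
The semaphore files listed above (go_advance_model, go_assim_regions, go_end_filter) coordinate the scripts through the shared file system. The Python sketch below illustrates that general pattern only. The real scripts are csh, and their exact mechanism is found in the scripts themselves; the function name, polling interval, and timeout here are hypothetical.

```python
import os
import time


def wait_for_semaphore(names, directory=".", poll_seconds=1.0, timeout_seconds=60.0):
    """Poll `directory` until one of the semaphore files in `names` exists.

    Returns the name of the first semaphore found, or None on timeout.
    The polling interval and timeout are illustrative assumptions,
    not values taken from the DART scripts.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        for name in names:
            if os.path.exists(os.path.join(directory, name)):
                return name
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_seconds)
```

A server-style loop could call wait_for_semaphore(["go_advance_model", "go_assim_regions", "go_end_filter"]) and dispatch on the returned name, stopping when go_end_filter appears.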


OUTPUT DIRECTORY

Organization of the output directories created by job.csh
DIRECTORY CONTENTS and PURPOSE
Central directory  
(location of scripts and pass-through point for files during execution) (typically named according to defining characteristics of a *set* of experiments; resolution, model, obs being assimilated, unique model state variables, etc.)
    Experiment   
(location of subdirectories of output and some diagnostics files. Typically where the obs-space diagnostics are calculated; obs_diag)
       Obs_seq subdirectory(s) 
Each holds the obs-space and model-space output from assimilating one obs_seq.out file. Name it the way obs_diag expects: the 2 digit month, an underscore, and the number within the series of obs_seq.out files, e.g. 01_02 for the second obs_seq.final of a January case. The script job.csh will make these directories if you use it.
           DART 
holds the filter restart files (named filter_ic[.#]) created at the end of the filter run for this obs_seq.out. They are used to restart the filter for the next obs_seq.out file.
           CAM
holds the CAM initial file shells which carry along model fields which are not DART state vector fields (preventing the repeated re-spin-up of those variables)
           CLM  
Same as CAM, but for Community Land Model initial files.

A typical pathname for a restart file in my case would be:
/scratch/cluster/raeder/T21x80/Taper1/01_03/DART/filter_ic
                        |      |      |     DART restart file directory
                        |      |      Obs_seq (Jan 3)
                        |      Experiment (reduced influence of obs above 150 hPa)
                        Central directory (resolution x num_ens_members)
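
The directory layout above can be expressed as a small path-building helper. This is a minimal Python sketch, not part of DART: the function names are hypothetical, and zero-padding the series number to 2 digits is an assumption inferred from the examples 01_02 and 01_03.

```python
import posixpath


def obs_seq_dir_name(month, sequence_number):
    """Return the Obs_seq subdirectory name: 2 digit month, underscore, series number.

    Padding the series number to 2 digits is an assumption based on
    the examples in the text (01_02, 01_03).
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if sequence_number < 1:
        raise ValueError("sequence_number must be positive")
    return f"{month:02d}_{sequence_number:02d}"


def restart_path(central, experiment, month, sequence_number, filename="filter_ic"):
    """Build Central/Experiment/Obs_seq/DART/filename for a filter restart file."""
    obs_seq_dir = obs_seq_dir_name(month, sequence_number)
    return posixpath.join(central, experiment, obs_seq_dir, "DART", filename)
```

Calling restart_path("/scratch/cluster/raeder/T21x80", "Taper1", 1, 3) reproduces the example pathname shown above.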


You may also want to make a subdirectory within Experiment for each set of obs_space postscript files created by obs_diag and matlab.


EXPERIMENT SET-UP

Instructions for setting up a DART-CAM assimilation using these scripts.

  1. Set up an experiment central directory ("Central" here) where there's enough space for output. (See "Space" below)
  2. CAM
    1. Put the DART modifications (Cam[version#]_DART_mods.tar, which you acquire from the DART-CAM ftp site) and any other CAM modifications you have in the directory of user-provided modifications, which CAM's "configure" uses during compilation. Configure and compile CAM at the resolution desired. Do this in a directory where all your CAM versions will reside, here called CamCentral.
    2. Make sure the cam executable and config_cache.xml are in the standard CAM location: CamCentral/CAM_version/models/atm/cam/bld. job.csh has a variable CAMsrc that should point to that location.
    3. Copy .../DART/models/cam/shell_scripts/run-pc.csh (called by advance_model.csh) to CamCentral/CAM_version/models/atm/cam/bld. This may replace the original CAM run-pc.csh.
    4. Build a CAM namelist such as 'namelistin' containing (among the other/default variables defined by the CAM build-namelist):
                   &camexp
                    ncdata         = 'caminput.nc'
                    caseid         = 'whatever_you_want'
                    nsrest         = 0
                    calendar       = 'GREGORIAN'
                    inithist       = 'ENDOFRUN'
                   /
                   &clmexp
                    finidat        = 'clminput.nc'
                   /
                   
      and NOT containing ...
                   >  nhtfrq         = 4368
                   >  start_ymd      = 20020901
                   >  start_tod      = 0
                   >  stop_ymd       = 20021201
                   >  stop_tod       = 0
                   
      The CAM build-namelist script will use this to make a new namelist with the correct forecast parameters, named 'namelist'.
    5. Copy it to Central.
    6. Get access to all the CAM input files listed in namelistin, or suitable replacements.
    7. Copy a CAM initial file and CLM initial file for the chosen resolution to Central. These are used as shells, into which the DART state vector will be substituted for CAM to use.
  3. DART
    1. Get DART sandbox, modify to suit:
      Edit ...DART/models/cam/work/input.nml:preprocess_nml to choose the obs_def and obs_kind source code files to load via the preprocessor. The default files will give observations from NCEP reanalysis BUFR files. In .../DART/mkmf, link or copy the mkmf.template.xxxx which is appropriate for your computer to mkmf.template (or make one of your own).
    2. Script DART/models/cam/work/workshop_setup.csh is recommended for compiling the package and moving programs to the Central directory. Compile filter, assim_region, trans_sv_pv, trans_pv_sv, and trans_time (and trans_pv_sv_time0 if you need to create filter_ic), and copy them to Central.
    3. If you need to make up synthetic observations, get create_obs_sequence, create_fixed_network_seq and perfect_model_obs. Otherwise, use the obs_seq.out files provided on the DART-CAM ftp site. The files will have names like Obs_YY_MM.gztar, where YY = year and MM = month.
    4. Concatenate the DART namelist files (DART/models/cam/work/input.nml.XX_default, excluding the trans_ files) into one called "input.nml" (which all the DART executables want to see). This may require some inspection and manual editing to be sure that all the needed namelists are in it and that there are no redundancies.
    5. Copy input.nml to Central.
    6. Copy the suite of scripts to Central:
      • DART/shell_scripts/filter_server_MACHINE.csh
      • DART/shell_scripts/assim_region.csh
      • DART/models/cam/shell_scripts/job.csh (or DART/shell_scripts/job_??_??.csh ?)
      • DART/models/cam/shell_scripts/filter_BATCH_SYST.csh
      • DART/models/cam/shell_scripts/advance_model.csh
      • DART/models/cam/shell_scripts/run_pc.csh
  4. EXPERIMENT
    1. Edit job.csh to
      • define experiment output directory names
      • provide directory name of CAM executable
      • define which obs_seq.out files to use
      • find and link to the obs_seq.out files
      • find and link to filter_ic[.#] and assim_tools_ic
      • define which CAM and CLM initial files to use. Some initial and filter_ic files are available from raeder@ncar.ucar.edu.
    2. Edit filter_server.csh to set the number of nodes and/or processors to use, and change other machine specific aspects. Note that the subdirectories in which the ensemble members will advance (and assimilation of regions will be done) must exist before the job is run. The default names are `pwd`/filter_server/member_# (# = 1,...,num_ens) and .../region_# (# = 1,...,num_domains).
    3. Edit advance_model ?
    4. Edit input.nml to configure the assimilation of the first obs_seq.out. Be sure that filenames listed in it agree with what's required by job.csh and what's available in the Central directory.
    5. Copy it to input_1.nml (or other name signifying its obs_seq.out file, as required by job.csh)
    6. Copy input_1.nml to input_n.nml. Make sure that the restart mode for input_n.nml is "start_from_restart". Make sure that init_time_days = -1, init_time_seconds = -1.
    7. (MORE? my conventions to make them work with job.csh? or a whole new topic; )
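
The namelist concatenation in DART step 4 above (joining the input.nml.XX_default files, excluding the trans_ ones, and checking for redundancies) could be sketched as follows. This is a hypothetical Python helper, not a DART tool: the function name and the duplicate check are assumptions, and it only reports repeated namelist groups so that they can be resolved by hand.

```python
import glob
import os
import re
from collections import Counter

# A namelist group starts with "&name" at the beginning of a line.
GROUP_START = re.compile(r"^[ \t]*&(\w+)", re.MULTILINE)


def concatenate_namelists(work_dir, output_name="input.nml"):
    """Join input.nml.*_default files (skipping trans_ ones) into one file.

    Returns the sorted names of namelist groups that occur more than once,
    which need manual editing afterwards.
    """
    pattern = os.path.join(work_dir, "input.nml.*_default")
    sources = sorted(
        path for path in glob.glob(pattern)
        if "trans_" not in os.path.basename(path)
    )
    pieces = []
    for path in sources:
        with open(path) as handle:
            pieces.append(handle.read().rstrip("\n") + "\n")
    combined = "\n".join(pieces)
    with open(os.path.join(work_dir, output_name), "w") as handle:
        handle.write(combined)
    # "&end" is an old-style group terminator, not a group name.
    names = [n.lower() for n in GROUP_START.findall(combined) if n.lower() != "end"]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty return value only means that no group name is repeated. Whether every needed namelist is present still has to be checked by inspection.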


HELPFUL HINTS

Use the ensemble size, the available compute nodes, and the processors/node to work out how many nodes to request. Make this request in filter_server.csh. For example, on a machine with 2 processors/node, running an assimilation with a typical ensemble of 40 members, it's efficient to request 5 nodes. This will advance CAM in 4 batches of 10 (1 CAM/processor). Then set the number of domains in the assimilation to num_nodes x procs/node (10), so each domain will execute on a single processor. Setting the number of domains to be large (40) is not efficient because in the current (Iceland) parallel mode the computation does not scale well.
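
The arithmetic in this hint can be checked with a short Python sketch. It assumes 1 CAM per processor, as stated above; the function and key names are hypothetical.

```python
import math


def plan_request(ensemble_size, procs_per_node, num_nodes):
    """Summarize how an ensemble maps onto a node request (1 CAM per processor)."""
    processors = num_nodes * procs_per_node
    batches = math.ceil(ensemble_size / processors)
    return {
        "processors": processors,
        "model_batches": batches,
        # Processor slots left unused across all batches of the model advance.
        "idle_slots": batches * processors - ensemble_size,
        # One assimilation domain per processor, as the hint recommends.
        "num_domains": processors,
    }
```

For the example above, plan_request(40, 2, 5) gives 10 processors, 4 batches, and no idle slots. Requesting 6 nodes would still need 4 batches but leave 8 slots idle, which is why a node count whose processor total divides the ensemble size evenly is the efficient choice.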

If you're not running job.csh as a batch job, run it as 'nohup ./job.csh >& /dev/null &' to protect the job from being cut off when the window in which it was started is closed.

Modify and use alias 'rmtemp' to remove the temporary files from the central directory where the experiment is run, before running another experiment.

alias rmtemp 'rm *_ud* *_ic[1-9]* cam_*_temp* c[al]minput_[1-9]*.nc filter_assim_region_* \
              *control filter_ic_old* obs_seq.out times'
Needless to say, be careful that you don't name files you want to keep in such a way that they'll be deleted by this.

Each batch of restart data can be saved to a mass store using (a modified) auto_re2ms and retrieved using .../ms2restart. Execute the commands with no arguments to see instructions. They package files of each ensemble member together, and then bundle batches of ensemble members together for efficient storage in a directory named similarly to the one where they exist on the cluster.


SPACE REQUIREMENTS

Space requirements (per ensemble member) for several CAM resolutions.

Resolution  filter_ic  CAM initial  CLM initial  Diagnostic
T5          0.16 MB    0.3 MB       0.15 MB      1.3 MB + obs_seq.final
T21         2.5 MB     4.5 MB       1.4 MB       21 MB + obs_seq.final
T42         10 MB      18 MB        4.5 MB       57 MB + obs_seq.final
T85         41 MB      74 MB        15 MB        342 MB + obs_seq.final

obs_seq.final typically ranges from 50-150 MB, independent of model resolution.
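
As a rough planning aid, the per-member sizes in the table can be scaled by the ensemble size. The Python sketch below is an estimate under stated assumptions, not a measured result: it takes every column at face value as "per ensemble member", counts obs_seq.final once at its 50-150 range, and covers the output of a single obs_seq.out file. The names are hypothetical.

```python
# Per-member sizes from the table above:
# (filter_ic, CAM initial, CLM initial, Diagnostic), in the table's units.
PER_MEMBER_MB = {
    "T5": (0.16, 0.3, 0.15, 1.3),
    "T21": (2.5, 4.5, 1.4, 21.0),
    "T42": (10.0, 18.0, 4.5, 57.0),
    "T85": (41.0, 74.0, 15.0, 342.0),
}

# obs_seq.final range, independent of model resolution.
OBS_SEQ_FINAL_MB = (50.0, 150.0)


def estimate_space_mb(resolution, ensemble_size):
    """Return a (low, high) space estimate for one obs_seq.out file.

    Assumes all four table columns scale with ensemble size and that
    obs_seq.final is stored once; both are assumptions of this sketch.
    """
    per_member = sum(PER_MEMBER_MB[resolution])
    base = per_member * ensemble_size
    return base + OBS_SEQ_FINAL_MB[0], base + OBS_SEQ_FINAL_MB[1]
```

Multiply the result by the number of obs_seq.out files whose output you intend to keep on disc at the same time.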