# DART software - Copyright UCAR. This open source software is provided
# by UCAR, "as is", without charge, subject to all terms of use at
# http://www.image.ucar.edu/DAReS/DART/DART_download
#
# DART $Id$

10 Nov 2016 -  TJH

I no longer have access to the LANL version of POP and cannot test these
interfaces. The scripts to advance LANL/POP will surely change. One area
may be impacted by reading/writing netCDF directly: since we no longer
have a 'dart_to_pop' executable, the communication of the
'advance_to_time' to the updated POP &time_manager_nml is untested.

30 June 2010 - TJH

The POP interface to DART is available for general use. There are, however,
two fundamentally different POP models: the LANL version and the
CCSM/CESM version. The CESM/POP was tested and run in production on
NCAR's bluefire computer from IBM.  The LANL/POP 2_0_1 version of POP
STILL CANNOT BE USED for assimilation until the POP code is modified
to do a forward Euler timestep for the assimilation case.

Essentially, there are two wildly different modes of running POP
and DART.

1) CESM/POP is invoked by inserting a few lines into the run script
   generated by CESM and DART is simply used to assimilate at a single
   timestep. DART is started and stopped at every assimilation time
   and is NOT responsible for advancing the model at all. The observation
   sequence files must be chopped up into 'daylong' chunks since the
   flux coupler stops the models at midnight - this is the assimilation time. 
   The CESM Interactive Ensemble facility is used to manage the ensembles.

2) LANL/POP is invoked the same way as any other high-order model. DART is
   invoked and POP is started/stopped multiple times. Given the wide
   range of input files and modes for running POP, the scripts will surely
   have to be modified to accommodate these different usage patterns.
   This interface was tested with the LANL/POP 2_0_1 version of POP
   but STILL CANNOT BE USED for assimilation until the POP code is modified
   to do a forward Euler timestep for the assimilation case.

   This compiles and runs on coral (SLES10) with the 
   ifort (IFORT) 10.1 20090203 compiler with the following flags:
   FFLAGS = -O0 -fpe0 -vec-report0 -assume byterecl 
   Coral is an Intel-based machine, so all binary files were little-endian.
   I like to append a ".le" suffix on those files.

   It was checked in the gx3v5 configuration.

   For coral with the openmpi framework, it is necessary to
   specify input.nml:&mpi_utilities_nml:reverse_task_layout = .true.
   For bluefire it must be .false. (the default).

Tim

 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(this should stay in the README.  at some point we might
want to move the test program to a separate test directory,
in which case this file needs to be updated to reflect that.)

There is a test program in the work directory which validates
the dipole interpolation.  It requires some data files
(about 16 MB in size) which are not checked into subversion,
but can be found and downloaded from this web location:

http://www.image.ucar.edu/pub/DART/POP/

nancy

#----------------------------------------------------------------------
# Everything below here is just notes from the development process.
# It is highly unlikely it will mean anything to anyone but Tim H.
#----------------------------------------------------------------------

Tim : Tue Jul 21 17:45:52 MDT 2009

Working on reading the BINARY grid files instead of the (nonexistent) 
netCDF ones. Getting grid sizes from restart netCDF file and must make 
hard assumptions about variable storage order. 

dart_pop_mod is being modified to read the pop_in namelist and then set things 
like the ocean dynamics timestep. Must ensure that the model_mod then uses that 
to set a valid adv_to_time ...

dart_pop_mod must also set the time_manager_nml:stop_count (and stop_option) to
the right values.

dart_pop_mod must also create a pointer file with the expected restart file as
a way to make sure the model has advanced to the proper spot.

advance_model.csh must paste the pop_in.DART namelist on top of the pop_in namelist
for model control ... 

 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is how I understand (LANL) POP to work:

Given:
&init_ts_nml
   init_ts_option   = 'restart'
   init_ts_file     = 'pop.r'
   init_ts_file_fmt = 'nc'
/

The init_ts_file entry is completely ignored. A pointer file with the name
"pop_pointer.restart" contains the name of the restart file.

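As a rough illustration of the pointer-file mechanism described above, here
is a small Python sketch that pulls the restart file name out of a pointer
file. It ASSUMES the name sits on the first non-blank line; that layout is
not confirmed by these notes and should be checked against your POP version.
The function names and the example file name in the comments are made up for
illustration.

```python
from pathlib import Path


def parse_pointer_text(text):
    """Return the first non-blank line of pointer-file text.

    Assumption (not confirmed by these notes): the restart file name
    is stored on the first non-blank line of the pointer file.
    """
    for line in text.splitlines():
        name = line.strip()
        if name:
            return name
    raise ValueError("no restart file name found in pointer text")


def restart_name_from_pointer(pointer_path="pop_pointer.restart"):
    """Read a pointer file from disk and return the restart file name."""
    return parse_pointer_text(Path(pointer_path).read_text())


# Hypothetical usage, run from the POP execution directory:
#   restart_name_from_pointer("pop_pointer.restart")
```

Any lines after the first non-blank one are ignored by this sketch.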

#----------------------------------------------------------------------
# Fri Mar 19 12:56:05 MDT 2010 -- Running CCSM/POP on bluefire
#----------------------------------------------------------------------

The initial POP ensemble members are binary restarts, but DART only works
with netCDF restarts, so we cannot assimilate this first advance/job.
Depending on what we want to compare to, there may be an additional
'one-off' advance day where there is no assimilation. (Jan 3rd)
But the THIRD execution is always an assimilation ... resulting in an 
assimilated Jan 4th, 1998.

#
# create the case name for the experiment. Steve's convention is
#

"c"	is ocean only
"cam23"	explains the forcing
".2"	was some experiment number

set CASENAME = c.camorg48.inf2
set TEMPLATECASE = /gpfs/proj2/fis/cgd/oce/yeager/home/ccsm_runs/ccsm4.0_IE/c.cam48.pio
set TEMPLATECASE = /blhome/thoar/CCSM_POP/c.camorg48.inf

mkdir ~thoar/CCSM_POP/${CASENAME}

# Copy an existing experiment (CASE) that worked 
# and clean out all the bits that don't relate

cp -r ${TEMPLATECASE}/* ~thoar/CCSM_POP/${CASENAME}

cd ~thoar/CCSM_POP/${CASENAME}
./configure -cleanall

rm -rf poe* logs/* timing/* MachinesHist/*

# replace (almost) all instances of "yeager"  and the old CASE, case_name
# (I have been using steve's env_run.xml:CCSMROOT)
# ./env_case.xml:<entry id="CCSMROOT"   value="/blhome/yeager/ccsm4_0_beta21_iepio"  />
# These are the entries I change:

./CaseDocs/drv_in:  username      = 'yeager '
./CaseDocs/drv_in:  case_name     = 'c.cam48.pio '

./env_case.xml:<entry id="CCSMUSER" value="yeager"  />
./env_case.xml:<entry id="CASE"     value="c.cam48.pio"  />
./env_case.xml:<entry id="CASEROOT" value="/fis01/cgd/oce/yeager/home/ccsm_runs/ccsm4.0_IE/c.cam48.pio"  />

# For an initial run ... 
# the CCSM restart files are for 1998 Jan 1 00Z. 
# With env_run.xml:STOP_N==3 the first restart is available at the 
# end of Jan 3rd. The first POP advance is from binary restarts - 
# DART only works with netCDF restarts, so we cannot assimilate
# this first advance/job.
#
# hand-edit the env_run.xml file so it has the following:
<entry id="CONTINUE_RUN"    value="FALSE"  />    
<entry id="POST_DATA_ASSIM" value="FALSE"  />    
<entry id="RESUBMIT"        value="0"  />    
<entry id="STOP_N"          value="2"  />    

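The hand edits above could also be scripted. The following is only a sketch,
not part of CCSM or DART: it assumes each setting appears in env_run.xml as
<entry id="NAME" value="..."> on one line, as in the lines shown above, and
it edits the raw text so that comments and layout are left alone. The names
set_entry, apply_settings and INITIAL_RUN are made up for this example.

```python
import re


def set_entry(xml_text, entry_id, value):
    """Set value="..." on the <entry id="entry_id" ...> element in xml_text.

    Works on the raw text so comments and spacing are left untouched.
    Raises KeyError if no matching entry is found.
    """
    pattern = re.compile(
        r'(<entry\s+id="' + re.escape(entry_id) + r'"\s+value=")[^"]*(")'
    )
    # A function replacement avoids backslash surprises in 'value'.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), xml_text
    )
    if count == 0:
        raise KeyError(entry_id)
    return new_text


# Settings for the initial (no assimilation) run, copied from the notes.
INITIAL_RUN = {
    "CONTINUE_RUN": "FALSE",
    "POST_DATA_ASSIM": "FALSE",
    "RESUBMIT": "0",
    "STOP_N": "2",
}


def apply_settings(xml_text, settings):
    """Apply every id -> value pair in 'settings' to xml_text."""
    for entry_id, value in settings.items():
        xml_text = set_entry(xml_text, entry_id, value)
    return xml_text
```

To use it, read env_run.xml into a string, call apply_settings(text,
INITIAL_RUN), and write the result back. Keep a copy of the original file
and diff the two before running ./configure -case.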
# After all the xml files have been configured - populate Buildconf with 
# valid namelists and run scripts and streams. There has to be a stream
# file for each ensemble member (hence make_cplhist_streams.csh)

./configure -case

cd Buildconf

cp ${TEMPLATECASE}/Buildconf/*ninst*.txt .
cp ${TEMPLATECASE}/Buildconf/make_cplhist_streams.csh .
./make_cplhist_streams.csh

# This was my sanity check ... xxdiff lives on the DASG cluster 
# this is a non sequitur of sorts ...
# every non-comment difference was required

foreach FILE ( *buildnml* )
      xxdiff $FILE ${TEMPLATECASE}/Buildconf/$FILE
end

# build the stuff

cd ..
./${CASENAME}.bluefire.build

# modify the run script to include the assimilate.csh script after
# CSM EXECUTION HAS FINISHED and before
# FOR POSTPROCESSING
# and a bunch of other edits ...

hand edit $CASE.bluefire.run        to include assimilate.csh
hand edit $CASE.bluefire.l_archive  to have more than a 3 hour limit
hand edit assimilate.csh            to reference YOUR DART instance
hand edit the DART input.nml        to reflect your assimilation experiment
hand edit Tools/st_archive.sh       to archive what you want
comment out the following three lines from Tools/ccsm_l_archive.csh:
    $UTILROOT/Tools/ccsm_msmkdir ${lsmdir0}
    $UTILROOT/Tools/ccsm_msmkdir ${lsmdir1}
    $UTILROOT/Tools/ccsm_msmkdir ${lsmdir2}

foreach FILE ( bluefire.run bluefire.l_archive )
      set OLDCASE = $TEMPLATECASE:t
      xxdiff ${CASENAME}.${FILE} ${TEMPLATECASE}/${OLDCASE}.${FILE}
end

foreach FILE ( assimilate.csh input.nml \
     Tools/st_archive.sh Tools/ccsm_l_archive.csh )
      xxdiff $FILE ${TEMPLATECASE}/$FILE
end


#
# STAGE the restart files for the N oceans
#
# Steve's repository of ocean restart files is :
# /gpfs/proj2/ccsm/ocn/DART/gx1v6_restarts/c.b12.001
# This copies the first 23 pointer files (adjacent years, btw) 
# into the execution directory. The restart files are still 
# in the restarts directory
cd /ptmp/thoar/${CASENAME}/run

ln -s /gpfs/proj2/ccsm/ocn/DART/gx1v6_restarts/c.b12.001 restarts

 - OR -

# From Steve - Wed Apr  7 12:19:03 MDT 2010
#
# We've been using successive Januaries from an old hindcast (c.b12.001)
# to start up the DART runs:
# /ccsm/ocn/DART/gx1v6_restarts/c.b12.001/
#
# The first 48 of these restart files (years 0002 to 0049) have the 
# following ensemble average and rms:
# /ccsm/ocn/DART/gx1v6_restarts/c.b12.001/ensavg.nc
# /ccsm/ocn/DART/gx1v6_restarts/c.b12.001/ensrms.nc
#
# I'm putting things in a new directory which will have assorted restarts 
# from assorted recent CCSM4 hindcasts which differ in their sea ice 
# treatment and salinity restoring strength:
# /ccsm/ocn/DART/gx1v6_restarts/core2/
# I'm only putting in restarts at least 10 years apart.  The avg & rms
# of the 48-member ensemble sitting there now is
# /ccsm/ocn/DART/gx1v6_restarts/core2/ensavg.nc
# /ccsm/ocn/DART/gx1v6_restarts/core2/ensrms.nc
#
# I already converted all of them to netCDF; go to
# /ccsm/ocn/DART/gx1v6_restarts/core2/
# and do 'ls */*.nc'
#
# The spread is bigger in the latter and I think we should use 
# something like this in our next production run.

ln -s /gpfs/proj2/ccsm/ocn/DART/gx1v6_restarts/core2 restarts

# this was for the 23-member ensemble
cp restarts/rpointer.ocn.?.* .
cp restarts/rpointer.ocn.1?.* .
cp restarts/rpointer.ocn.2[0-3].* .

# this was for the 48-member ensemble
cp restarts/rpointer.ocn.?.* .
cp restarts/rpointer.ocn.[123]?.* .
cp restarts/rpointer.ocn.4[0-8].* .
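
To double-check which ensemble members the glob patterns above pick up, here
is a small Python sketch. It uses fnmatch as a stand-in for csh globbing and
ASSUMES pointer files are named rpointer.ocn.<member>.<something>; the
"restart" suffix used below is a placeholder, not a real file name.

```python
from fnmatch import fnmatchcase

# The glob sets used in the cp commands above.
GLOBS_23 = ["rpointer.ocn.?.*", "rpointer.ocn.1?.*", "rpointer.ocn.2[0-3].*"]
GLOBS_48 = ["rpointer.ocn.?.*", "rpointer.ocn.[123]?.*", "rpointer.ocn.4[0-8].*"]


def members_selected(globs, candidates=range(1, 100), suffix="restart"):
    """Return the member numbers whose pointer file name matches any glob.

    The suffix is a made-up placeholder; only the member number matters
    for these patterns.
    """
    selected = []
    for member in candidates:
        name = f"rpointer.ocn.{member}.{suffix}"
        if any(fnmatchcase(name, g) for g in globs):
            selected.append(member)
    return selected
```

With candidates 1 to 99, members_selected(GLOBS_23) gives members 1 through
23 and members_selected(GLOBS_48) gives members 1 through 48. Note that the
single '?' would also match a member 0 (or any other single character) if
such a file existed in the restarts directory.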

# submit the job for the first N days

cd ~thoar/CCSM_POP/${CASENAME}
bsub < ${CASENAME}.bluefire.run

#----------------------------------------------------------------------
# For resubmissions
#----------------------------------------------------------------------

If CONTINUE_RUN == TRUE ... it will look for the rpointer.* files to
determine the dates, etc.

Modify the env_run.xml bits as follows:

< <entry id="CONTINUE_RUN"    value="FALSE"  />    
> <entry id="CONTINUE_RUN"    value="TRUE"  />    

< <entry id="POST_DATA_ASSIM" value="FALSE"  />    
> <entry id="POST_DATA_ASSIM" value="TRUE"  />    

< <entry id="RESUBMIT"        value="0"  />    
> <entry id="RESUBMIT"        value="10"  />    

< <entry id="STOP_N"          value="2"  />    
> <entry id="STOP_N"          value="1"  />  

Change the input.nml to reflect the 'start from restart' selections.

#----------------------------------------------------------------------
# To remove all traces of a failed experiment and restart 
# Steve Yeager: Tue Mar 23 11:41:26 MDT 2010:
#----------------------------------------------------------------------

To rerun, I would do the following:

1) clean up everything and start from scratch:
	rm -rf /ptmp/thoar/${CASENAME}/*
	rm -rf /ptmp/thoar/archive/${CASENAME}
	msrm -R -wpwd THOAR /THOAR/csm/${CASENAME}

2) rebuild
	./${CASENAME}.bluefire.build

3) pre-position restarts again (see notes)

4) change something(?)

5) resubmit:
	bsub < ${CASENAME}.bluefire.run

# <next few lines under version control, do not edit>
# $URL$
# $Revision$
# $Date$
