Methods for dealing with spurious covariances arising from small samples in ensemble data assimilation
Jeff Whitaker jeffrey.s.whitaker@noaa.gov
NOAA Earth System Research Lab, Boulder
  what is ensemble data assimilation?
  what are the consequences of sampling error?
  covariance localization.
  alternatives to covariance localization.

Ensemble data assimilation
Parallel forecast and analysis cycles
Background errors are estimated from sample covariances and depend on the weather situation.

Ensemble Kalman Filter
Consequences of Sampling Error
Mis-specification of background-error covariance
Effect of localization in a simplified GCM (1)
Effect of localization in a simplified GCM (2)
Effect of localization in a simplified GCM (3)
Covariance localization increases rank of Pb
If the ensemble has k members, then Pb describes nonzero uncertainty only in a subspace of at most k dimensions.
The analysis is adjusted only within this subspace.
If the system is high-dimensionally unstable (if it has more than k positive Lyapunov exponents) then forecast errors will grow in directions not accounted for by the ensemble, and these errors will not be corrected by the analysis.
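The rank argument can be checked numerically. The sketch below is illustrative only: the grid size, ensemble size, periodic 1-D domain, and triangular taper are all assumptions, not taken from the talk. It draws a small ensemble from a truth with identity covariance, so every off-diagonal sample covariance is sampling noise, then compares the rank of the raw and localized Pb. Because the ensemble mean is removed, the raw rank is k - 1.

```python
import numpy as np

def sample_covariance(ens):
    """Sample covariance of an ensemble with shape (k members, n variables)."""
    pert = ens - ens.mean(axis=0)
    return pert.T @ pert / (ens.shape[0] - 1)

def triangular_taper(n, halfwidth):
    """Periodic 1-D taper: 1 at zero separation, 0 at >= halfwidth grid points."""
    idx = np.arange(n)
    d = np.abs(idx[:, None] - idx[None, :])
    d = np.minimum(d, n - d)              # periodic distance in grid points
    return np.clip(1.0 - d / halfwidth, 0.0, None)

rng = np.random.default_rng(0)
n, k = 40, 10                             # illustrative sizes (assumed)
ens = rng.standard_normal((k, n))         # true covariance is the identity
pb = sample_covariance(ens)
pb_loc = triangular_taper(n, 5) * pb      # element-wise (Schur) product

rank_raw = np.linalg.matrix_rank(pb)
rank_loc = np.linalg.matrix_rank(pb_loc)
print("rank of raw Pb:      ", rank_raw)
print("rank of localized Pb:", rank_loc)
```

The taper leaves the sample variances on the diagonal untouched and sets the distant, purely spurious covariances to zero, which is what lets the localized matrix span many more directions than the ensemble itself.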

Alternative to localization
Localizing covariances works because it increases the dimensionality...
So one can instead compute updates in local regions, where the error dynamics evolve in a lower-dimensional subspace (< k).
(LETKF - Hunt et al., 2007)
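A minimal sketch of such a local update, written in ensemble space for a single state variable in the style of an ensemble transform filter. It is not the LETKF code referenced above. The function name `letkf_point_update`, its argument layout, and the diagonal observation-error covariance are assumptions made for illustration; the caller is assumed to have already selected the observations inside the local region.

```python
import numpy as np

def letkf_point_update(xb, yb, yobs, rinv):
    """Analysis ensemble for one state variable from nearby observations.

    xb   : (k,)   background ensemble of the state variable
    yb   : (p, k) background ensemble mapped to the p local observations
    yobs : (p,)   observed values
    rinv : (p,)   inverse observation-error variances (diagonal R assumed)
    """
    k = xb.size
    xmean = xb.mean()
    xpert = xb - xmean
    ymean = yb.mean(axis=1)
    ypert = yb - ymean[:, None]
    c = ypert.T * rinv                        # (k, p): Yb^T R^-1
    # Analysis covariance in ensemble space: [(k-1) I + Yb^T R^-1 Yb]^-1
    evals, evecs = np.linalg.eigh((k - 1) * np.eye(k) + c @ ypert)
    pa = (evecs / evals) @ evecs.T
    wmean = pa @ c @ (yobs - ymean)           # weights that update the mean
    # Symmetric square root of (k-1) * pa gives the perturbation weights.
    wpert = (evecs * np.sqrt((k - 1) / evals)) @ evecs.T
    return xmean + xpert @ (wmean[:, None] + wpert)

# Usage: one observation that measures the state variable directly.
rng = np.random.default_rng(1)
xb = rng.standard_normal(8)
xa = letkf_point_update(xb, xb[None, :], np.array([1.0]), np.array([1.0 / 0.5]))
print("analysis mean and variance:", xa.mean(), xa.var(ddof=1))
```

In this single-observation case the result matches the scalar Kalman update computed with the sample variance. Reducing the weight of distant observations would amount to scaling the entries of `rinv` down before the call.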

Two EnKF approaches
Serial approach - for each observation, update every model variable, tapering the influence of the observation to zero at a specified distance. Used in NCAR DART.
Local approach - update each model variable one at a time, using all observations within a specified radius, with the observation-error variance R inflated as the distance between observation and model variable increases. We use this approach since it scales well on massively parallel computers.
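The serial approach can be sketched as follows. This is an illustration under stated assumptions, not the DART implementation: it uses a square-root (EnSRF) form of the update, a Gaspari-Cohn taper on a non-periodic 1-D grid, and an observation operator that simply picks the nearest grid point. The names `gaspari_cohn` and `serial_ensrf` are hypothetical.

```python
import numpy as np

def gaspari_cohn(dist, halfwidth):
    """Gaspari-Cohn fifth-order taper; reaches zero at 2 * halfwidth."""
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / halfwidth
    taper = np.zeros_like(r)
    near = r <= 1.0
    far = (r > 1.0) & (r < 2.0)
    rn, rf = r[near], r[far]
    taper[near] = -0.25 * rn**5 + 0.5 * rn**4 + 0.625 * rn**3 - (5 / 3) * rn**2 + 1.0
    taper[far] = (rf**5 / 12 - 0.5 * rf**4 + 0.625 * rf**3 + (5 / 3) * rf**2
                  - 5.0 * rf + 4.0 - 2.0 / (3.0 * rf))
    return taper

def serial_ensrf(ens, grid, obs_loc, obs_val, obs_var, halfwidth):
    """Assimilate observations one at a time, updating every state variable.

    ens: (k, n) ensemble; grid: (n,) coordinates. Each observation is assumed
    to measure the state at the grid point nearest to its location.
    """
    ens = ens.copy()
    k = ens.shape[0]
    for loc, y, r in zip(obs_loc, obs_val, obs_var):
        j = np.argmin(np.abs(grid - loc))     # nearest-point observation operator
        xmean = ens.mean(axis=0)
        xpert = ens - xmean
        hx = xpert[:, j]                      # observation-space perturbations
        hpht = hx @ hx / (k - 1)              # background variance at the obs
        pht = xpert.T @ hx / (k - 1)          # covariance of state with the obs
        # Tapered Kalman gain: zero beyond 2 * halfwidth from the observation.
        gain = gaspari_cohn(grid - grid[j], halfwidth) * pht / (hpht + r)
        alpha = 1.0 / (1.0 + np.sqrt(r / (hpht + r)))  # square-root factor
        xmean = xmean + gain * (y - xmean[j])
        xpert = xpert - alpha * np.outer(hx, gain)
        ens = xmean + xpert
    return ens

# Usage: one observation at grid point 10 on a 30-point grid.
rng = np.random.default_rng(2)
grid = np.arange(30.0)
prior = rng.standard_normal((20, 30))
post = serial_ensrf(prior, grid, [10.0], [0.5], [0.3], halfwidth=3.0)
print("variance at obs point before/after:",
      prior[:, 10].var(ddof=1), post[:, 10].var(ddof=1))
```

Grid points farther than twice the half-width from the observation are left unchanged, which is the tapering to zero described in the serial approach.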

Outstanding issues
Both methods assume a priori that covariance is maximized at the observation location - problematic for non-local and time-lagged obs.
Both methods are flow-independent (assume same degree of locality for every situation).
Localization can destroy balance.

Localization and Balance
Analysis of a single zonal wind observation, using idealized nondivergent and geostrophically balanced covariances.
Flow Dependent Localization (Hodyss and Bishop, QJR)
"SENCORP" Recipe
Smooth Pb to obtain P1b.
Take the element-wise cube of P1b to obtain P2b.
Form the normalized matrix product of P2b with itself to obtain P3b.
Use element-wise square of P3b to compute K.
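A loose reading of the four steps above, for illustration only. Several details are assumptions not given on the slide: the smoother is taken to be a periodic 1-2-1 filter, the element-wise powers are applied to correlation (unit-diagonal) matrices rather than raw covariances, and "normalized" is read as rescaling the matrix product to unit diagonal. The function name `sencorp_like_localization` is hypothetical. The output is a matrix of weights that would multiply Pb element-wise before computing K.

```python
import numpy as np

def to_correlation(cov):
    """Rescale a covariance-like matrix to unit diagonal."""
    s = np.sqrt(np.diag(cov))
    return cov / np.outer(s, s)

def sencorp_like_localization(pb, passes=1):
    """Flow-dependent localization weights in [0, 1] from a sample covariance."""
    n = pb.shape[0]
    eye = np.eye(n)
    # Step 1: smooth Pb (periodic 1-2-1 filter; the choice of smoother is assumed).
    smoother = (2.0 * eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 4.0
    p1 = pb
    for _ in range(passes):
        p1 = smoother @ p1 @ smoother.T
    c1 = to_correlation(p1)
    # Step 2: element-wise cube damps small (likely spurious) correlations.
    c2 = c1**3
    # Step 3: matrix product of the cubed matrix with itself, rescaled to unit diagonal.
    c3 = to_correlation(c2 @ c2)
    # Step 4: element-wise square gives non-negative weights.
    return c3**2

# Usage with a small random ensemble (sizes are illustrative).
rng = np.random.default_rng(3)
ens = rng.standard_normal((10, 24))
pb = np.cov(ens, rowvar=False)
weights = sencorp_like_localization(pb)
pb_localized = weights * pb                   # element-wise product used to compute K
print("weight range:", weights.min(), weights.max())
```

Unlike a fixed distance-based taper, these weights are built from the ensemble itself, so they change with the flow.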

Hierarchical Ensemble Filter
Proposed by Jeff Anderson (NCAR).
Evolve K coupled N-member ensemble filters.
Use differences between sample covariances to design a situation-dependent localization function.
Asymptotes to an optimally localized N-member ensemble (not K*N).

Conclusions
Localization (tapering the impact of observations with distance from analysis grid point) makes ensemble data assimilation feasible with large NWP models.
Both model errors and localization make filter performance suboptimal. Right now model error is the bigger problem, but improvements in localization are needed.