This strategy can reduce the number of transport model runs needed.  However, it may not help to solve for many more regions than there are data points: the extra regions will tend to be under-constrained and anti-correlated.  We will see another way the adjoint can be used when discussing variational data assimilation later...
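To make the under-constrained, anti-correlated behavior concrete, here is a minimal sketch with two regions and a single measurement that sees only their sum. The names H (sensitivity of the measurement to each region), B (prior flux covariance) and R (measurement error covariance) and all the numbers are illustrative assumptions, not taken from these notes. The posterior covariance is computed with the standard linear Gaussian formula.

```python
import numpy as np

# Hypothetical setup: 2 unknown regional fluxes, 1 measurement.
H = np.array([[1.0, 1.0]])   # the measurement responds equally to both regions
B = np.eye(2)                # prior covariance: independent, unit variance
R = np.array([[0.01]])       # measurement error variance (assumed small)

# Posterior covariance for a linear Gaussian inversion:
# P = (H^T R^-1 H + B^-1)^-1
P = np.linalg.inv(H.T @ np.linalg.inv(R) @ H + np.linalg.inv(B))

# Posterior correlation between the two regional estimates.
sd = np.sqrt(np.diag(P))
corr = P[0, 1] / (sd[0] * sd[1])

# Posterior variance of the sum of the two regions.
ones = np.array([1.0, 1.0])
sum_var = ones @ P @ ones

print("posterior variances:", np.diag(P))
print("posterior correlation:", corr)
print("variance of the sum:", sum_var)
```

In this toy case the sum of the two regions is well constrained, while each region on its own stays poorly constrained and the two estimates are strongly negatively correlated: the data cannot tell which region a flux came from.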

What to do about storing and inverting a really big matrix, though?  Usually, all that information does not need to be held together in one big matrix.  There is usually a time scale after which the influence of a given flux on later measurements becomes less important.  So... break up the time span into shorter pieces and do a smaller inversion for each of these.  This idea leads to the Kalman filter...
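The idea of splitting the time span can be sketched as follows. This is an idealized illustration, not the Kalman filter itself: it assumes the fluxes in one time window have no influence at all on measurements in other windows, so the big sensitivity matrix is exactly block diagonal, and it assumes diagonal prior and measurement error covariances. The function name bayes_inversion, the window sizes and all the numbers are hypothetical. Under those assumptions, several small inversions give the same answer as one big one.

```python
import numpy as np

def bayes_inversion(H, y, xb, B, R):
    """Batch linear Gaussian estimate: xb + B H^T (H B H^T + R)^-1 (y - H xb)."""
    S = H @ B @ H.T + R
    return xb + B @ H.T @ np.linalg.solve(S, y - H @ xb)

rng = np.random.default_rng(0)
n_win, n_x, n_y = 4, 3, 5    # windows, unknowns per window, data per window

# One small sensitivity matrix per time window (made-up values).
blocks = [rng.normal(size=(n_y, n_x)) for _ in range(n_win)]

# The "really big matrix": block diagonal because windows do not interact.
H_full = np.zeros((n_win * n_y, n_win * n_x))
for k, Hk in enumerate(blocks):
    H_full[k * n_y:(k + 1) * n_y, k * n_x:(k + 1) * n_x] = Hk

# Synthetic truth and noisy measurements.
x_true = rng.normal(size=n_win * n_x)
y = H_full @ x_true + 0.1 * rng.normal(size=n_win * n_y)

# One big inversion over the whole time span.
x_full = bayes_inversion(
    H_full, y,
    np.zeros(n_win * n_x), np.eye(n_win * n_x), 0.01 * np.eye(n_win * n_y),
)

# Several small inversions, one per window.
x_windowed = np.concatenate([
    bayes_inversion(
        Hk, y[k * n_y:(k + 1) * n_y],
        np.zeros(n_x), np.eye(n_x), 0.01 * np.eye(n_y),
    )
    for k, Hk in enumerate(blocks)
])

print("max difference:", np.max(np.abs(x_full - x_windowed)))
```

In a real problem the off-diagonal blocks are small rather than exactly zero, so the windowed result is only an approximation, and some way of carrying information from one window to the next is needed.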