A few years ago, we wrote a paper looking at the effect of covariance localization in a simplified GCM (no model error).
We varied the ensemble size and the severity of the localization.  For small ensembles, the filter diverged if too little localization was used.  With more ensemble members, less localization was needed.  There's a 'sweet spot' that minimizes ensemble-mean error for a given ensemble size (and observation network).
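To make the idea concrete, here is a minimal sketch of covariance localization on a toy one-dimensional periodic domain. It is not the setup from the paper: the Gaspari-Cohn taper, the grid, the synthetic "true" covariance, and the names `gaspari_cohn`, `periodic_separation`, and `covariance_errors` are all assumptions chosen for illustration. The sample covariance from a small ensemble is multiplied element-wise by a taper that falls to zero at twice the half-width `half_width`, which damps spurious long-range covariances at the cost of some bias at short range.

```python
import numpy as np


def gaspari_cohn(dist, half_width):
    """Gaspari-Cohn fifth-order taper: 1 at zero distance, 0 beyond 2 * half_width."""
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / half_width
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    taper[outer] = ((1.0 / 12.0) * ro**5 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return taper


def periodic_separation(n_state):
    """Grid-point separation on a ring of n_state points (hypothetical toy domain)."""
    idx = np.arange(n_state)
    sep = np.abs(idx[:, None] - idx[None, :])
    return np.minimum(sep, n_state - sep)


def covariance_errors(n_members, half_width, n_state=40, seed=0):
    """Frobenius-norm error of the raw and localized sample covariances.

    The 'truth' is synthetic: white noise smoothed by a Gaussian kernel,
    so the true covariance is smoother @ smoother.T by construction.
    """
    rng = np.random.default_rng(seed)
    sep = periodic_separation(n_state)
    smoother = np.exp(-(sep / 3.0) ** 2)
    true_cov = smoother @ smoother.T
    ensemble = rng.standard_normal((n_members, n_state)) @ smoother.T
    raw = np.cov(ensemble, rowvar=False)
    # Schur (element-wise) product with the taper is the localization step.
    localized = gaspari_cohn(sep, half_width) * raw
    return (np.linalg.norm(raw - true_cov),
            np.linalg.norm(localized - true_cov))
```

Calling `covariance_errors(n_members, half_width)` over a range of both arguments gives a rough feel for the trade-off described above: a smaller `half_width` means more severe localization. This only measures error in the covariance estimate, not filter error in a cycling assimilation, so it illustrates the mechanism rather than reproducing the result.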