IMAGe Theme-of-the-Year 2009


UCAR \| NCAR \| CISL Our Institute Research Publications Software Data
Contact Us \| Visit Us \| UCAR People Search	Numerics\| Assimilation\| Turbulence\| Statistics

The Institute for Mathematics Applied to Geosciences
Theme for 2009: The Interaction of Simulation and Numerical Models

Frontiers of Geophysical Simulation

18 - 20 August 2009 Boulder, Colorado

ABSTRACTS

Time Acceleration Methods for the Shallow Water Equations on the Cubed Sphere
Rick Archibald
Computational Mathematics Group,ORNL

Climate simulation will not grow to the ultrascale without new algorithms to overcome the scalability barriers blocking existing implementations. Until recently, climate simulations concentrated on the question of whether the climate is changing. The emphasis is now shifting to impact assessments, mitigation and adaptation strategies, and regional details. Such studies will require significant increases in spatial resolution and model complexity while maintaining adequate throughput. The barrier to progress is the resulting decrease in time step without increasing single-thread performance. In this paper we demonstrate how to overcome this time barrier for the standard tests defined for the shallow-water equations on a sphere. This paper explains how combining a multiwavelet discontinuous Galerkin method with exact linear part time-evolution schemes can overcome the time barrier for advection equations on a sphere. The discontinuous Galerkin method is a high-order method that is conservative, flexible, and scalable. The addition of multiwavelets to discontinuous Galerkin provides a hierarchical scale structure that can be exploited to improve computational efficiency in both the spatial and temporal dimensions. Exact linear part time-evolution schemes are explicit schemes that remain stable for implicit-size time steps.

Advanced Analysis, Sensitivity, and Optimization Capabilities in the Trilinos Collection
Roscoe A. Bartlett
Trilinos Software Engineering Technologies and Integration Lead
Sandia National Laboratories

Trilinos is a growing collection of various software libraries for high performance computing. In addition to the traditional (parallel) solvers and associated support code, other types of higher-level steady-state and transient analysis, sensitivity and optimization capabilities are also being developed in Trilinos. This talk will describe these capabilities that are being developed and how they can be accessed by application codes both currently and in the future.

Vortex-RBF Methods on the Surface of a Sphere
John P. Boyd
University of Michigan

Vortex blob methods, which approximate the vorticity of a fluid as a sum of Gaussian radial basis functions, are widely used. The Poisson equation for the streamfunction is solved analytically; the centers of the blobs are advected by the velocities derived from the streamfunction. We have recently solved the Poisson equation with Gaussian forcing on the sphere. Using this, we have developed a Gaussian RBF-vortex method for solving the barotropic vorticity equation. To compute the RBF coefficients, we compare interpolation (expensive but accurate) with deformation-corrected and uncorrected quasi-interpolation. The code dynamically adds and subtracts RBF vortices. We will extend this work to the shallow water equations on the sphere.

Verification through Adaptivity
Brian Carnes
Sandia National Laboratories
Joint work with Bart van Bloemen Waanders and Kevin Copps, Sandia National Laboratories

Verification of computational codes and calculations has become a major issue in establishing credibility of computational science and engineering tools. Adaptive meshes and time steps have become powerful tools for controlling numerical error and providing solution verification. This talk makes the case for using adaptivity to drive verification. Connections between numerical error and overall uncertainty are emphasized.
In the first part of the talk, an overview of verification using uniform grid studies is presented, including grid convergence and manufactured solutions. Then basic ideas of numerical error estimation and adaptivity are outlined, including goal-oriented approaches using the adjoint problem. Finally, examples are shown to illustrate the use of adaptivity to control numerical error in thermal, fluid, and solid mechanics applications.

Adjoint Sensitivity Analysis for Two-layered Shallow Water Model in a Limited Domain
Vani Cheruvu
Florida State University
Joint work with Joseph Tribbia, NCAR

Sensitivity analysis is defined as the determination of the potential impact on some quantitative measures of a forecast aspect due to arbitrary perturbations of the model dynamic fields at earlier times. Adjoints are indispensible for efficiently solving several types of optimization problems, which includes sensitivity analysis. Currently we are working on the development of the adjoint tools for a spectral-element based two-layered shallow water model in a limited domain with non-reflecting boundary conditions. This system has much in common with limited area primitive equation models used in numerical weather prediction.

Implicit-Explicit Time Stepping Methods for Multiphysics Problems
Emil M. Constantinescu
Argonne National Laboratory

We discuss a new type of implicit-explicit time-stepping method. Implicit-explicit (IMEX) schemes, also known as semi-implicit, are time-stepping methods that allow to efficiently solve problems with both stiff and nonstiff components and thus are well suited for multiphysics multiscale problems. The proposed methods are based on Richardson-type extrapolation schemes and can provide very high order discretizations of ODEs, index-1 DAEs, and PDEs in the method of lines framework. Implicit-explicit schemes based on extrapolation are simple to construct, easy to implement, and straightforward to parallelize. We establish the existence of perturbed asymptotic expansions of global errors that explains the convergence orders of these methods, and study their linear stability properties. Numerical results with stiff ODE, DAE, and PDE test problems confirm the theoretical findings and illustrate the potential of these methods to solve multiphysics multiscale problems.

Vertical Discretization in the Non-Hydrostatic Global Environmental Multiscale (GEM) Model
Jean Côté
Environment Canada
Joint work with Claude Girard and André Plante

The vertical discretization of the non-hydrostatic Global Environmental Multiscale (GEM) model is being changed from its original non-staggered grid formulation to a staggered grid formulation. In the hydrostatic limit it reverts to the Charney-Phillips discretization. The supplementary non-hydrostatic terms cannot be discretized so as to have a discrete dispersion relation that matches the analytic one. One can hope only to minimize the negative effects. At the same time the vertical mass-type hybrid coordinate is being modified to simplify the formulation of our completely implicit model.
The motivation behind our choice will be presented. Experimental results, both synthetic and real cases, will be shown that illustrate the benefits of the new vertical coordinate and discretization. The implicit time-stepping will be examined with respect to the massively parallel computer architecture on which this model will execute in the future.

Scaling High-Resolution Climate to the Next Level
John M. Dennis
National Center for Atmospheric Research

Current high-resolution climate simulations the couple a 0.5° atmosphere and land model to a 0.1° ocean and sea ice model, using Community Climate System Model (CCSM) are underway on the Cray XT5 at the National Institute for Computational Science. These runs, which have completed more than 75 years, utilize 5848 cores and currently achieves a rate of ~ 1.7 simulated years per wall-clock day. Our simulation rate falls below 5 simulated years per wall-clock day, which historically is the minimum acceptable rate to simulate climate. While this high-resolution configuration represent a significant breakthrough in both resolution and core counts versus what was possible with CCSM just a few years ago, there are still a number of remaining scalability issues. We describe the current scalability issues that are preventing our efficient use 10,000 to 100,000 cores. In particular we concentrate on modifications to the load balancing algorithm used by the Community Ice CodE, which enables a 40% reduction is computational cost versus the previous algorithm.

The Importance of Being on Time: Exploring Temporal Error in Geophysical Fluid Simulation
Katherine (Kate) Evans
Oak Ridge National Laboratory

Over the past century, extensive research and review has addressed the spatial resolution and convergence characteristics of numerical methods in fluid flow problems. However a proper analysis requires an equal consideration of the temporal plane, specifically the ability of a numerical method to resolve transient features and temporally converged steady state solutions. In this talk, the temporal error of several different operational solution methods will be evaluated in HOMME, the spectral element cubed sphere dycore option of the Community Atmosphere Model (CAM). For illustration, the extreme effects of temporal error will be presented for a general transient fluid problem.

Simulation of Stratified Mixing in Thin Aspect Ratio Domain Using a Spectral Element Method: Handling Small Scale Noise and Improving the Iterative Poisson Solver
Mohamed Iskandarani
RSMAS/MPO, University of Miami

The simulation of stratified oceanic turbulence is challenging because of its very large scale spectrum, and occurs in thin aspect ration domains. Here we visit two computational issues that limit our capacity to extend our simulation to larger horizontal domains and/or higher Reynolds number, namely handling spatially localized small scale noise, and the efficiency of the iterative solver associated with the computing the non-hydrostatic pressure. In our work we have relied on the Discontinuous Galerkin Method to control Gibbs oscillations in the density field. For moderately large spectral truncation DGM provides extra stability for the code without adversely impacting the accuracy of the computed mixing. A new "specialized" iterative solver has also been developed to take advantage of the channel like geometry to improve the code's run time. It is based on a combination of domain decomposition techniques and fast Fourier solvers. Its performance within the context of a spectral element discretization will be presented and contrasted to the performance of other iterative based solvers.

Convectively Coupled Waves in HOMME Via a Simple Multicloud Model
Boualem Khouider
University of Victoria
In collaboration with Amik St Cyr (NCAR), Andrew Majda (Courant Institute of Mathematical Sciences), and Joseph Tribbia (NCAR)

A simple multicloud convective parametrization model, motivated fully from observations of organized tropical convective systems, is used to force the two first baroclinic modes of vertical structure in HOMME. The two heating modes carry three cloud types, congestus, deep, and stratiform that interact nonlinearly with each other through midlevel moisture and boundary layer equivalent potential temperature. We conduct numerical simulation over an all tropics planet (aquaplanet) with a mask over the tropics and without any other physics. Even at high resolution of about 300 km across the coupled multicloud-HOMME model exhibits convectively coupled Kelvin waves and 2 day waves with the dynamical and physical features that resemble observations consistent with linear theory on a beta- plane. The model coupling and formulation will be presented together with some preliminary results in different parameter regimes.

GPU Metaprogramming Applied to High-Order DG and Loop Generation
Andreas Klöckner
Applied Mathematics, Brown University

Writing peak-performance and scalable GPU codes is a challenge that is aggravated by complicated and constantly changing hardware. In this talk, I propose the concept of run-time code generation and empirical optimization from a high-level language as part of a solution. I will briefly discuss our PyCUDA GPU metaprogramming toolkit and then focus on how the technique and the toolkit helped us exploit the enormous performance potential of GPUs for discontinuous Galerkin finite element solvers across a wide range of discretization parameters and equation types. Particular emphasis will be on the advantages that high-order discretizations offer on modern SIMD-like architectures. I will also discuss a recent effort seeking to automate the writing of high-performance GPU code for a large class of computational kernels that includes many of those needed for the numerical discretization of PDEs.

The Application of the Discontinuous Galerkin Method to Non-hydrostatic and Hydrostatic Atmospheric Flows
Matthias Läuter
Alfred Wegener Institute, Germany
In collaboration with Francis X. Giraldo, Naval Postgraduate School, USA
Dörthe Handorf, Alfred Wegener Institute, Germany
Klaus Dethloff, Alfred Wegener Institute, Germany

The Discontinuous Galerkin Method (DGM) is a high order accurate spatial discretization which is appropriate for non-linear geophysical flow problems, especially in the atmosphere. It features properties that are important for weather and climate applications, like high order accuracy, robustness, discrete conservation properties, applicability to structured and unstructured grids and the treatment of complex geometries.
The DGM is applied to a spherical shallow water atmosphere on unstructured triangular grids and further to a 2d mesoscale atmospheric model. High order convergence can be shown in both applications as well as important atmospheric wave features, like Rossby wave and gravity wave propagation. The experimental results give the indication, that this DG approach is an appropriate method to model atmospheric flow phenomena.

Solving the Global Shallow Water Equations in Vorticity-Divergence Form with Element-Based Galerkin Methods
Michael Levy
Department of Applied Mathematics
University of Colorado at Boulder

The vorticity-divergence formulation of the shallow water system is solved on the cubed sphere using a technique based on a combination of the discontinuous Galerkin and spectral element methods. The discontinuous Galerkin method, which is inherently conservative, is used to solve the equations governing two conservative variables - absolute vorticity and atmospheric thickness (equivalent to mass). The spectral element method is used to solve the divergence equation and the Poisson equations for the velocity potential and the stream function. Time integration is done with an explicit strong stability-preserving third-order Runge-Kutta scheme and the wind field is updated directly from the vorticity and divergence at each stage. A stable steady-state test is run and convergence results are provided, showing the method is high-order accurate. Additionally, two tests without analytic solutions are run with comparable results to previous high-resolution runs found in the literature.

Accelerating the Implicit Integration of Stiff Chemical Systems with Emerging Multi-core Technologies
John Linford
Department of Computer Science, Virginia Polytechnic and State University

Comprehensive atmospheric models trace the evolution of the atmospheric chemical state through time by the implicit integration of large systems of stiff, coupled ODEs. These integrations may account for as much as 90% of model runtime, so accelerating the integration process can greatly improve productivity. Emerging multi-core architectures achieve unprecedented performance through the use of multi-layered heterogeneous parallelism, and fortunately, these processes are embarrassingly parallel on a regular fixed grid. We present new approaches which leverage NVIDIA GPUs, the Cell Broadband Engine, and Intel Quad-Core Xeon CPUs to greatly reduce the runtime of a complex chemical integration. A comparative performance analysis for each platform in double and single precision on coarse and fine grids is presented. The implementation of a three-stage Rosenbrock solver for SIMD architectures is discussed. Speedups of 5.5x in single precision and 2.7x in double precision are observed when compared to eight Xeon cores. Compared to the serial implementation, the maximum observed speedup is 41.1x in single precision.

High Resolution Climate Variability Experiments on the Cray XT-4 and XT-5 “kraken” Systems at NICS
Rich Loft
National Center for Atmospheric Research

Over the past 30 years, the use of supercomputers to obtain numerical solutions of the equations governing weather and climate, has led, along with a multiplicity of satellite and other types of observations, to the conclusion that the Earth's climate is changing due to human activity. In general, the simulations supporting this conclusion have been derived from models run at coarse, O (100 km) resolutions. The impact of unresolved scales on these predictions is not precisely known: indeed it has been hypothesized that noise in the climate system (fluctuations on short spatial and temporal scales) could be “reddened”, thereby influencing the low-frequency components of the climate signal. If true, incorrect simulation of the noise statistics (or stochastic forcing) due to inadequate resolution or errors in the physical parameterizations can feed back onto the mean climate. If this hypothesis is true, the impact on future climate simulations could be enormous. It means that some better physical parameterization of unresolved scales, perhaps combined with higher resolution, is necessary to get climate variability right. That conclusion could increase the computational cost of future climate studies by many orders of magnitude. If the hypothesis is proven false, i.e., if increased resolution does not change climate variability significantly, then we can proceed with much of the current low-resolution research program intact. As typical in exploration, we have to go there to find out: we must run high-resolution, century-long simulations of the Earth System that are designed to test the importance of noise at unresolved scales. That is the point of our team’s successful PetaApps proposal, funded last year, and that is the objective of our work on the “kraken” XT-4 and XT5 system at the national Institute of Computational Science (NICS).
This paper will present XT-4 (and hopefully XT-5) performance and scaling data for a high resolution (0.5° atmosphere and land surface coupled to 0.1° ocean/sea ice) development version of the Community Climate System Model (CCS) in configurations capable of running efficiently on up to 6380 processors. Technical issues related to the memory and performance scalability, load balancing multiple climate components, and improving I/O performance will also be discussed.

Tracer Advection using Characteristic Discontinuous Galerkin
Robert B. Lowrie
Computational Physics Group (CCS-2), Los Alamos National Laboratory

We will describe a variant of the discontinuous Galerkin (DG) method for tracer advection. The formulation is explicit in time and fully discrete, in that both space and time are discretized using the finite-element approach. Unlike space-time DG, time variations are eliminated through local tracking of the velocity characteristics. We refer to the method as Characteristic Discontinuous Galerkin (CDG). For a constant velocity field, CDG can attain any formal order of accuracy using only nearest-neighbor information. CDG's floating-point operation count is rather high for a single tracer quantity; however, the cost of advecting additional tracers is minimal. In one space dimension, CDG using a quadratic basis reduces to Prather's moment method (Prather, JGR 1986). Prather's original formulation is dimensionally split and restricted to logically-rectangular, Cartesian meshes. When used with incremental remap, CDG can be thought of as an extension of Prather's method to unsplit advection on general meshes and to higher orders of accuracy.
Comparisons will be made with Prather's method. Progress on enforcing solution bounds and monotonicity will also be discussed.

Characteristics-Based Flux-Form Semi-Lagrangian Methods for Atmospheric Simulation
Matthew R. Norman
North Carolina State University

The semi-Lagrangian (SL) approach, regarded for its large CFL stability and low phase error, is generally not considered scalable. This is likely because it is often coupled with semi-implicit (SI) treatment of fast dynamics which requires global communication. Presented here is an upwind and fully discrete SL finite volume (FV) method which accommodates any desired spatial interpolant to achieve high-order accuracy in time and space along with non-oscillatory results. The 1-D flux-form SL (FFSL) method of Lin and Rood is implemented on a characteristic decomposition of the conservation laws governing atmospheric motion. Beneficial properties of this scheme include good FV scalability, low numerical diffusion, trivial introduction of split-explicit subcycling of fast dynamics, stability at CFL numbers up to unity, and potential extension to CFL numbers greater than unity. Multi-dimensionality is currently achieved via a second-order accurate alternating Strang splitting. Other scalable FV treatments of the dynamics are discussed and compared to the present framework.
This method is implemented to simulate 2-D (x-z slice), non-hydrostatic, fully compressible, dry atmospheric dynamics. Three test cases are run to evaluate the method:
1) a convective bubble in a neutral atmosphere,
2) a density current in a neutral atmosphere, and
3) non-hydrostatic internal gravity waves in a stable atmosphere.
High-order accurate and non-oscillatory results are experimentally verified in the test cases. Split-explicit subcycling is implemented and results appear to be numerically stable without the traditional divergence damping. The relative merits of efficiency versus splitting error are discussed. Future extension to genuinely multi-dimensional simulation is also discussed.

Fully Implicit JFNK Solvers in the HOMME Code: A Case Study on Software Toolkit Integration
Roger Pawlowski
Sandia National Laboratories
In collaboration with Andrew Salinger, Damian Rouson and Roscoe Bartlett, Sandia National Laboratories and Kate Evans, Oak Ridge National Laboratory

As scientific software becomes more complex and specialized, a lightweight "toolkit" approach to solver technology has become attractive from a software engineering standpoint. The Trilinos framework is an example of such a toolkit, designed to provide scalable and robust solver technology for parallel large-scale applications. Over the past year, we have incorporated a Jacobian-Free Newton-Krylov solver from Trilinos/NOX to enable a fully-coupled, fully implicit solver in HOMME. This presentation will discuss this software engineering effort. We will give a brief review of the current technology for fully implicit methods. We will detail the software interface design and demonstrate the ease of integration of such packages. Finally, results of the fully implicit solver will be compared with the explicit/semi-implicit solvers currently in the HOMME application.

Applications of Anisotropic Adaptive Mesh Methods in Geophysical Fluid Dynamics
Matthew Piggott
Imperial College London

Many applications in geophysical fluid dynamics include large variations in spatial scales that are important to resolve in a numerical simulation. An example is the global ocean where dynamics at spatial scales of thousands of kms has strong two-way coupling with important processes occurring at the km scale, e.g. boundary layers, eddies and buoyancy driven flows interacting with bathymetry. Adaptive and unstructured mesh methods represent an efficient means to simulate these systems. In addition, smaller scale processes often have high aspect ratios and hence anisotropic mesh methods should be considered. In this talk our work applying anisotropic adaptive methods to geophysical fluid dynamics problems will be reviewed. In particular, error measures, mesh to mesh interpolation, load-balanced parallelism, and combined mesh optimisation and mesh movement will be discussed.

Exploiting Single Precision BLAS/LAPACK for the Efficient Implementation of the Implicit Discontinuous Galerkin Method
Jean-François Remacle
Université catholique de Louvain
In collaboration with K.Hillewaert, B.Helenbrook and P.Geuzaine

This presentation shows how to obtain near-optimal CPU efficiency for the Newton-Krylov-ILU iterative strategy by exploiting the data locality of the discontinuous Galerkin method (DGM). The particular data organisation of the DGM allows one to rewrite many operations during assembly of the tangent matrix and it's decomposition in terms of BLAS/LAPACK operations on contiguous vectors and matrices of considerable size, thus leading to near-optimal flop rates. The data structures and operations required to achieve this are explained and then the floating point operation (FLOP) rate of the code is compared to the theoretical maximum as well as the rates of the basic operations attained by different implementations of the BLAS and LAPACK standards. The advantages of using single-precision data for assembly speed and linear algebra steps are shown.

Revisiting the Anticipated Potential Vorticity Method
Todd Ringler
Los Alamos National Laboratory

The Anticipated Potential Vorticity Method (APVM) was proposed by Sadourny and Basdevant in 1985 as a closure for large-scale geophysical flows. In this closure, potential enstrophy is dissipated through an insightful choice of the vorticity used to compute the nonlinear Coriolis force in the momentum equation. The notable aspect of the closure is that while it dissipates potential enstrophy, it conserves total energy. The APVM is briefly reviewed and slightly reinterpreted before a discussion of the implementation of the method within an arbitrarily-structured C-grid model. The initial results are compelling; the APVM leads to coherent vortical structures even as strong filamentation occurs. In closing, the prospects for implementing the APVM closure in the 3D global atmosphere and ocean models will be discussed.

A New Preconditioning Strategy for a Spectral-element-based Magnetohydrodynamics Solver
Duane Rosenberg
National Center for Atmospheric Research
In collaboration with Amik St-Cyr, NCAR

We present a new two-level preconditioning strategy for a pseudo-Laplacian operator that emerges from an explicit spectral element (SEM) discretization of the incompressible magnetohydrodynamics (MHD) equations. The method extends work done by Fischer [JCP 133:84 (1997)], in which a coarse solver is added to an overlapping additive Schwarz fine grid preconditioner. In this work, there are several features regarding the handling of the overlap and communication of corner data that are new. In addition, the fine grid preconditioner utilizes a recent theoretical result in [SISC, 29(6),pp 2402-2425], which enables us to trivially convert the overlapping Schwarz method to its optimized counterpart. The coarse solver uses a new finite element discretization of the weak Laplacian operator on the skeleton of the fine grid. We present preliminary results for this two--level preconditioner acting on conforming grids. We also discuss modifications to our preconditioning strategy to account for nonconforming discretizations that arise from the adaptive grid mechanics provided by the framework in which the MHD solver resides.

Modeling Atmospheric Circulations with High-resolution Methods
Piotr K. Smolarkiewicz
National Center for Atmospheric Research

Because of the enormous range of scales and physical phenomena, direct numerical simulation of the Earth's weather and climate is far beyond the reach of current computational technology. This necessitates careful selection of numerical tools suitable for simulations of atmospheric circulations. The recently documented in the literature implicit large eddy simulation (ILES) approach based on non-oscillatory finite-volume (NFV) schemes [1] appears particularly useful; as it enables the representation of high Reynolds number flows without need for explicit subgrid scale models. This talk will highlight an anelastic multiscale NFV model [2] built on the second-order-accurate high-resolution multidimensional positive definite advection transport algorithm (MPDATA) [3]. Among the chief properties of MPDATA are the preservation of sign of scalar quantities such as density or water content, the nonlinear stability of the simulation, and the suppression of unphysical oscillations. To illustrate the strengths of the ILES approach, the results will be shown of several diverse calculations of turbulent flows ranging from a canonical decaying turbulence in a triply-periodic box to idealized terrestrial (and solar) climates. These results will be contrasted with pseudospectral calculations and theoretical estimations to demonstrate that fluid models built on high-resolution methods can compare favorably with (formally) high-order codes.

[1] Implicit Large Eddy Simulation: Computing Turbulent Fluid Dynamics, Eds. Grinstein FF, Margolin LG, Rider W, Cambridge University Press. 2007; 2007, 552 pp.
[2] J.M. Prusa, P.K. Smolarkiewicz, A.A Wyszogrodzki, EULAG, a Computational Model for Multiscale Flows, Comput. Fluids, 37, 1193-1207, (2008).
[3] P.K. Smolarkiewicz, Multidimensional positive definite advection transport algorithm: an overview, Int. J. Numer. Meth. Fluids, 50, 1123-1144, (2006).

An Overview of CAM-HOMME and Recent Simulation Results
Mark Taylor
Sandia National Laboratories
In collaboration with Jim Edwards, NCAR, Kate Evans, ORNL, Aime Fournier, NCAR, Peter Lauritzen, NCAR, Amik St.Cyr, NCAR

NCAR's High-Order Method Modeling Environment (HOMME) contains several cubed-sphere based global atmospheric dynamical cores. I will give an overview of the configuration of HOMME being tested within the atmospheric component (CAM) of the Community Climate System Model (CCSM). This configuration uses the spectral elements (a continuous Galerkin finite element method) with several desirable properties:

4th order accurate dynamics with efficient leapfrog time stepping
Local conservation (mass, energy, PV in 2D)
Isotropic dissipation
Quasi-monotone tracer advection (with RK2-SSP time stepping)
Demonstrated scalability to O (100K) processes at 1/4 degree and higher resolutions.

Local conservation is obtained via a "compatible" formulation of the spectral element method, meaning it has discrete analogs of the key integral properties of the spherical div, grad and curl operators. The efficient, scale-selective filters used for dissipation in the conventional spectral element method generated too much grid imprinting in the presence of the CAM subgrid physics. This grid imprinting was eliminated by replacing the filters with a grid-independent grad4 operator modeled after that used in CAM-Eulerian. For tracer advection CAM-HOMME uses the Lagrange+remap approach (SJ Lin, MWR, 2004): A monotone piecewise cubic reconstruction in the vertical (Zerroukat et. al, QJRMS 2005), coupled with a sign-preserving or quasi-monotone horizontal advection scheme in the horizontal. (Taylor, St.Cyr, Fourner, ICCS 2009). I will present preliminary results from realistic cyclic year 2000 simulations (coupled with the CCSM land model and prescribed ocean/ice conditions) at 1 degree (100km) resolution as well some results from an ultra-high resolution (1/8 degree) simulations.

Construction and Performance of Exponential Integrators
Mayya Tokman
University of California, Merced

Exponential integrators offer an alternative way to integrate nonlinear systems of ODEs compared to explicit and implicit methods. A particularly advantageous characteristic of exponential schemes is their efficiency in integrating large stiff systems. While first exponential schemes were introduced in early 1960's they haven't attained wide popularity since they were considered prohibitively expensive. A proposal to combine Krylov projection algorithm and exponential integration alleviated this computational constraint and was followed by a resurgence of interest in these methods. In this talk we will describe what are the building blocks in construction of an efficient exponential integrator and provide an overview of research efforts in development of competitive schemes of this type. We will discuss important aspects relating to construction, analysis and performance of these methods such as optimization of computational complexity per integration step, derivation of order conditions using B-series, development of adaptive integration strategies and sources of efficiency compared to standard schemes.

A Low Storage, High Order Discontinuous Galerkin Method for Curvilinear Domains
Tim Warburton
Department of Computational and Applied Mathematics, Rice University

A range of important issues relating to the practical application of discontinuous Galerkin time-domain (DGTD) method for wave propagation will be discussed. Several issues arise in time stepping for realistic simulations using DGTD. Multiple intrinsic length and time scales are induced by geometry, material properties, mesh generation and the use of high order polynomial approximation. I will describe a simple filtering process that allows us to reduce the anomalously large gradients intrinsic to high-order polynomial approximation and consequently use a larger time step. Additionally, local time stepping algorithms will be demonstrated for meshes with a large range of element sizes.
Recently we have been testing high performance implementations for DGTD on graphics processing units. The results are very promising. I will discuss the performance of the low storage DGTD for curvilinear domains on this platform and how its design was influenced by the memory constraints of these high performance computer devices . Examples will be shown of wave propagation in curvilinear volumes and on curvilinear surfaces.

Time Integrators for Atmospheric Fluid Dynamics
Jörg Wensch
Technical University of Dresden

This talk gives an overview on the methods and techniques that have been developed for the solution of ordinary differential equations in the last decades. In particular, Runge-Kutta-type methods, multistep methods, general linear methods, linearly-implicit methods, partitioned methods, order and stability concepts and implicit vs. explicit methods are under discussion.
We present further generalizations of split-explicit Runge-Kutta methods to RK-Type methods and to cyclic multistep methods (PEER methods) with improved stability and order properties for the Euler equations on staggered grids.

Practical Issues in Comparing Explicit and Implicit Methods on Parallel Computers
Patrick Worley
Oak Ridge National Laboratory

Models characterizing the relative performance of explicit and implicit PDE solvers on parallel computers have been available for over 20 years. However, constants matter in practice, and the relative merits of explicit and implicit approaches will depend on the computer platform, problem and algorithm details, problem size, and processor count. In this talk, we describe the performance of some operators commonly used in these performance models, and discuss the implications for explicit and implicit methods on the current generation of High Performance Computing (HPC) parallel computers.