# 4. Introducing the Community Earth System Model (CESM)#

This notebook is part of The Climate Laboratory by Brian E. J. Rose, University at Albany.

### What is it?#

• CESM is one of a handful of complex coupled GCMs that are used as part of the IPCC process.

• Developed and maintained at NCAR (Boulder CO) by a group of climate scientists and software engineers.

• “Community” refers to the fact that the code is open-source, with new pieces contributed by a wide variety of users.

I use CESM in my own research. We are going to be using CESM in this course. For lots more information about CESM:

https://www.cesm.ucar.edu/models/

### Key components of CESM:#

• Atmospheric model (AGCM)

• Community Atmsophere Model (CAM)

• Ocean model (OGCM)

• Parallel Ocean Program (POP)

• Land surface model

• Community Land Model (CLM)

• Sea ice model

• Community Ice CodE (CICE)

The software is somewhat modular, so different submodels can be combined together depending on the nature of the scientific problem at hand and the available computer power.

Recall that we saw this schematic of different ways to represent the ocean in climate models:

## Our numerical experiments with CESM#

### Atmosphere#

• Horizontal resolution about 2º lat/lon

• AGCM solves the fundamental equations:

• Conservation of momentum, mass, energy, water, equation of state

• At 2º we resolve the synoptic-scale dynamics

• storm tracks and cyclones.

• We do NOT resolve the mesoscale and smaller

• thunderstorms, individual convective events, clouds

• These all must be parameterized.

• Model also solves equations of radiative transfer. This takes account of

• composition of the atmosphere and the absorption properties of different gases

### Sea ice#

• Resolution of 1º.

• Thermodynamics (conservation of energy, water and salt)

• determines freezing and melting

• Dynamics (momentum equations)

• determine ice motion and deformation.

• Complex! Sea ice is sort of a mixture of a fluid and a solid.

### Land surface model#

• Same resolution as atmosphere.

• Determines surface fluxes of heat, water, momentum (friction) based on prescribed vegetation types.

• Don’t actually know much about how it works!

• Great topic for someone to dig in to for their term project.

### Ocean#

• Same grid as sea ice, 1º.

• Exchanges heat, water, and momentum with the atmosphere and sea ice

• Receives runoff from the land surface (rivers)

• Full 3D simulation of the currents.

### Experimental setup#

Model is given realistic atmospheric composition, realistic solar radiation, etc.

We perform a preindustrial control run to get a baseline simulation, and take averages of several years (because the model has internal variability – every year is a little bit different)

We then (later) we will change something, e.g. double the atmospheric $$CO_2$$!

And allow the model to adjust toward a new equilibrium, just as we did with the toy energy balance model.

## Browsing input data with xarray#

First, let’s take a look at some of the ingredients that go into the control run. All of the necessary data will be served up by a special data server sitting in the department, so you should be able to run this code to interact with the data on any computer that is connected to the internet.

### You need to be connected to the internet to run the code in this notebook#

You can browse the available data through a web interface here:

http://thredds.atmos.albany.edu:8080/thredds/catalog.html

Within this folder called CESM archive, you will find another folder called som_input which contains all the input files.

The data are all stored in NetCDF files, a standard file format for self-describing gridded data.

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr


We are going to use a package called xarray (abbreviated here as xr) to work with the datasets.

### Boundary conditions: continents and topography#

Here we are going to load the input topography file and take a look at what’s inside.

In this case we are passing it a URL to our online dataserver. We’ll put the URL in a string variable called datapath to simplify things later on.

cesm_data_path = "http://thredds.atmos.albany.edu:8080/thredds/dodsC/CESMA/"
cesm_input_path = cesm_data_path + "som_input/"
#  Notice that in Python we can easily concatenate strings together just by adding them
fullURL = cesm_input_path + "USGS-gtopo30_1.9x2.5_remap_c050602.nc"
print( fullURL)

http://thredds.atmos.albany.edu:8080/thredds/dodsC/CESMA/som_input/USGS-gtopo30_1.9x2.5_remap_c050602.nc

#  Now we actually open the dataset
topo = xr.open_dataset( fullURL )
topo

<xarray.Dataset>
Dimensions:       (lat: 96, lon: 144)
Coordinates:
* lat           (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 86.21 88.11 90.0
* lon           (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Data variables:
PHIS          (lat, lon) float64 ...
SGH           (lat, lon) float64 ...
SGH30         (lat, lon) float64 ...
LANDFRAC      (lat, lon) float64 ...
LANDM_COSLAT  (lat, lon) float64 ...
Attributes:
history:    Written on date: 20050602\ndefinesurf -remap -t /fs/cgd/csm/i...
make_ross:  true
topofile:   /fs/cgd/csm/inputdata/atm/cam/topo/USGS-gtopo30_10min_c050419.nc
gridfile:   /fs/cgd/csm/inputdata/atm/cam/coords/fv_1.9x2.5.nc
landmask:   /fs/cgd/csm/inputdata/atm/cam2/hrtopo/landm_coslat.nc

The Dataset object has several important attributes. Much of this should look familiar if you have worked with netCDF data before. The xarray package gives a very powerful and easy to use interface to the data.

We can access individual variables within the xarray.Dataset object as follows:

topo.PHIS

<xarray.DataArray 'PHIS' (lat: 96, lon: 144)>
[13824 values with dtype=float64]
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Attributes:
long_name:   surface geopotential
units:       m2/s2
from_hires:  true
filter:      remap

### Plotting the topography#

We will now read the geopotential and make a plot of the topography of the Earth’s surface as represented on the 2º grid. The code below makes a colorful plot of the topography. We also use the land-sea mask in order to plot nothing at grid points that are entirely ocean-covered.

Execute this code exactly as written first, and then play around with it to see how you might customize the graph.

g = 9.8  # gravity in m/s2
meters_per_kilometer = 1E3
height = topo.PHIS / g / meters_per_kilometer  # in kilometers
#  Note that we have just created a new xarray.DataArray object that preserves the axis labels
height.attrs['units'] = 'km'
height.name = 'height'
height

<xarray.DataArray 'height' (lat: 96, lon: 144)>
array([[2.85353559e+00, 2.85353559e+00, 2.85353559e+00, ...,
2.85353559e+00, 2.85353559e+00, 2.85353559e+00],
[2.77410830e+00, 2.78011779e+00, 2.78608862e+00, ...,
2.75485770e+00, 2.76155041e+00, 2.76794791e+00],
[2.61430262e+00, 2.63198888e+00, 2.64826593e+00, ...,
2.54851832e+00, 2.57282431e+00, 2.59469735e+00],
...,
[1.75532270e-06, 7.28060055e-07, 2.80413809e-07, ...,
1.60111973e-05, 8.22028425e-06, 3.93600870e-06],
[5.55512288e-06, 4.03232300e-06, 2.85670028e-06, ...,
1.25673050e-05, 9.80607842e-06, 7.47009192e-06],
[0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00]])
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Attributes:
units:    km

Let’s make a plot! xarray is able to automatically generate labeled plots. This is very handy for “quick and dirty” investigation of the data:

height.plot()

<matplotlib.collections.QuadMesh at 0x168ac0bb0>


If we want more control over the appearance of the plot, we can use features of matplotlib

#  A filled contour plot of topography with contours every 500 m
lev = np.arange(0., 6., 0.5)
fig1, ax1 = plt.subplots(figsize=(8,4))
# Here we are masking the data to exclude points where the land fraction is zero (water only)
cax1 = ax1.contourf( height.lon, height.lat,
height.where(topo.LANDFRAC>0), levels=lev)
ax1.set_title('Topography (km) and land-sea mask in CESM')
ax1.set_xlabel('Longitude')
ax1.set_ylabel('Latitude')
cbar1 = fig1.colorbar(cax1)


Note that at 2º resolution we can see many smaller features (e.g. Pacific islands). The model is given a fractional land cover for each grid point.

Here let’s plot the land-sea mask itself so we can see where there is at least “some” water:

fig2, ax2 = plt.subplots()
cax2 = ax2.pcolormesh( topo.lon, topo.lat, topo.LANDFRAC )
ax2.set_xlabel('Longitude'); ax2.set_ylabel('Latitude')
cbar2 = fig2.colorbar(cax2);


### Making nicer maps#

Notice that to make these plots we’ve just plotted the lat-lon array without using any map projection.

There are nice tools available to make better maps. We’ll leave that as a topic for another day. But if you’re keen to read ahead, check out:

http://scitools.org.uk/cartopy/

## Ocean boundary conditions#

Let’s load another file that contains some information about the ocean and its interaction with the atmosphere.

som_input = xr.open_dataset( cesm_input_path + "pop_frc.1x1d.090130.nc")
som_input

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 360, time: 12)
Coordinates:
* time     (time) object 0001-01-15 00:00:00 ... 0001-12-16 00:00:00
Dimensions without coordinates: lat, lon
Data variables:
area     (lat, lon) float64 ...
yc       (lat) float32 ...
xc       (lon) float32 ...
S        (time, lat, lon) float32 ...
T        (time, lat, lon) float32 ...
U        (time, lat, lon) float32 ...
V        (time, lat, lon) float32 ...
dhdx     (time, lat, lon) float32 ...
dhdy     (time, lat, lon) float32 ...
hblt     (time, lat, lon) float32 ...
qdp      (time, lat, lon) float32 ...
Attributes:
creation_date:  Fri Jan 30 10:22:53 MST 2009
comment:        This data is on a standard 1x1d grid.
calendar:       standard
author:         D. Bailey
note3:          qdp is computed from depth summed ocean column
note2:          all fields interpolated to T-grid
note1:          fields computed from 20-yr monthly means from pop
description:    Input data for DOCN7 mixed layer model from b40.999
source:         pop_frc.ncl
conventions:    CCSM data model domain description
title:          Monthly averaged ocean forcing from POP output

The ocean / sea ice models exist on different grids than the atmosphere (1º instead of 2º resolution).

Now we are going to look at the annual mean heat flux out of the ocean.

It is stored in the field qdp in the dataset we just opened.

The sign convention in CESM is that qdp > 0 where heat is going IN to the ocean. We will change the sign to plot heat going OUT of the ocean INTO the atmosphere (a more atmosphere-centric viewpoint).

som_input.qdp

<xarray.DataArray 'qdp' (time: 12, lat: 180, lon: 360)>
[777600 values with dtype=float32]
Coordinates:
* time     (time) object 0001-01-15 00:00:00 ... 0001-12-16 00:00:00
Dimensions without coordinates: lat, lon
Attributes:
spatial_op:  Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...

Unfortunately, here is a case in which the metadata are not very useful. There is no text description of what variable qdp actually is, or what its units are. (It is actually in units of W m$$^{-2}$$)

We can see that there are 12 x 180 x 360 data points. One 180 x 360 grid for each calendar month!

Now we are going to take the average over the year at each point.

We will use the power of xarray here to take the average over the time dimension, leaving us with a single grid on 180 latitude points by 360 longitude points:

(-som_input.qdp.mean(dim='time')).plot()

<matplotlib.collections.QuadMesh at 0x16959b4f0>


Now make a nice plot of the annual mean q-flux.

#  We can always set a non-standard size for our figure window
fig3, ax3 = plt.subplots(figsize=(10, 6))
lev = np.arange(-700., 750., 50.)
cax3 = ax3.contourf(som_input.xc, som_input.yc,
-som_input.qdp.mean(dim='time'),
levels=lev, cmap=plt.cm.bwr)
cbar3 = fig3.colorbar(cax3)
ax3.set_title( 'CESM: Prescribed heat flux out of ocean (W m$^{-2}$), annual mean',
fontsize=14 )
ax3.set_xlabel('Longitude', fontsize=14)
ax3.set_ylabel('Latitude', fontsize=14)
ax3.text(65, 50, 'Annual', fontsize=16 )
ax3.contour(topo.lon, topo.lat, topo.LANDFRAC, levels=[0.5], colors='k');


Notice all the spatial structure here:

• Lots of heat is going in to the oceans at the equator, particularly in the eastern Pacific Ocean.

• The red hot spots show where lots of heat is coming out of the ocean.

• Hot spots include the mid-latitudes off the eastern coasts of Asia and North America

• And also the northern North Atlantic.

All this structure is determined by ocean circulation, which we are not modeling here. Instead, we are prescribing these heat flux patterns as an input to the atmosphere.

This pattern changes throughout the year. Recall that we just averaged over all months to make this plot. We might want to look at just one month:

# select by month index (0 through 11)
som_input.qdp.isel(time=0)

<xarray.DataArray 'qdp' (lat: 180, lon: 360)>
array([[      nan,       nan,       nan, ...,       nan,       nan,       nan],
[      nan,       nan,       nan, ...,       nan,       nan,       nan],
[      nan,       nan,       nan, ...,       nan,       nan,       nan],
...,
[-5.75126 , -5.742314, -5.734764, ..., -5.792866, -5.774798, -5.761612],
[-5.533446, -5.537023, -5.540624, ..., -5.522859, -5.526363, -5.529893],
[-5.31631 , -5.314115, -5.311888, ..., -5.322703, -5.320605, -5.318473]],
dtype=float32)
Coordinates:
time     object 0001-01-15 00:00:00
Dimensions without coordinates: lat, lon
Attributes:
spatial_op:  Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...
#  select by array slicing (but for this you have to know the axis order!)
som_input.qdp[0,:,:]

<xarray.DataArray 'qdp' (lat: 180, lon: 360)>
array([[      nan,       nan,       nan, ...,       nan,       nan,       nan],
[      nan,       nan,       nan, ...,       nan,       nan,       nan],
[      nan,       nan,       nan, ...,       nan,       nan,       nan],
...,
[-5.75126 , -5.742314, -5.734764, ..., -5.792866, -5.774798, -5.761612],
[-5.533446, -5.537023, -5.540624, ..., -5.522859, -5.526363, -5.529893],
[-5.31631 , -5.314115, -5.311888, ..., -5.322703, -5.320605, -5.318473]],
dtype=float32)
Coordinates:
time     object 0001-01-15 00:00:00
Dimensions without coordinates: lat, lon
Attributes:
spatial_op:  Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...

Here we got just the first month (January) by specifying [0,:,:] after the variable name. This is called slicing or indexing an array. We are saying “give me everything for month number 0”. Now make the plot:

fig4, ax4 = plt.subplots(figsize=(10,4))
cax4 = ax4.contourf( som_input.xc, som_input.yc,
-som_input.qdp.isel(time=0),
levels=lev, cmap=plt.cm.bwr)
cbar4 = plt.colorbar(cax4)
ax4.set_title( 'CESM: Prescribed heat flux out of ocean (W m$^{-2}$)',
fontsize=14 )
ax3.set_xlabel('Longitude', fontsize=14)
ax3.set_ylabel('Latitude', fontsize=14)
ax4.text(65, 50, 'January', fontsize=12 );
ax4.contour(topo.lon, topo.lat, topo.LANDFRAC, levels=[0.5], colors='k');


For lots more help with using xarray to slice and dice your dataset, look at the online documentation:

http://xarray.pydata.org

## The “pre-industrial” control run#

Our control run is set up to simulate the climate of the “pre-industrial era”, meaning before significant human-induced changes to the composition of the atmosphere, nominally the year 1850.

Output from the control run is available on the same data server as above. Look in the folder called cpl_1850_f19 (Here cpl stands for “coupled model” with interactive ocean, 1850 indicated pre-industrial conditions, and f19 is a code for 2º the horizontal grid resolution).

There are output files for each active model component:

• atmosphere

• ocean

• sea ice

• land surface

The model produces monthly average output files for each component. We can load datasets from individual months, but there are also large concatenated files available that contain the entire output.

Let’s take a look at the atmosphere file. The file is called

cpl_1850_f19.cam.h0.nc

(the file extension .nc is used to indicate NetCDF format).

atm_control = xr.open_dataset(cesm_data_path + "cpl_1850_f19/concatenated/cpl_1850_f19.cam.h0.nc")
atm_control

<xarray.Dataset>
Dimensions:       (lev: 26, ilev: 27, time: 240, lat: 96, lon: 144, slat: 95,
slon: 144, nbnd: 2)
Coordinates:
* lev           (lev) float64 3.545 7.389 13.97 23.94 ... 929.6 970.6 992.6
* ilev          (ilev) float64 2.194 4.895 9.882 18.05 ... 956.0 985.1 1e+03
* time          (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
* lat           (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 86.21 88.11 90.0
* lon           (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
* slat          (slat) float64 -89.05 -87.16 -85.26 ... 85.26 87.16 89.05
* slon          (slon) float64 -1.25 1.25 3.75 6.25 ... 351.2 353.8 356.2
Dimensions without coordinates: nbnd
Data variables: (12/128)
hyam          (lev) float64 ...
hybm          (lev) float64 ...
hyai          (ilev) float64 ...
hybi          (ilev) float64 ...
P0            float64 ...
date          (time) int32 ...
...            ...
VTH3d         (time, ilev, lat, lon) float32 ...
VU            (time, lev, lat, lon) float32 ...
VV            (time, lev, lat, lon) float32 ...
W2d           (time, lat, lon) float32 ...
WTH3d         (time, ilev, lat, lon) float32 ...
Z3            (time, lev, lat, lon) float32 ...
Attributes: (12/16)
Conventions:                     CF-1.0
source:                          CAM
case:                            cpl_1850_f19
title:                           UNSET
logname:                         br546577
host:                            snow-30.rit.alba
...                              ...
history:                         Tue Feb 26 17:17:15 2019: ncrcat atm/his...
NCO:                             4.6.8
DODS.strlen:                     8
DODS.dimName:                    chars
DODS_EXTRA.Unlimited_Dimension:  time

Lots of different stuff! These are all the different quantities that are calculated as part of the model simulation. Every quantity represents a monthly average.

atm_control.co2vmr

<xarray.DataArray 'co2vmr' (time: 240)>
[240 values with dtype=float64]
Coordinates:
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
Attributes:
long_name:  co2 volume mixing ratio

This is the amount of CO2 in the atmosphere (about 285 parts per million by volume). It is prescribed in these simulations and does not change.

One nice thing about xarray.DataArray objects is that we can do simple arithmetic with them (already seen several examples of this in the notes above). For example, change the units of CO2 amount to ppm:

atm_control.co2vmr * 1E6

<xarray.DataArray 'co2vmr' (time: 240)>
array([284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7,
284.7, 284.7, 284.7, 284.7, 284.7, 284.7])
Coordinates:
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00

Here’s another variable:

atm_control.SOLIN

<xarray.DataArray 'SOLIN' (time: 240, lat: 96, lon: 144)>
[3317760 values with dtype=float32]
Coordinates:
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Attributes:
units:              W/m2
long_name:          Solar insolation
cell_methods:       time: mean

Apparently this is the incoming solar radiation or insolation, with shape (240,96,144) meaning it’s got 240 months, 96 latitude points and 144 longitude points.

## Exercise: Taking a time average#

• Take the time average of the SOLIN field. Store the result as a new variable.

• What are the dimensions of the resulting data array? What would be a good way to visualize this quantity?

atm_control.SOLIN.mean(dim='time')

<xarray.DataArray 'SOLIN' (lat: 96, lon: 144)>
array([[172.81734, 172.81734, 172.81734, ..., 172.81734, 172.81734,
172.81734],
[173.0204 , 173.0204 , 173.02022, ..., 173.01988, 173.02003,
173.02028],
[173.62914, 173.62895, 173.62877, ..., 173.62932, 173.6288 ,
173.62898],
...,
[172.12259, 172.123  , 172.123  , ..., 172.12276, 172.12286,
172.12274],
[171.51128, 171.51137, 171.5114 , ..., 171.51158, 171.51155,
171.5115 ],
[171.30817, 171.30817, 171.30817, ..., 171.30817, 171.30817,
171.30817]], dtype=float32)
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5

## Exercise: Plotting the time average insolation#

1. Make a well-labeled plot of the time-averaged insolation (using the variable you stored above).

2. Is there a way to further reduce the dimensionality of the data, and plot the information in a different way?

Remember that you can apply the .mean() operation across any number of named dimensions in a data array.

## Comparing the control run with the observed energy budget#

Recall that our investigations so far have been guided by this figure of the observed annual, global mean energy budget:

## Exercise: Thinking about how to compute a global average#

In order to compare these numbers with the control run, we need to take global averages of the data. What do we mean by global average?

Before proceeding with these notes, try to answer the following question:

Why does it not make sense to simply average over each data point on a latitude-longitude grid?

## Weighting for global average#

The global average needs to weighted by the area of each grid cell, which is proportional to the cosine of latitude (do you understand why?)

We can implement this in xarray as follows:

#  Take the cosine of latitude (first converting to radians)
coslat

<xarray.DataArray 'lat' (lat: 96)>
array([6.12323400e-17, 3.30633693e-02, 6.60905843e-02, 9.90455303e-02,
1.31892171e-01, 1.64594590e-01, 1.97117027e-01, 2.29423920e-01,
2.61479941e-01, 2.93250037e-01, 3.24699469e-01, 3.55793847e-01,
3.86499169e-01, 4.16781860e-01, 4.46608807e-01, 4.75947393e-01,
5.04765538e-01, 5.33031729e-01, 5.60715057e-01, 5.87785252e-01,
6.14212713e-01, 6.39968541e-01, 6.65024572e-01, 6.89353409e-01,
7.12928448e-01, 7.35723911e-01, 7.57714870e-01, 7.78877279e-01,
7.99187997e-01, 8.18624815e-01, 8.37166478e-01, 8.54792713e-01,
8.71484244e-01, 8.87222819e-01, 9.01991230e-01, 9.15773327e-01,
9.28554038e-01, 9.40319390e-01, 9.51056516e-01, 9.60753676e-01,
9.69400266e-01, 9.76986831e-01, 9.83505075e-01, 9.88947871e-01,
9.93309266e-01, 9.96584493e-01, 9.98769969e-01, 9.99863305e-01,
9.99863305e-01, 9.98769969e-01, 9.96584493e-01, 9.93309266e-01,
9.88947871e-01, 9.83505075e-01, 9.76986831e-01, 9.69400266e-01,
9.60753676e-01, 9.51056516e-01, 9.40319390e-01, 9.28554038e-01,
9.15773327e-01, 9.01991230e-01, 8.87222819e-01, 8.71484244e-01,
8.54792713e-01, 8.37166478e-01, 8.18624815e-01, 7.99187997e-01,
7.78877279e-01, 7.57714870e-01, 7.35723911e-01, 7.12928448e-01,
6.89353409e-01, 6.65024572e-01, 6.39968541e-01, 6.14212713e-01,
5.87785252e-01, 5.60715057e-01, 5.33031729e-01, 5.04765538e-01,
4.75947393e-01, 4.46608807e-01, 4.16781860e-01, 3.86499169e-01,
3.55793847e-01, 3.24699469e-01, 2.93250037e-01, 2.61479941e-01,
2.29423920e-01, 1.97117027e-01, 1.64594590e-01, 1.31892171e-01,
9.90455303e-02, 6.60905843e-02, 3.30633693e-02, 6.12323400e-17])
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
Attributes:
long_name:  latitude
units:      degrees_north
#  And divide by its mean value
weight_factor = coslat / coslat.mean(dim='lat')
#  Want to see what we just created?
weight_factor

<xarray.DataArray 'lat' (lat: 96)>
array([9.72048516e-17, 5.24872953e-02, 1.04917196e-01, 1.57232372e-01,
2.09375617e-01, 2.61289912e-01, 3.12918491e-01, 3.64204898e-01,
4.15093052e-01, 4.65527308e-01, 5.15452516e-01, 5.64814085e-01,
6.13558038e-01, 6.61631075e-01, 7.08980627e-01, 7.55554920e-01,
8.01303024e-01, 8.46174915e-01, 8.90121527e-01, 9.33094803e-01,
9.75047755e-01, 1.01593451e+00, 1.05571035e+00, 1.09433178e+00,
1.13175659e+00, 1.16794383e+00, 1.20285394e+00, 1.23644875e+00,
1.26869152e+00, 1.29954700e+00, 1.32898144e+00, 1.35696266e+00,
1.38346006e+00, 1.40844466e+00, 1.43188916e+00, 1.45376790e+00,
1.47405697e+00, 1.49273418e+00, 1.50977911e+00, 1.52517311e+00,
1.53889936e+00, 1.55094285e+00, 1.56129040e+00, 1.56993071e+00,
1.57685432e+00, 1.58205366e+00, 1.58552305e+00, 1.58725870e+00,
1.58725870e+00, 1.58552305e+00, 1.58205366e+00, 1.57685432e+00,
1.56993071e+00, 1.56129040e+00, 1.55094285e+00, 1.53889936e+00,
1.52517311e+00, 1.50977911e+00, 1.49273418e+00, 1.47405697e+00,
1.45376790e+00, 1.43188916e+00, 1.40844466e+00, 1.38346006e+00,
1.35696266e+00, 1.32898144e+00, 1.29954700e+00, 1.26869152e+00,
1.23644875e+00, 1.20285394e+00, 1.16794383e+00, 1.13175659e+00,
1.09433178e+00, 1.05571035e+00, 1.01593451e+00, 9.75047755e-01,
9.33094803e-01, 8.90121527e-01, 8.46174915e-01, 8.01303024e-01,
7.55554920e-01, 7.08980627e-01, 6.61631075e-01, 6.13558038e-01,
5.64814085e-01, 5.15452516e-01, 4.65527308e-01, 4.15093052e-01,
3.64204898e-01, 3.12918491e-01, 2.61289912e-01, 2.09375617e-01,
1.57232372e-01, 1.04917196e-01, 5.24872953e-02, 9.72048516e-17])
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0

### An alternative: use weights already provided in the dataset#

You will find that many gridded datasets already provide a field that gives accurate area weighting.

In the case of the CESM output, the field is called gw

weight_factor2 = atm_control.gw / atm_control.gw.mean(dim='lat')

weight_factor2

<xarray.DataArray 'gw' (lat: 96)>
array([0.00656136, 0.05248012, 0.10490285, 0.15721088, 0.209347  ,
0.26125419, 0.31287572, 0.36415511, 0.41503631, 0.46546367,
0.51538206, 0.56473688, 0.61347417, 0.66154063, 0.70888371,
0.75545164, 0.80119349, 0.84605925, 0.88999985, 0.93296725,
0.97491447, 1.01579563, 1.05556604, 1.09418219, 1.13160188,
1.16778418, 1.20268952, 1.23627974, 1.2685181 , 1.29936936,
1.32879977, 1.35677717, 1.38327095, 1.40825214, 1.43169343,
1.45356918, 1.47385547, 1.49253013, 1.50957273, 1.52496463,
1.538689  , 1.55073084, 1.56107698, 1.56971611, 1.57663877,
1.5818374 , 1.58530632, 1.58704173, 1.58704173, 1.58530632,
1.5818374 , 1.57663877, 1.56971611, 1.56107698, 1.55073084,
1.538689  , 1.52496463, 1.50957273, 1.49253013, 1.47385547,
1.45356918, 1.43169343, 1.40825214, 1.38327095, 1.35677717,
1.32879977, 1.29936936, 1.2685181 , 1.23627974, 1.20268952,
1.16778418, 1.13160188, 1.09418219, 1.05556604, 1.01579563,
0.97491447, 0.93296725, 0.88999985, 0.84605925, 0.80119349,
0.75545164, 0.70888371, 0.66154063, 0.61347417, 0.56473688,
0.51538206, 0.46546367, 0.41503631, 0.36415511, 0.31287572,
0.26125419, 0.209347  , 0.15721088, 0.10490285, 0.05248012,
0.00656136])
Coordinates:
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0

### Compute the global, time average insolation#

#  Compute the global, time average insolation using our two different weight factors
#  Notice that we can apply the .mean() operation simultaneously over several dimensions!
(atm_control.SOLIN * weight_factor).mean(dim=('time', 'lon', 'lat'))
(atm_control.SOLIN * weight_factor2).mean(dim=('time', 'lon', 'lat'))

<xarray.DataArray ()>
array(340.30586262)

These numbers should both be very close to 340.3

This value is the global average insolation in units of W m$$^{-2}$$.

## Exercise: plotting a global average timeseries#

Plot a timeseries of the global average surface temperature in the control simulation.

Surface temperature is called 'TS' in the dataset.

Make a plot of the global average TS with time on the x axis. Make sure your global average is properly weighted as discussed above.

TSglobal = (atm_control.TS * weight_factor).mean(dim=('lon','lat'))
TSglobal

<xarray.DataArray (time: 240)>
array([285.72898682, 285.49401299, 286.27904211, 287.41386972,
288.27003459, 289.09844124, 289.31503214, 289.14779089,
288.48446959, 287.43825379, 286.30005108, 285.4822201 ,
285.52587345, 285.70978305, 286.51907145, 287.39663259,
288.43036138, 289.27239843, 289.49334981, 289.2270018 ,
288.56324199, 287.45506122, 286.36152137, 285.68297431,
285.64115059, 286.02903336, 286.66517814, 287.57771538,
288.5114103 , 289.15209553, 289.58562369, 289.51435914,
288.63846921, 287.62349476, 286.53952026, 285.71252715,
285.37900612, 285.68432086, 286.42280295, 287.53516323,
288.52624785, 289.36963343, 289.70397247, 289.38845523,
288.56153317, 287.61485545, 286.58915711, 286.02656523,
285.6876509 , 285.80444733, 286.77147652, 287.6000804 ,
288.47466646, 289.30900949, 289.54640526, 289.27115635,
288.60119008, 287.59500245, 286.47227001, 285.94176738,
285.50164567, 286.04132674, 286.69947865, 287.55425306,
288.51615009, 289.240104  , 289.53590774, 289.28245353,
288.4526808 , 287.4798987 , 286.32498383, 285.64667703,
285.53638463, 285.85984702, 286.48845149, 287.36777131,
288.34274536, 289.12537178, 289.46937558, 289.41924937,
...
288.66900064, 289.31232564, 289.71310487, 289.44685327,
288.71353149, 287.64889492, 286.73669719, 285.88254536,
285.93817813, 286.05794423, 286.63321729, 287.70598391,
288.73680573, 289.24659283, 289.6523757 , 289.31808086,
288.48550632, 287.3645237 , 286.46258606, 285.84163214,
285.38701708, 285.81055181, 286.62392007, 287.61150279,
288.43171261, 289.12028406, 289.4036265 , 289.16730135,
288.38935887, 287.26655169, 286.38843632, 285.56931263,
285.34475367, 285.70284077, 286.53049737, 287.37161667,
288.31928104, 288.98644684, 289.40416911, 289.18403561,
288.53231764, 287.34320592, 286.2297441 , 285.59461117,
285.26074355, 285.66897701, 286.59709518, 287.33668608,
288.406746  , 289.1222094 , 289.49321334, 289.45735775,
288.57569049, 287.47754096, 286.72450398, 286.1310169 ,
285.91070994, 286.24678875, 286.69563966, 287.79494103,
288.87472496, 289.55314895, 289.76593884, 289.43723345,
288.66578906, 287.63208369, 286.60205948, 286.15867963,
285.85908296, 285.81623454, 286.62021016, 287.67040383,
288.6576103 , 289.17532218, 289.57529292, 289.43251147,
288.62634617, 287.4513469 , 286.34874094, 285.93312728])
Coordinates:
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
# Note we could just do a TSglobal.plot() here!
#  But this is an example of using matplotlib directly to get more control over the figure

fig, ax = plt.subplots()
ax.plot(TSglobal)
ax.set_xlabel('Months')
ax.set_ylabel('Temperature (K)')
ax.grid()
ax.set_title('Global mean temperature in the CESM slab ocean control simulation');


### Discussion point#

What do you see in this graph? Do you have any ideas about why the global average temperature looks like this?

Also, what is the time average global-average surface temperature in this simulation?

TSglobal.mean(dim='time')

<xarray.DataArray ()>
array(287.54590571)

### Finding the radiative fluxes in the model output#

Now that you can take global averages and time averages, we can compare some energy budget values against observations.

The model output contains lots of diagnostics about the radiative fluxes. Here are some CESM naming conventions to help you find the appropriate output fields:

• All variables whose names being with 'F' are an energy flux of some kind.

• Most have a four-letter code, e.g. 'FLNT'

• 'FL' means longwave flux (i.e. terrestrial)

• 'FS' means shortwave flux (i.e. solar)

• The third letter indicates direction of the flux:

• 'U' = up

• 'D' = down

• 'N' = net

• The fourth letter indicates the location of the flux:

• 'T' = top of atmosphere

• 'S' = surface

• So 'FLNT' means ‘net longwave flux at the top of atmosphere’, i.e. the outgoing longwave radiation or OLR.

You wil see that these are all 240 x 96 x 144 – i.e. a two-dimensional grid for every month in the simulation.

atm_control.FLNT

<xarray.DataArray 'FLNT' (time: 240, lat: 96, lon: 144)>
[3317760 values with dtype=float32]
Coordinates:
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Attributes:
units:              W/m2
long_name:          Net longwave flux at top of model
cell_methods:       time: mean

## Exercise: compute terms in the planetary energy budget#

Compute annual, global averages of the following four quantities.

1. Incoming solar radiation (or insolation)

3. Planetary albedo (remember this is a ratio of outgoing to incoming solar radiation)

Compare your results briefly to the observations.

## A few more tidbits#

Feel free to keep exploring the data!

Many other fields are four-dimensional (time, level, latitude, longitude).

For example, here the air temperature at every point and every month:

atm_control.T

<xarray.DataArray 'T' (time: 240, lev: 26, lat: 96, lon: 144)>
[86261760 values with dtype=float32]
Coordinates:
* lev      (lev) float64 3.545 7.389 13.97 23.94 ... 867.2 929.6 970.6 992.6
* time     (time) object 0001-02-01 00:00:00 ... 0021-01-01 00:00:00
* lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
* lon      (lon) float64 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Attributes:
mdims:         1
units:         K
long_name:     Temperature
cell_methods:  time: mean

Often we want to sample the data at a particular place and time. xarray gives us simple ways to do that.

For example, we can interpolate to a particular location in latitude and longitude (here it’s the coordinates of Albany NY):

Tlocal = atm_control.T.interp(lat=42.75, lon=(360-73.8))
Tlocal


We can also use time indexing to pick out a particular year and month:

#  The .sel notation mean "select" along the given coordinate
#  The string that follows is year-month. Our simulation begins in year 0001.
Tlocal.sel(time='0020-01')  # a particular January


Now, for example, we can plot the temperature as a function of pressure at this place and time:

Tlocal.sel(time='0020-01').plot()


## Credits#

This notebook is part of The Climate Laboratory, an open-source textbook developed and maintained by Brian E. J. Rose, University at Albany.