The GriddedData class

This notebook introduces the GriddedData class of pyaerocom. The GriddedData class is the fundamental base class for the analysis of model data. The underlying data type is iris.cube.Cube which was extended, for instance by allowing direct imports of netCDF files when creating an instance of GriddedData (i.e. by passing the filename and specifying the variable name on initialisation). This notebook introduces some of the features of the GriddedData class. Starting with some imports:

import pyaerocom as pya
from warnings import filterwarnings
Initating pyaerocom configuration
Checking database access...
Checking access to: /lustre/storeA
Access to lustre database: True
Init data paths for lustre
Expired time: 0.016 s

Let’s get a test file to load

test_files =
for name, filepath in test_files["models"].items(): print("%s\n%s\n" %(name, filepath))


Let’s pick out the ECMWF OSUITE test file and load the data directly into an instance of the GriddedData class. The GriddedData class takes either preloaded instances of the iris.cube.Cube class as input, or a valid netCDF file path. The latter requires specification of the variable name which is then filtered from the data stored in the netCDF file (which may contain multiple variables. The following example imports the data for the aerosol optical density at 550 nm. The string representation of the GriddedData class (see print at end of following code cell) was slitghtly adapted from the underlying Cube object.


fpath = test_files["models"]["ecmwf_osuite"]
data = pya.GriddedData(input=fpath, var_name="od550aer", data_id="ECMWF_OSUITE")
Overwriting unit unknown in cube od550aer with value "1"
pyaerocom.GriddedData: ECMWF_OSUITE
Grid data: Dust Aerosol Optical Depth at 550nm / (1) (time: 365; latitude: 451; longitude: 900)
     Dimension coordinates:
          time                                 x              -               -
          latitude                             -              x               -
          longitude                            -              -               x
          Conventions: CF-1.0
          NCO: 4.7.2
          computed: False
          concatenated: False
          data_id: ECMWF_OSUITE
          from_files: ['/lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/rename...
          history: Tue Mar 20 13:08:49 2018: ncks -7 -O -o -x -v time_bnds
          history_of_appended_files: Tue Mar 20 02:09:15 2018: Appended file /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT/renamed//
          invalid_units: ~
          is_at_stations: False
          nco_openmp_thread_number: 1
          outliers_removed: False
          reader: None
          region: None
          regridded: False
          ts_type: daily
          var_name: od550aer
          var_name_read: n/d
          year: 2018
     Cell methods:
          mean: time

Remark on longitude definition

If the longitudes in the original NetCDF file are defined as:

\[0 \leq\,\text{lon}\,\leq360\]

then, pyaerocom converts automatically to:


when an instance of the GriddedData class is created (see print statment above Rolling longitudes to -180 -> 180 definition). This is, for instance, the case for the ECMWF OSUITE data files.

Basic attributes of the GriddedData class

In the following cells, some of the most important attributes are introduced. These are mostly reimplementations of the underlying iris.Cube data object, which is stored in and can be accessed via the GriddedData.cube attribute. For instance the attribute GriddedData.longitude will access GriddedData.grid.coord("longitude"), GriddedData.latitude will return GriddedData.grid.coord("latitude") and GriddedData.time will return the time dimension array (GriddedData.grid.coord("time")).


Side note: the unit is obviously not specified in this dataset, which is part of the game, unfortunately, when working with external data…

data.longitude.points.min(), data.longitude.points.max()
(-180.0, 179.60000610351562)
data.latitude.points.min(), data.latitude.points.max()
(-90.0, 90.0)
data.time.points.min(), data.time.points.max()
(0.0, 364.0)
tstamps = data.time_stamps()
print(tstamps[0], tstamps[-1])
2018-01-01T00:00:00.000000 2018-12-31T00:00:00.000000

If you do not specify the variable type, an Exception is raised, that will get you some information about what variables are available in the file (if the file is readable using the iris.load method).

    data = pya.GriddedData(input=fpath)
except pya.exceptions.NetcdfError as e:
    print("This did not work...error message: %s" %repr(e))
This did not work...error message: NetcdfError("Could not load single cube from /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/renamed/ Please specify var_name. Input file contains the following variables: ['od550dust', 'od550oa', 'od550aer', 'od550so4', 'od550bc']")

Also, if you parse an invalid variable name, you will get some hint.

    data = pya.GriddedData(input=fpath, var_name="Blaaa")
except Exception as e:
    print("This also did not work...error message: %s" %repr(e))
This also did not work...error message: NetcdfError('Variable Blaaa not available in file /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/renamed/')

You can have a quick look at the data using the class-own quickplot method

data.quickplot_map(time_idx=0, vmin=0, vmax=1, c_over="r");

Why not load some of the other variables…

data_bc = pya.GriddedData(fpath, var_name="od550bc", data_id="ECMWF_OSUITE")
data_so4 = pya.GriddedData(fpath, var_name="od550so4", data_id="ECMWF_OSUITE")
Overwriting unit unknown in cube od550bc with value "1"
Overwriting unit unknown in cube od550so4 with value "1"

… and plot them as well


Apply custom crop and plot

data_so4_cropped = data_so4.crop(lon_range=(-30, 30),
                                 lat_range=(10, 60))

data_so4_cropped.quickplot_map(time_idx='25-2-2018', xlim=(-100, 100), ylim=(-70, 70));

Change resolution

Downscale to 2x2 resolution:

lons = np.arange(-180, 180, 2)
lats = np.arange(-90, 90, 2)

data_lowres = data.interpolate(longitude=lons, latitude=lats)
Interpolating data of shape (365, 451, 900). This may take a while.
Successfully interpolated cube

And plot:

fig =data_lowres.quickplot_map()

Area weighted mean

Retrieve area weighted mean from data

area_mean = data.get_area_weighted_timeseries()
Trying to infer ts_type in StationData ECMWF_OSUITE for variable od550aer

… more to come

This tutorial is not yet completed as the GriddedData class is currently under development.