Rainfall anomalies from Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS)

Keywords: datasets; CHIRPS, climate, rainfall, monthly


Rainfall anomalies are deviations of rainfall from long-run averages. They are useful for identifying wet and dry periods which can be linked to climatically influenced patterns such as flooding, river flows, and agricultural production.


In this real-world example, we will calculate rainfall anomalies for a selected African country using the CHIRPS monthly rainfall dataset. The standardised anomaly is calculated as:

\begin{equation} \text{Standardised anomaly }=\frac{x-m}{s} \end{equation}

Here, \(x\) is the seasonal mean, \(m\) is the long-term mean, and \(s\) is the long-term standard deviation.

This means we need a long-term reference period (m) and a period of interest (x) for which we’ll calculate the anomalies. This notebook names datasets ds_rf_m and ds_rf_x accordingly.
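As a quick illustration of the formula, the sketch below computes a single standardised anomaly with the Python standard library. The rainfall values and the helper name standardised_anomaly are hypothetical, and the population standard deviation is an assumption chosen to match the default behaviour of NumPy's std.

```python
from statistics import mean, pstdev


def standardised_anomaly(x, reference):
    """Return (x - m) / s, with m and s taken from the reference values."""
    m = mean(reference)    # long-term mean
    s = pstdev(reference)  # long-term (population) standard deviation, an assumption
    return (x - m) / s


# Hypothetical rainfall totals (mm) for one calendar month across the reference years
reference_mm = [80.0, 100.0, 120.0, 90.0, 110.0]

# Hypothetical rainfall (mm) for the same month in the period of interest
anomaly = standardised_anomaly(130.0, reference_mm)
print(round(anomaly, 2))
```

A positive value indicates wetter-than-average conditions and a negative value drier-than-average conditions, both expressed in units of standard deviations.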

The notebook outlines:

  1. Loading a shapefile for African countries and selecting a single country.

  2. Loading rainfall data and masking it to the selected country.

  3. Calculating monthly rainfall anomalies and plotting the result, aggregated over space, as a bar chart.

  4. Calculating and plotting monthly rainfall anomalies spatially.
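The core of step 3 can be sketched with NumPy on synthetic data. This is an illustration under stated assumptions, not the notebook's own code: the rainfall values are randomly generated, the series is assumed to be already averaged over space, and the first 30 years are arbitrarily taken as the reference period. Each calendar month is standardised against its own long-term mean and standard deviation. With xarray, the same per-month grouping is usually expressed with groupby("time.month").

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical spatially averaged monthly rainfall (mm): 40 years x 12 months
n_years = 40
rain = rng.gamma(shape=2.0, scale=40.0, size=(n_years, 12))

# Reference period (m): the first 30 years, an assumption for this sketch
ref = rain[:30]
m = ref.mean(axis=0)  # long-term mean for each calendar month, shape (12,)
s = ref.std(axis=0)   # long-term standard deviation for each calendar month

# Period of interest (x): every year, standardised month by month via broadcasting
anomalies = (rain - m) / s

# Flatten to a chronological monthly series, ready for a bar chart
monthly_series = anomalies.reshape(-1)
print(monthly_series.shape)
```

Within the reference years, each calendar month's anomalies have a mean of zero and a standard deviation of one by construction, which is a useful sanity check.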

Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

%matplotlib inline

# Force GeoPandas to use Shapely instead of PyGEOS
# In a future release, GeoPandas will switch to using Shapely by default.
import os
os.environ['USE_PYGEOS'] = '0'

import datacube
import numpy as np
import pandas as pd
import geopandas as gpd
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datacube.utils.geometry import Geometry, CRS
from deafrica_tools.spatial import xr_rasterize
from deafrica_tools.dask import create_local_dask_cluster

Set up a Dask cluster

Dask can be used to better manage memory use and conduct the analysis in parallel. For an introduction to using Dask with Digital Earth Africa, see the Dask notebook.

Note: We recommend opening the Dask processing window to view the different computations that are being executed; to do this, see the Dask dashboard in DE Africa section of the Dask notebook.

To use Dask, set up the local computing cluster using the cell below.

create_local_dask_cluster()

Analysis parameters

The following cell sets important parameters for the analysis:

  • country: The African country used to mask the dataset and restrict the analysis.

  • time_m: CHIRPS monthly rainfall is available from 1981. The long-term mean for rainfall anomalies is often calculated on a 30-year period, so we’ll use 1981 to 2011 in this example.

  • time_x: This is the period for which we want to calculate anomalies.

  • resolution: We’ll use 5,000 m, which is approximately equal to the spatial resolution of CHIRPS (~5 x 5 km).

  • dask_chunks: The size of the Dask chunks. Dask breaks data into manageable chunks that can be easily stored in memory, e.g. dict(x=1000, y=1000).
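To get a feel for what a chunk size implies, the sketch below counts how many chunks a raster would be split into. The raster dimensions and the helper name count_chunks are hypothetical, and the calculation only covers the two spatial dimensions.

```python
import math


def count_chunks(array_shape, chunk_shape):
    """Return the number of chunks along each dimension and the total count."""
    per_dim = [math.ceil(n / c) for n, c in zip(array_shape, chunk_shape)]
    return per_dim, math.prod(per_dim)


# Hypothetical raster of 1,200 rows (y) by 900 columns (x), chunked at 500 x 500
per_dim, total = count_chunks((1200, 900), (500, 500))
print(per_dim, total)
```

Smaller chunks use less memory per task but create more tasks for Dask to schedule, so the chunk size is a trade-off.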

If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.

# Select a country. This example uses Kenya; a complete list of countries is available below.
country = "Kenya"

# Set the range of dates for the climatology. This is the reference period (m) for the anomaly calculation.
# Standard practice is to use a 30-year period, so we've used 1981 to 2011 in this example.
time_m = ('1981', '2011')

# Time period for the monthly anomalies (x)
time_x = ('1981', '2021')

# CHIRPS has a spatial resolution of ~5x5 km
resolution = (-5000, 5000)

# Size of the Dask chunks
dask_chunks = dict(x=500, y=500)

Connect to the datacube

Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.

dc = datacube.Datacube(app='rainfall_anomaly')

Load the African Countries shapefile

This shapefile contains polygons for the boundaries of African countries and will allow us to calculate rainfall anomalies within a chosen country.

african_countries = gpd.read_file('../Supplementary_data/Rainfall_anomaly_CHIRPS/african_countries.geojson')