Mapping water extent and rainfall using WOfS and CHIRPS
Products used: wofs_ls, rainfall_chirps_monthly
Background
The United Nations have prescribed 17 “Sustainable Development Goals” (SDGs). This notebook attempts to monitor SDG Indicator 6.6.1 - change in the extent of water-related ecosystems. Indicator 6.6.1 has 4 sub-indicators:
i. The spatial extent of water-related ecosystems
ii. The quantity of water contained within these ecosystems
iii. The quality of water within these ecosystems
iv. The health or state of these ecosystems
This notebook primarily focuses on the first sub-indicator - spatial extents.
Description
The notebook loads WOfS feature layers to map the spatial extent of water bodies. It also loads and plots monthly total rainfall from CHIRPS. The last section will compare the water extent between two periods to allow visulazing where change is occuring.
Load packages
Import Python packages that are used for the analysis.
[1]:
%matplotlib inline
import datacube
import matplotlib.pyplot as plt
from deafrica_tools.dask import create_local_dask_cluster
from deafrica_tools.datahandling import wofs_fuser
from long_term_water_extent import (
load_vector_file,
get_resampled_labels,
resample_water_observations,
resample_rainfall_observations,
calculate_change_in_extent,
compare_extent_and_rainfall,
)
Set up a Dask cluster
Dask can be used to better manage memory use and conduct the analysis in parallel.
[ ]:
create_local_dask_cluster()
Connect to Data Cube
[3]:
dc = datacube.Datacube(app="long_term_water_extent")
Analysis parameters
The following cell sets the parameters, which define the area of interest and the length of time to conduct the analysis over.
Upload a vector file for your water extent and your catchment to the
data
folder.Set the time range you want to use.
Set the resampling strategy. Possible options include:
"1Y"
- Annual resampling, use this option for longer term monitoring"QS-DEC"
- Quarterly resampling from December"3M"
- Three-monthly resampling"1M"
- Monthly resampling
For more details on resampling timeframes, see the xarray and pandas documentation.
[4]:
water_extent_vector_file = "data/lake_baringo_extent.geojson"
water_catchment_vector_file = "data/lake_baringo_catchment.geojson"
time_range = ("2018-07", "2021")
resample_strategy = "Q-DEC"
dask_chunks = dict(x=1000, y=1000)
Get waterbody and catchment geometries
The next cell will extract the waterbody and catchment geometries from the supplied vector files, which will be used to load Water Observations from Space and the CHIRPS rainfall products.
[5]:
extent, extent_geometry = load_vector_file(water_extent_vector_file)
catchment, catchment_geometry = load_vector_file(water_catchment_vector_file)
Load Water Observation from Space for Waterbody
The first step is to load the Water Observations from Space product using the extent geometry.
[6]:
extent_query = {
"time": time_range,
"resolution": (-30, 30),
"output_crs": "EPSG:6933",
"geopolygon": extent_geometry,
"group_by": "solar_day",
"dask_chunks":dask_chunks
}
wofs_ds = dc.load(product="wofs_ls", fuse_func=wofs_fuser, **extent_query)
Identify water in each resampling period
The second step is to resample the observations to get a consistent measure of the waterbody, and then calculate the classified as water for each period.
[7]:
resampled_water_ds, resampled_water_area_ds = resample_water_observations(
wofs_ds, resample_strategy
)
date_range_labels = get_resampled_labels(wofs_ds, resample_strategy)
/usr/local/lib/python3.10/dist-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
_reproject(
Plot the change in water area over time
[8]:
fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(
date_range_labels,
resampled_water_area_ds.values,
color="red",
marker="^",
markersize=4,
linewidth=1,
)
plt.xticks(date_range_labels, rotation=65)
plt.title(f"Observed Area of Water from {time_range[0]} to {time_range[1]}")
plt.ylabel("Waterbody area (km$^2$)")
plt.tight_layout()

Load CHIRPS monthly rainfall
[9]:
catchment_query = {
"time": time_range,
"resolution": (-5000, 5000),
"output_crs": "EPSG:6933",
"geopolygon": catchment_geometry,
"group_by": "solar_day",
"dask_chunks":dask_chunks
}
rainfall_ds = dc.load(product="rainfall_chirps_monthly", **catchment_query)
Resample to estimate rainfall for each time period
This is done by taking calculating the average rainfall over the extent of the catchment, then summing these averages over the resampling period to estimate the total rainfall for the catchment.
[10]:
catchment_rainfall_resampled_ds = resample_rainfall_observations(
rainfall_ds, resample_strategy, catchment
)
Compare waterbody area to catchment rainfall
This step plots the summed average rainfall for the catchment area over each period as a histogram, overlaid with the waterbody area calculated previously.
[11]:
figure = compare_extent_and_rainfall(
resampled_water_area_ds, catchment_rainfall_resampled_ds, "mm", date_range_labels
)

Save the figure
[12]:
figure.savefig("waterarea_and_rainfall.png", bbox_inches="tight")
Compare water extent for two different periods
For the next step, enter a baseline date, and an analysis date to construct a plot showing where water appeared, as well as disappeared, by comparing the two dates.
[13]:
baseline_time = "2018-07-01"
analysis_time = "2021-10-01"
[14]:
figure = calculate_change_in_extent(baseline_time, analysis_time, resampled_water_ds)

Save figure
[15]:
figure.savefig("waterarea_change.png", bbox_inches="tight")
Additional information
License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube
tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on
Github.
Compatible datacube version:
[16]:
print(datacube.__version__)
1.8.15
Last Tested:
[17]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[17]:
'2023-08-21'