Irrigated crop water requirements
Keywords data used; wapor, data used; crop mask
Background
The water requirement for irrigated crops can be estimated using a range of spatial data related to actual, potential, and reference evapotranspiration. The estimation can assist in optimising the volume and timing of irrigation events and ensure that water resources are appropriately managed. This can enhance overall water use efficiency and support data-driven, effective use of water resources.
This notebook collates a few key concepts:
Evapotranspiration (ET) is the process of water moving from the ground to the atmosphere via evaporation from soil and water vapor through the leaves of plants. Actual evapotranspiration (ETact) is the observed, or in the case of Earth Observation, estimated, rate of ET.
Reference evapotranspiration (ETref) is the ET from a hypothetical, well-watered short grass reference crop going under ideal conditions, representing the climatic demand for water based on prevailing weather conditions.
Crop evapotranspiration (ETc) is reference evapotranspiration applied to a specific crop type and growth stage, representing the maximum ET from a given crop under given weather conditions. ETc is produced by multiplying ETref by a crop coefficient (Kc).
These concepts will be explored further in this notebook. They are also described in various materials produced by the Food and Agriculture Organisation (FAO) including the introduction to ET.
Description
This notebook shows how actual and reference evapotranspiration can be used to identify how adequate previous irrigation has been for a given area and calculate the water deficit, if any.
First, cropland area is identified using the cropland extent map.
Second, actual water usage for 2024 is estimated.
Finally, the water deficit is plotted and calculated to inform irrigation requirements for future seasons.
Getting started
To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.
Load packages
Import Python packages that are used for the analysis.
Use standard import commands; some are shown below. Begin with any iPython magic commands, followed by standard Python packages, then any additional functionality you need from the Tools package.
[1]:
%matplotlib inline
import datacube
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import rioxarray
import xarray as xr
import geopandas as gpd
from datacube.utils.masking import mask_invalid_data
from deafrica_tools.areaofinterest import define_area
from deafrica_tools.plotting import display_map
from deafrica_tools.load_wapor import get_all_WaPORv3_mapsets, get_WaPORv3_info, load_wapor_ds
from odc.geo.geom import Geometry
from odc.geo.xr import xr_reproject
from wapordl import wapor_map, wapor_ts
INFO: WaPORDL (`1.2.1`)
Connect to the datacube
Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.
[2]:
dc = datacube.Datacube(app='Irrigation_water')
Analysis parameters
The cell below specifies the folder where the downloaded data will be stored. If you are using this script repeatedly, it is recommended you empty this folder from time to time to reduce storage on the Sandbox volume.
Please note that all data in this folder will be lost when you log out of the Sandbox environment. Best practice: Download or move any important files to external storage (e.g. Google Drive or your local machine) before logging out to avoid losing your work as described in the landing page.
[3]:
folder = "../Supplementary_data/WaPOR" # folder that the data will be sent to
Next, the area of interest is defined. This can also be a .geojson file which the loading function accepts. Otherwise, there are two methods available:
By specifying the latitude, longitude, and buffer. This method requires you to input the central latitude, central longitude, and the buffer value in square degrees around the center point you want to analyze. For example,
lat = 30.75,lon = 31.35, andbuffer = 0.1will select an area with a radius of 0.1 square degrees around the point with coordinates (30.75, 31.35).Alternatively, you can provide separate buffer values for latitude and longitude for a rectangular area. For example,
lat = 30.75,lon = 31.35, andlat_buffer = 0.1andlon_buffer = 0.08will select a rectangular area extending 0.1 degrees north and south, and 0.08 degrees east and west from the point(30.75, 31.35).For reasonable loading times, set the buffer as
0.1or lower.By uploading a polygon as an
Esri Shapefile. If you choose this option, you will need to upload the geojson or ESRI shapefile into the Sandbox using Upload Files button
in the top left corner of the Jupyter Notebook interface. ESRI shapefiles must be uploaded with all the related files (.cpg, .dbf, .shp, .shx). Once uploaded, you can use the shapefile or geojson to define the area of interest. Remember to update the code to call the file you have uploaded.
To use one of these methods, you can uncomment the relevant line of code and comment out the other one. To comment out a line, add the "#" symbol before the code you want to comment out. By default, the first option which defines the location using latitude, longitude, and buffer is being used.
If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.
As for the loading WaPOR data notebook, this demonstration notebook loads an area of cropland in the Nile Delta, Egypt. The Nile Delta supports irrigated agriculture in a very arid climate. This means it has very low cloud cover and easily distinguishable cropping patterns from satellite imagery, making it a useful testing area for Earth Observation based analyses.
[4]:
# Method 1: Specify the latitude, longitude, and buffer
aoi = define_area(lat=30.75, lon=31.35, buffer=0.03)
# Method 2: Use a polygon as a GeoJSON or Esri Shapefile.
# aoi = define_area(vector_path='aoi.shp')
#Create a geopolygon and geodataframe of the area of interest
geopolygon = Geometry(aoi["features"][0]["geometry"], crs="epsg:4326")
geopolygon_gdf = gpd.GeoDataFrame(geometry=[geopolygon], crs=geopolygon.crs)
# Get the latitude and longitude range of the geopolygon
lat_range = (geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[3])
lon_range = (geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[2])
region = [geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[2], geopolygon_gdf.total_bounds[3]]
display_map(x=lon_range, y=lat_range)
[4]:
Load the cropland mask
We are interested in irrigation, so we limit our analysis to cropland only using the Digital Earth Africa cropland extent map. This map has been validated as having good accuracy and high resolution for Africa, among other cropland extent products. We are making the assumption that all cropland in the area of interest is irrigated, which is reasonably safe in Egypt given its very dry climate and reliance on irrigation. In other regions, other irrigated area or land use/land cover classification layers may be required to isolate the analysis to irrigated agriculture.
Firstly, we load the cropland mask for the area of interest.
[5]:
cm = dc.load(
product="crop_mask",
y = (region[1],region[3]),
x = (region[0],region[2]),
time=("2019"),
resolution = (-30,30),
measurements="filtered",
resampling="nearest"
).filtered.squeeze()
cm.where(cm < 255).plot.imshow(
add_colorbar=False, figsize=(6, 6)
) # we filter to <255 to omit missing data
plt.title("Cropland Extent");
Load reference ET
As identified earlier, ETref is the amount of water vapour that would be lost to the atmosphere from well-watered grass. It represents the ET that would apply to short crops in ideal growing conditions under given climate conditions.
We are ingesting the level 1 global ETref data from WaPOR v3 at monthly timesteps. It is low resolution (~30km) because it depends on climate information, such as temperature and wind speed, which are generally available at low resolution. For this reason, we are taking data from a larger area by adding a degree to both the latitude and longitude of our area of interest.
[6]:
variable = "L1-RET-M"
period= ["2024-01-01", "2024-12-31"]
region_big = [geopolygon_gdf.total_bounds[0] - 0.5, geopolygon_gdf.total_bounds[1] - 0.5, # create a bigger region due to low resolution of product and abundance of NaNs
geopolygon_gdf.total_bounds[2] + 0.5, geopolygon_gdf.total_bounds[3] + 0.5]
etref = wapor_map(region_big, variable, period, folder, extension = '.nc')
INFO: Found 12 files for L1-RET-M.
INFO: Converting from `.tif` to `.nc`.
Once the data is available in our Supplementary_data/WaPOR folder as NetCDF, we can load it into an xarray object, below. We immediately filter out the no data values which are stored as -999, so we use a >0 filter. We note that the scale_factor of 0.1 has been included in the load_wapor_ds() function so our data are in mm/month.
[7]:
etref_xr = load_wapor_ds(filename=etref, variable=variable)
etref_xr = etref_xr.where(etref_xr['L1-RET-M'] > 0) # filter out NaNs
etref_xr
[7]:
<xarray.Dataset> Size: 3kB
Dimensions: (time: 12, y: 6, x: 5)
Coordinates:
* time (time) datetime64[ns] 96B 2024-01-01 2024-02-01 ... 2024-12-01
* y (y) float64 48B 31.38 31.12 30.88 30.62 30.38 30.12
* x (x) float64 40B 30.78 31.09 31.41 31.72 32.03
spatial_ref int32 4B 4326
Data variables:
L1-RET-M (time, y, x) float64 3kB 74.3 73.5 72.8 73.6 ... nan nan nan
Attributes: (12/13)
lat#long_name: latitude
lat#standard_name: latitude
lat#units: degrees_north
lon#long_name: longitude
lon#standard_name: longitude
lon#units: degrees_east
... ...
overview: NONE
temporal_resolution: Month
units: mm/month
scale_factor: 0.1
_FillValue: -9999
add_offset: 0.0We can see that 12 months of data are available. The low resolution means we are not really interested in the spatial pattern of ETref, so we look at the timeseries of monthly ETref values for our area of interest (noting we have taken data from a larger area for practical purposes).
We can see that in 2024, ETref was highest in June and July (summer in Egypt) and lowest in December and January (winter).
[8]:
etref_ts = etref_xr.mean(["x", "y"])
(etref_ts['L1-RET-M']).plot()
[8]:
[<matplotlib.lines.Line2D at 0x7f8839104200>]
Load actual evapotranspiration
Next, we load ETact for the area of interest. Again, this data is the ET that actually occurred and is therefore based on higher resolution observations like surface reflectance and temperature, and available in higher spatial resolution (20m for the WaPOR Northern Egypt level 3 data we will load). We will also use monthly (mm/month) timesteps for ETact.
The same process is followed for loading WaPOR data where it is first ingest as a NetCDF then loaded as an xarray object. No data values (stored as -999.9) are filtered out and the scale_factor is applied within the load_wapor_ds() function.
[9]:
variable = "L3-AETI-M"
period= ["2024-01-01", "2024-12-31"]
etact = wapor_map(region, variable, period, folder, extension = '.nc')
etact_xr = load_wapor_ds(filename=etact, variable=variable)
etact_xr = etact_xr.where(etact_xr['L3-AETI-M'] > 0) # filter out NaN values
etact_xr
WARNING: `region` intersects with multiple L3 regions (['ENO', 'ZAN']), continuing with ENO only.
INFO: Found 12 files for L3-AETI-M.
INFO: Converting from `.tif` to `.nc`.
[9]:
<xarray.Dataset> Size: 10MB
Dimensions: (time: 12, y: 338, x: 294)
Coordinates:
* time (time) datetime64[ns] 96B 2024-01-01 2024-02-01 ... 2024-12-01
* y (y) float64 3kB 3.406e+06 3.406e+06 ... 3.4e+06 3.4e+06
* x (x) float64 2kB 3.391e+05 3.392e+05 ... 3.45e+05 3.45e+05
spatial_ref int32 4B 32636
Data variables:
L3-AETI-M (time, y, x) float64 10MB nan nan nan nan ... nan nan nan nan
Attributes:
long_name: Actual EvapoTranspiration and Interception
overview: NONE
temporal_resolution: Month
units: mm/month
scale_factor: 0.1
_FillValue: -9999
add_offset: 0.0The higher spatial resolution ETact product is nice to plot spatially for a selected month (June below).
[10]:
(etact_xr.isel(time=5)['L3-AETI-M']).plot()
[10]:
<matplotlib.collections.QuadMesh at 0x7f8882a1d160>
Mask ETact to cropland
Next, we need to limit our analyses to cropland and exclude other land cover classes. We do this by reprojecting the ETact data to match the cropland extent data, then performing a masking step. The masking can be confirmed with a plot.
[11]:
etact_xr_reprojected = etact_xr.odc.reproject(how=cm.odc.geobox, resampling="average")
etact_xr_reprojected = mask_invalid_data(etact_xr_reprojected)
etact_xr_crop = etact_xr_reprojected.where(cm == 1)
etact_xr_crop.isel(time=5)['L3-AETI-M'].plot()
[11]:
<matplotlib.collections.QuadMesh at 0x7f883158d1c0>
Plot ETact and ETref
The progression of ETact and ETref through 2024 can be visualised together. ETref is consistently higher, as expected, because soil water and crop condition is insufficient to allow maximum ET for a hypothetical well-watered grass crop (ETref). The dip in ETact in May is likely due to a seasonal transition between crops.
[12]:
etact_crop_ts = etact_xr_crop.mean(['x', 'y'])
fig, ax = plt.subplots()
ax.plot(etref_ts.time, etref_ts['L1-RET-M'], label="ETref")
ax.plot(etact_crop_ts.time, etact_crop_ts['L3-AETI-M'], label="ETact")
ax.fill_between(etref_ts.time, etact_crop_ts['L3-AETI-M'], etref_ts['L1-RET-M'], alpha=0.2)
ax.set_xlabel('Date')
ax.legend()
ax.set_ylabel('mm/month')
[12]:
Text(0, 0.5, 'mm/month')
Calculate irrigation deficit
Next, we can calculate crop water requirements for a given month. To do this, we have to make some assumptions about crop type and growth stage. Given that wheat is a major crop in Egypt and is often planted to grow in the later part of the year, we will assume that we have a mid-season wheat crop in August 2024 across cropland in our area of interest. Of course, this assumption could be adapted for different times and periods of interest.
This allows us to assign a crop coefficient (Kc) of 1.15 based on Table 12 in the FAO Chapter on crop coefficients. There are many other sources of Kc values which could be drawn upon.
As mentioned in the Background section, crop evapotranspiration (ETc) is reference ET for a specific crop at a given growth stage and is the product of ETref and Kc. We therefore multiply ETref by our Kc value 1.15 to give ETc for August 2024. Then, we can find the difference between ETc and ETact which is effectively the deficit as printed below.
[13]:
etc = etref_ts.sel(time='2024-08-01')['L1-RET-M'].values * 1.15
etact = etact_crop_ts.sel(time='2024-08-01')['L3-AETI-M'].values
et_def = etc - etact
print(f"The crop evapotranspiration for mid-season wheat in August 2024 was {round(etc)}mm while the actual evapotranspiration was {round(float(etact))}mm, giving a deficit of {round(et_def)}mm.")
The crop evapotranspiration for mid-season wheat in August 2024 was 262mm while the actual evapotranspiration was 191mm, giving a deficit of 71mm.
Calculate crop water use
If we know the area of cropland we are dealing with we can estimate total water use and requirements.
First, we need to estimate the area of cropland in our area of interest. The first step is counting the number of pixels classified as cropland in the cropland extent product. Next, we multiply the area of each pixel by the number of pixels. This gives us the area of cropland in square metres which we can convert to hectares.
[14]:
res_x, res_y = cm.rio.resolution()
pixel_area = abs(res_x) * abs(res_y)
cm_pixels = cm.where(cm == 1).count().item()
crop_sq_m = pixel_area * cm_pixels
crop_ha = crop_sq_m * 0.0001 # 0.0001 (sq. m to ha)
print(f"Our area of interest has a total cropped area of {crop_ha:.2f} ha.")
Our area of interest has a total cropped area of 3235.59 ha.
Finally, we can estimate the total volume of irrigated crop water consumption based on 1mm being equal to 1L/sq m.
[15]:
annual_av_etact = (etact_crop_ts.sum('time')['L3-AETI-M']).values
total_eta_megl = (annual_av_etact * crop_sq_m) / 1e+6 # Actual ET converted to megalitres
print(f"Based on a spatial average annual ETact of {round(float(annual_av_etact))}mm and an area of {crop_ha} ha, {round(total_eta_megl)} megalitres of water was transferred to the atmosphere from cropland for our area of interest in 2024.")
Based on a spatial average annual ETact of 1302mm and an area of 3235.59 ha, 42121 megalitres of water was transferred to the atmosphere from cropland for our area of interest in 2024.
Limitations and next steps
This workflow has ignored rainfall, which decreases irrigation requirements, and drainage, which increases irrigation requirements. A next step in analysis could be incorporating monthly rainfall data to calculate the proportion of crop water needs supplied by rainfall versus irrigation.
The workflow also provides the first steps in understanding water requirements for specific crops and stages of growth. Users with information on a particular area of interest will be able to leverage this notebook as a starting point to further analysis of irrigated crop performance and water consumption.
Additional information
License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on
Github.
Compatible datacube version:
[16]:
print(datacube.__version__)
1.8.20
Last Tested:
[17]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[17]:
'2025-11-25'