Sentinel-5P TROPOMI Level-2

Keywords: data used; sentinel-5P, datasets; sentinel-5P

Background

Sentinel-5P is an Earth observation satellite launched in 2017 and operated by the European Space Agency under the Copernicus Programme. It was developed as a precursor mission to the future Sentinel-5 mission to ensure continuity of atmospheric observations following earlier European missions.

The core objective of Sentinel-5P is to provide near-real-time, global measurements of atmospheric composition and air quality. Its data supports scientific research, environmental monitoring, and policy implementation related to air pollution, climate change, ozone layer depletion, and human health. The mission produces datasets that allow scientists to track pollutants and trace gases spatially and temporally across the globe.

Sentinel-5P carries a single instrument—the TROPOMI (Tropospheric Monitoring Instrument)—which measures solar radiation reflected from the Earth’s atmosphere and surface across ultraviolet, visible, near-infrared, and shortwave infrared spectral bands. From these measurements, concentration values for various trace gases and aerosol properties are retrieved. Compared to previous instruments, TROPOMI offers unprecedented spatial resolution, enabling detection of emission sources such as urban pollution, biomass burning, and industrial activities.

Level-2 products are geophysical datasets derived from measured radiances and converted into physically meaningful variables such as gas concentrations and aerosol properties. Each product addresses a specific atmospheric component relevant to air quality, weather, and climate processes.

Sentinel-5P TROPOMI Level-2 Products and Their Functions

  1. UV Aerosol Index (UVAI) detects the presence of light-absorbing aerosol particles such as smoke, dust, and volcanic ash in the atmosphere, helping to monitor air pollution and transboundary aerosol transport.

  2. Methane (CH₄) Column-Averaged Mixing Ratio measures the average concentration of methane in the atmosphere and is used to identify major greenhouse gas emission sources and support climate change research.

  3. Cloud fraction quantifies the proportion of each satellite pixel covered by clouds, which is essential for correcting trace-gas retrievals and analyzing cloud coverage patterns.

  4. Cloud albedo represents how much sunlight is reflected by clouds and helps assess their cooling or warming effect on the Earth’s climate system.

  5. Cloud top pressure indicates cloud height and is used to study atmospheric structure, weather systems, and radiative impacts of clouds.

  6. Carbon Monoxide (CO) total column measures atmospheric CO concentration, serving as an indicator of combustion processes such as biomass burning, industrial activity, and traffic emissions.

  7. Formaldehyde (HCHO) total column is used to detect emissions from vegetation and human activities and acts as a proxy for volatile organic compounds (VOCs) that contribute to ozone formation.

  8. Nitrogen Dioxide (NO₂) total column measures total atmospheric NO₂, including both stratospheric and tropospheric contributions, and is used in pollution assessment and atmospheric chemistry studies.

  9. Tropospheric Nitrogen Dioxide (NO₂) isolates the NO₂ in the lower atmosphere, which most directly reflects pollution from vehicles, power plants, and industries.

  10. Ozone (O₃) total column measures the total ozone layer thickness and is essential for monitoring ultraviolet radiation exposure and atmospheric health.

  11. Sulfur Dioxide (SO₂) total column detects trace amounts of SO₂ released from power plants and volcanoes and is used in air-quality regulation and disaster monitoring.

This notebook demonstrates how each Sentinel-5P TROPOMI product can be accessed and loaded through the Digital Earth Africa (DE Africa) platform.


Description

This notebook will cover the following topics:

  1. Inspecting the products and measurements available in the datacube

  2. Loading the Sentinel-5P datasets

  3. Plotting the results


Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

Import Python packages that are used for the analysis.

[1]:
%matplotlib inline

import numpy as np
import xarray as xr
import pandas as pd
import datacube
import geopandas as gpd
from odc.geo.geom import Geometry

from deafrica_tools.plotting import display_map
from deafrica_tools.areaofinterest import define_area

Connect to the datacube

Connect to the datacube so we can access DE Africa data.

[2]:
dc = datacube.Datacube(app="Sentinel_5P")

List products

We can use datacube’s list_products functionality to inspect DE Africa’s products that are available in the datacube. The table below shows the product names that we will use to load the data, a brief description of the data, and the satellite instrument that acquired the data.

[3]:
# Query the product list once, then keep only the Sentinel-5P entries
products = dc.list_products()
products.loc[products['description'].str.contains('Sentinel-5')]
[3]:
name description license default_crs default_resolution
name
s5p_tropomi_l2_aer_ai s5p_tropomi_l2_aer_ai Sentinel-5p TROPOMI Level 2 UV Aerosol Index. CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_ch4 s5p_tropomi_l2_ch4 Sentinel-5P TROPOMI Level 2 Methane (CH4) colu... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_cloud s5p_tropomi_l2_cloud Sentinel-5p TROPOMI Level 2 Cloud fraction, al... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_co s5p_tropomi_l2_co Sentinel-5p TROPOMI Level 2 Carbon Monoxide (C... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_hcho s5p_tropomi_l2_hcho Sentinel-5p TROPOMI Level 2 Formaldehyde (HCHO... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_no2 s5p_tropomi_l2_no2 Sentinel-5p TROPOMI Level 2 Nitrogen Dioxide (... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_o3 s5p_tropomi_l2_o3 Sentinel-5p TROPOMI Level 2 Ozone (O3) total c... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)
s5p_tropomi_l2_so2 s5p_tropomi_l2_so2 Sentinel-5p TROPOMI Level 2 Sulfur Dioxide (SO... CC-BY-4.0 EPSG:4326 Resolution(x=0.01, y=-0.01)

List measurements

We can further inspect the data available for each Sentinel-5P product using datacube’s list_measurements functionality.

To retrieve the measurement information for any product listed above, select the product by specifying its name. For example, we will use s5p_tropomi_l2_so2.

[4]:
product = "s5p_tropomi_l2_so2"

The table below lists each of the measurements available in the data.

[5]:
measurements = dc.list_measurements()
measurements.loc[product]
[5]:
name dtype units nodata aliases flags_definition add_offset scale_factor
measurement
SO2 SO2 float32 mol/m^2 -9999.0 NaN NaN 0.0 1.0
dataMask dataMask uint8 1 255.0 NaN NaN 0.0 1.0

Summary of the Measurements

Sentinel-5P TROPOMI Level-2 products are provided as fully geophysical variables, meaning the values are already delivered in standard scientific units such as mol/m², Pascals, parts per billion, or unitless indices, with no additional scale factors or offsets required. Each product represents a different atmospheric property, so the expected numerical ranges vary according to the physical quantity being measured. For example, trace-gas total columns are typically small fractions of a mol/m², while cloud fraction is reported as a dimensionless value between 0 and 1. Understanding these units and their typical value ranges is important for interpreting the magnitude and behavior of each atmospheric parameter, identifying abnormal events, and performing consistent regional or temporal comparisons. The table below summarises the units, typical data ranges, and scale/offset requirements for all Sentinel-5P Level-2 products used in this notebook.

| Product | Units | Typical Range | Scale/Offset |
| --- | --- | --- | --- |
| UV Aerosol Index (UVAI) | Unitless | -1 to 5 | No |
| Methane (CH₄) Column-Averaged Mixing Ratio | Parts per billion (ppb) | 1600 to 2000 | No |
| Cloud Fraction | Unitless (0–1) | 0 to 1 | No |
| Cloud Albedo | Unitless (0–1) | 0.05 to 0.8 | No |
| Cloud Top Pressure | Pascals | 1000 to 110000 | No |
| Carbon Monoxide (CO) Total Column | mol/m² | 0 to 0.1 | No |
| Formaldehyde (HCHO) Total Column | mol/m² | 0 to 0.001 | No |
| Nitrogen Dioxide (NO₂) Total Column | mol/m² | 0 to 0.0003 | No |
| Tropospheric NO₂ Column | mol/m² | 0 to 0.0003 | No |
| Ozone (O₃) Total Column | mol/m² | 0 to 0.36 | No |
| Sulfur Dioxide (SO₂) Total Column | mol/m² | 0 to 0.01 | No |
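As a rough sanity check, the typical ranges in the table can be kept in a small lookup and compared with loaded values. The sketch below is illustrative only: `TYPICAL_RANGES` and `fraction_outside_range` are hypothetical names that are not part of the datacube or deafrica_tools, the keys reuse measurement names shown in this notebook's outputs, and a value outside a typical range is not necessarily invalid.

```python
import math

# Typical ranges copied from the table above (same units as the table).
# Keys are measurement names as they appear in this notebook's outputs.
TYPICAL_RANGES = {
    "AER_AI_354_388": (-1.0, 5.0),             # unitless
    "CH4": (1600.0, 2000.0),                   # ppb
    "CLOUD_FRACTION": (0.0, 1.0),              # unitless
    "CLOUD_TOP_PRESSURE": (1000.0, 110000.0),  # Pa
    "CO": (0.0, 0.1),                          # mol/m^2
    "HCHO": (0.0, 0.001),                      # mol/m^2
    "NO2": (0.0, 0.0003),                      # mol/m^2
    "O3": (0.0, 0.36),                         # mol/m^2
    "SO2": (0.0, 0.01),                        # mol/m^2
}


def fraction_outside_range(values, measurement):
    """Return the share of non-NaN values outside the typical range."""
    low, high = TYPICAL_RANGES[measurement]
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return float("nan")
    outside = sum(1 for v in valid if v < low or v > high)
    return outside / len(valid)


# Hypothetical SO2 values: one negative value and one NaN (masked pixel)
sample_so2 = [-0.0003887, 0.0002, 0.004, float("nan"), 0.02]
print(fraction_outside_range(sample_so2, "SO2"))  # 0.5
```

With a loaded dataset, the values could be passed as a flat sequence, for example `ds["SO2"].values.ravel()`. A large share of out-of-range values is a prompt to look closer, not proof of bad data.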

Analysis parameters

The following cell sets the parameters that define the area of interest for the analysis.

Select location

To define the area of interest, there are two methods available:

  1. By specifying the latitude, longitude, and buffer, or separate latitude and longitude buffers. This method defines an area of interest around a central point. You can input the central latitude, central longitude, and a buffer value in degrees to create a square area around the center point. For example, lat = 10.338, lon = -1.055, and buffer = 0.1 will select a square extending 0.1 degrees in each direction from the point with coordinates (10.338, -1.055).

    Alternatively, you can provide separate buffer values for latitude and longitude for a rectangular area. For example, lat = 10.338, lon = -1.055, lat_buffer = 0.1, and lon_buffer = 0.08 will select a rectangular area extending 0.1 degrees north and south, and 0.08 degrees east and west from the point (10.338, -1.055).

    For reasonable loading times, set the buffer as 0.1 or lower.

  2. By uploading a polygon as a GeoJSON or Esri Shapefile. If you choose this option, you will need to upload the GeoJSON or Esri Shapefile into the Sandbox using the Upload Files button in the top left corner of the Jupyter Notebook interface. Esri Shapefiles must be uploaded with all the related files (.cpg, .dbf, .shp, .shx). Once uploaded, you can use the shapefile or GeoJSON to define the area of interest. Remember to update the code to call the file you have uploaded.
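For the first method, the buffer arithmetic can be sketched in a few lines. This is only an illustration of how a centre point and buffer translate into latitude and longitude ranges. `bbox_from_point` is a hypothetical helper, not the implementation of `define_area`.

```python
def bbox_from_point(lat, lon, buffer=None, lat_buffer=None, lon_buffer=None):
    """Return (lat_range, lon_range) for a box centred on (lat, lon)."""
    # A single buffer applies in both directions unless a specific one is given.
    lat_half = lat_buffer if lat_buffer is not None else buffer
    lon_half = lon_buffer if lon_buffer is not None else buffer
    if lat_half is None or lon_half is None:
        raise ValueError("Provide buffer, or both lat_buffer and lon_buffer")
    lat_range = (lat - lat_half, lat + lat_half)
    lon_range = (lon - lon_half, lon + lon_half)
    return lat_range, lon_range


# Square example from the text: 0.1 degrees in every direction
print(bbox_from_point(10.338, -1.055, buffer=0.1))

# Rectangular example: 0.1 degrees north/south, 0.08 degrees east/west
print(bbox_from_point(10.338, -1.055, lat_buffer=0.1, lon_buffer=0.08))
```

The square example spans roughly 10.238 to 10.438 in latitude and -1.155 to -0.955 in longitude, so the box is 0.2 degrees wide on each side.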

To use one of these methods, you can uncomment the relevant line of code and comment out the other one. To comment out a line, add the "#" symbol before the code you want to comment out. By default, the first option which defines the location using latitude, longitude, and buffer is being used.

If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.

The example focuses on the Mpumalanga Highveld in South Africa. It combines massive industrial emissions, intense power generation, methane-rich mining zones, biomass burning, dust, and photochemical ozone production—meaning every atmospheric gas S5P measures appears with strong, clear signals. The region remains observable through most of the year, with August–October providing the richest multi-gas scenario. The Mpumalanga Highveld therefore serves as an ideal demonstration site for the strengths of Sentinel-5P Level-2 products in complex and dynamic environmental conditions.

To run the notebook for a different area, make sure Sentinel-5P TROPOMI data is available for the chosen area using the DE Africa Explorer.

[6]:
# Method 1: Specify the latitude, longitude, and buffer
aoi = define_area(lat=-25.6, lon=29.3, buffer=1)
# Method 2: Use a polygon as a GeoJSON or Esri Shapefile.
# aoi = define_area(vector_path='aoi.shp')

# Create a geopolygon and geodataframe of the area of interest
geopolygon = Geometry(aoi["features"][0]["geometry"], crs="epsg:4326")
geopolygon_gdf = gpd.GeoDataFrame(geometry=[geopolygon], crs=geopolygon.crs)

# Get the latitude and longitude range of the geopolygon
lat_range = (geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[3])
lon_range = (geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[2])
[7]:
display_map(x=lon_range, y=lat_range)

Accessing Sentinel 5P data through DE Africa

Now that we know which products and measurements are available, we can load data from the datacube using dc.load. By specifying output_crs='EPSG:4326' and resolution=(-0.01, 0.01), we load the data in the default projection and resolution reported for these products by list_products. Passing group_by='solar_day' in each dc.load call ensures that overlapping acquisitions taken within seconds of each other as the satellite passes over are combined into a single time step in the data. The query defined below is reused for all the products.

[8]:
time_start = '2023-07-18'
time_end = '2023-07-19'
query = {
    'x': (lon_range),
    'y': (lat_range),
    'time':(time_start, time_end),
    'output_crs': 'EPSG:4326',
    'resolution': (-0.01, 0.01)}
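To make the effect of group_by='solar_day' more concrete, the sketch below groups a few timestamps by an approximate local solar date, shifting UTC by four minutes per degree of longitude. It is a simplified illustration under that assumption, not the datacube's own code. `solar_day`, `group_by_solar_day`, and the timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def solar_day(utc_time, longitude):
    """Approximate local solar date: shift UTC by 4 minutes per degree east."""
    offset = timedelta(seconds=int(longitude * 24 * 3600 / 360))
    return (utc_time + offset).date()


def group_by_solar_day(utc_times, longitude):
    """Collect timestamps that fall on the same approximate solar day."""
    groups = defaultdict(list)
    for t in utc_times:
        groups[solar_day(t, longitude)].append(t)
    return dict(groups)


# Hypothetical acquisition times (UTC) over longitude 29.3 E
times = [
    datetime(2023, 7, 18, 11, 30, 0),
    datetime(2023, 7, 18, 11, 30, 45),  # seconds later, same overpass
    datetime(2023, 7, 19, 11, 12, 0),
]
groups = group_by_solar_day(times, longitude=29.3)
print({day.isoformat(): len(ts) for day, ts in groups.items()})
# {'2023-07-18': 2, '2023-07-19': 1}
```

The two acquisitions taken seconds apart land in the same group, which is why they appear as a single time step after loading.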

1. Sentinel-5P TROPOMI Level-2 UV Aerosol Index

Sentinel-5P TROPOMI Level-2 UV Aerosol Index (UVAI) is a satellite-derived indicator from Sentinel-5 Precursor’s TROPOMI that highlights the presence of light-absorbing aerosols (such as desert dust, wildfire smoke, and volcanic ash) in the atmosphere by comparing measured ultraviolet radiation with what would be expected from a clear molecular atmosphere. UVAI is used to rapidly detect and track aerosol plumes, support air-quality monitoring and hazard response (e.g., dust storms and eruptions), assess transboundary pollution transport, and study how aerosols influence Earth’s radiation balance and climate.

[9]:
ds_aer_ai = dc.load(product='s5p_tropomi_l2_aer_ai',
                    group_by="solar_day",
                    **query)

ds_aer_ai
[9]:
<xarray.Dataset> Size: 723kB
Dimensions:         (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time            (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023...
  * latitude        (latitude) float64 2kB -24.61 -24.62 ... -26.59 -26.59
  * longitude       (longitude) float64 2kB 28.3 28.32 28.32 ... 30.29 30.29
    spatial_ref     int32 4B 4326
Data variables:
    AER_AI_340_380  (time, latitude, longitude) float32 320kB -0.1253 ... -0....
    AER_AI_354_388  (time, latitude, longitude) float32 320kB -0.1178 ... 0.0...
    dataMask        (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 1 1 1 1
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref
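The dimensions and sizes in the output above follow from the area of interest and the resolution. The sketch below reproduces that arithmetic for the default area. `grid_shape` and `array_bytes` are hypothetical helpers, and the real grid may differ by a pixel depending on how the datacube snaps the extent to the grid.

```python
def grid_shape(lat_range, lon_range, resolution_deg):
    """Number of rows and columns for a regular lat/lon grid."""
    n_lat = round(abs(lat_range[1] - lat_range[0]) / resolution_deg)
    n_lon = round(abs(lon_range[1] - lon_range[0]) / resolution_deg)
    return n_lat, n_lon


def array_bytes(n_time, n_lat, n_lon, bytes_per_value):
    """Uncompressed in-memory size of one data variable."""
    return n_time * n_lat * n_lon * bytes_per_value


# Default area: centre (-25.6, 29.3) with a 1 degree buffer
lat_range = (-26.6, -24.6)
lon_range = (28.3, 30.3)

rows, cols = grid_shape(lat_range, lon_range, resolution_deg=0.01)
print(rows, cols)                      # 200 200
print(array_bytes(2, rows, cols, 4))   # 320000 bytes for a float32 variable
print(array_bytes(2, rows, cols, 1))   # 80000 bytes for the uint8 dataMask
```

Because the pixel count grows with the square of the buffer, doubling the buffer roughly quadruples the memory needed per variable.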

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[10]:
ds_aer_ai = ds_aer_ai.where(ds_aer_ai.dataMask == 1)
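To show what this masking step does to the values, here is a small NumPy sketch on a made-up 2x3 tile. The arrays are hypothetical, and `np.where` stands in for the xarray `.where` call used above.

```python
import numpy as np

# Hypothetical 2x3 tile: measurement values and the matching dataMask
values = np.array([[0.12, 0.30, -9999.0],
                   [0.05, 0.08, 0.40]], dtype="float32")
data_mask = np.array([[1, 1, 0],
                      [1, 0, 1]], dtype="uint8")

# Same idea as ds.where(ds.dataMask == 1): keep valid pixels, NaN elsewhere
masked = np.where(data_mask == 1, values, np.nan)

# Share of pixels flagged as valid, and a NaN-aware mean of what remains
valid_fraction = float((data_mask == 1).mean())
print(masked)
print(f"valid pixels: {valid_fraction:.0%}")
print(f"mean of valid pixels: {np.nanmean(masked):.4f}")
```

Masked pixels become NaN, so later statistics should use NaN-aware functions. Note that calling `.where` on a whole xarray Dataset also applies to the dataMask variable itself, which is converted to floating point as a result.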

Visualising the band

Sentinel-5P/TROPOMI computes the UV Aerosol Index from two wavelength pairs: 340–380 nm (AER_AI_340_380) and 354–388 nm (AER_AI_354_388). Both pairs detect absorbing aerosols. The 340–380 nm pair is mainly retained for continuity with earlier satellite missions, which makes it useful for historical comparison and long-term aerosol trend analysis. The 354–388 nm pair is optimized for TROPOMI’s radiometric characteristics and produces more stable, lower-noise UVAI retrievals, especially over bright surfaces such as deserts, making it the recommended operational product for most applications.

The cell below visualizes the AER_AI_354_388 band.

[11]:
ds_aer_ai['AER_AI_354_388'].plot(robust=True, col="time")
[11]:
<xarray.plot.facetgrid.FacetGrid at 0x7f6125d9be90>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_28_1.png

2. Sentinel-5P TROPOMI Level-2 Methane (CH₄) Column-Averaged Mixing Ratio

Sentinel-5P TROPOMI Level-2 Methane (CH₄) Column-Averaged Mixing Ratio is a satellite product from Sentinel-5 Precursor’s TROPOMI that measures the average amount of methane in the atmosphere by analyzing reflected shortwave-infrared sunlight. It is used to detect and map methane emission hotspots, quantify regional sources from oil and gas operations, agriculture and wetlands, monitor trends in a major greenhouse gas, and support climate policy and mitigation efforts.

[12]:
ds_l2_ch4 = dc.load(product='s5p_tropomi_l2_ch4',
                    group_by="solar_day",
                    **query)

ds_l2_ch4
[12]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    CH4          (time, latitude, longitude) float32 320kB 1.871e+03 ... nan
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 0 0 0 0 0
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[13]:
ds_l2_ch4 = ds_l2_ch4.where(ds_l2_ch4.dataMask == 1)

Visualising the band

Methane (CH₄) is a potent greenhouse gas emitted by wetlands, livestock, oil and gas production, landfills, and rice fields. It plays a major role in climate warming and atmospheric chemical reactions. Sentinel-5P reports CH₄ as a column-averaged mixing ratio in parts per billion (ppb).

[14]:
ds_l2_ch4['CH4'].plot(robust=True, col="time")
[14]:
<xarray.plot.facetgrid.FacetGrid at 0x7f6124f67890>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_34_1.png

3. Sentinel-5P TROPOMI Level-2 Cloud fraction, albedo, top pressure

Sentinel-5P TROPOMI Level-2 Cloud Fraction, Cloud Albedo, and Cloud Top Pressure are atmospheric products from Sentinel-5 Precursor’s TROPOMI that describe cloud coverage, brightness, and height for each satellite pixel, derived from reflected solar radiation measurements. These cloud parameters are used to improve the accuracy of trace-gas retrievals (by correcting for cloud contamination), to study cloud distribution and radiative effects, and to support weather and climate analysis by quantifying how clouds influence Earth’s energy balance and atmospheric structure.

[15]:
ds_l2_cloud = dc.load(product='s5p_tropomi_l2_cloud',
                      group_by="solar_day",
                      **query)

ds_l2_cloud
[15]:
<xarray.Dataset> Size: 2MB
Dimensions:                  (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time                     (time) datetime64[ns] 16B 2023-07-18T05:57:16.50...
  * latitude                 (latitude) float64 2kB -24.61 -24.62 ... -26.59
  * longitude                (longitude) float64 2kB 28.3 28.32 ... 30.29 30.29
    spatial_ref              int32 4B 4326
Data variables:
    CLOUD_BASE_PRESSURE      (time, latitude, longitude) float32 320kB 6.278e...
    CLOUD_TOP_PRESSURE       (time, latitude, longitude) float32 320kB 5.527e...
    CLOUD_BASE_HEIGHT        (time, latitude, longitude) float32 320kB 4.128e...
    CLOUD_TOP_HEIGHT         (time, latitude, longitude) float32 320kB 5.128e...
    CLOUD_OPTICAL_THICKNESS  (time, latitude, longitude) float32 320kB 3.105 ...
    CLOUD_FRACTION           (time, latitude, longitude) float32 320kB 0.1183...
    dataMask                 (time, latitude, longitude) uint8 80kB 1 1 ... 0 0
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[16]:
ds_l2_cloud = ds_l2_cloud.where(ds_l2_cloud.dataMask == 1)

Visualising the band

The cloud product contains the following bands:

  • CLOUD_BASE_PRESSURE: pressure at the bottom of the cloud layer, in Pascals, typically between 1,000 and 110,000.

  • CLOUD_TOP_PRESSURE: pressure at the top of the cloud layer, in Pascals, usually between 1,000 and 110,000, with lower values representing higher cloud tops.

  • CLOUD_BASE_HEIGHT: altitude of the cloud’s lower boundary, in meters (m), generally between 0 and 20,000.

  • CLOUD_TOP_HEIGHT: altitude of the highest part of the cloud, in meters (m), commonly between 0 and 20,000.

  • CLOUD_OPTICAL_THICKNESS: a unitless index, typically between 0 and 250.

  • CLOUD_FRACTION: a unitless value between 0 and 1, representing the proportion of the satellite pixel covered by cloud.

For the visualization, the CLOUD_FRACTION band will be used. To visualize any of the other bands, replace CLOUD_FRACTION with the desired band name from the list above.

[17]:
ds_l2_cloud['CLOUD_FRACTION'].plot(robust=True, col="time")
[17]:
<xarray.plot.facetgrid.FacetGrid at 0x7f612430f8f0>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_40_1.png

4. Sentinel-5P TROPOMI Level-2 Carbon Monoxide (CO) total column

Sentinel-5P TROPOMI Level-2 Carbon Monoxide (CO) Total Column is an atmospheric product from Sentinel‑5 Precursor’s TROPOMI that measures the vertically integrated amount of CO in the atmosphere by analyzing reflected shortwave-infrared radiation. It is used to track combustion-related pollution from sources such as biomass burning, industry, and transport, to monitor long-range transport of pollution plumes, and to support air-quality assessment and atmospheric chemistry studies.

[18]:
ds_l2_co = dc.load(product='s5p_tropomi_l2_co',
                   group_by="solar_day",
                   **query)

ds_l2_co
[18]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    CO           (time, latitude, longitude) float32 320kB 0.02195 ... nan
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 0 0 0 0 0
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[19]:
ds_l2_co = ds_l2_co.where(ds_l2_co.dataMask == 1)

Visualising the band

Carbon monoxide (CO) is produced by incomplete combustion from vehicles, wildfires, and household burning, and it serves as a tracer for pollution and fire activity. It is measured in mol/m², with typical values between 0 and 0.1 mol/m². Intense events such as wildfires can push values above this range.

[20]:
ds_l2_co['CO'].plot(robust=True, col="time")
[20]:
<xarray.plot.facetgrid.FacetGrid at 0x7f612445c830>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_46_1.png

5. Sentinel-5P TROPOMI Level-2 Formaldehyde (HCHO) total column

Sentinel-5P TROPOMI Level-2 Formaldehyde (HCHO) Total Column is an atmospheric composition product from Sentinel‑5 Precursor’s TROPOMI that measures the vertically integrated concentration of formaldehyde in the atmosphere using ultraviolet spectral observations. It is used as a proxy for volatile organic compound (VOC) emissions to identify polluted urban areas, biomass-burning regions, and biogenic sources from vegetation, and to support studies of ozone formation and air-quality dynamics.

[21]:
ds_l2_hcho = dc.load(product='s5p_tropomi_l2_hcho',
                     group_by="solar_day",
                     **query)

ds_l2_hcho
[21]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    HCHO         (time, latitude, longitude) float32 320kB 3.575e-05 ... 0.00...
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 1 1 1 1 1
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[22]:
ds_l2_hcho = ds_l2_hcho.where(ds_l2_hcho.dataMask == 1)

Visualising the band

Formaldehyde (HCHO) is formed by the oxidation of volatile organic compounds from vegetation, human activities, and biomass burning, making it a useful indicator of ozone-producing chemistry. Sentinel-5P reports HCHO in mol/m², typically ranging from 0 to 0.001 mol/m² depending on fire activity and biogenic emissions. Intense events such as wildfires can push values above this range.

[23]:
ds_l2_hcho['HCHO'].plot(robust=True, col="time")
[23]:
<xarray.plot.facetgrid.FacetGrid at 0x7f612430d2b0>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_52_1.png

6. Sentinel-5P TROPOMI Level-2 Nitrogen Dioxide (NO₂), total and tropospheric columns

Sentinel-5P TROPOMI Level-2 Nitrogen Dioxide (NO₂) Total and Tropospheric Columns are atmospheric products from Sentinel-5 Precursor’s TROPOMI that measure the vertically integrated amount of nitrogen dioxide for the entire atmosphere (total column) and specifically within the troposphere (tropospheric column) using visible and ultraviolet spectral data. These products are used to monitor air pollution from vehicles, power plants, and industry, to evaluate air-quality trends and population exposure, and to support environmental regulation and urban pollution management.

[24]:
ds_l2_no2 = dc.load(product='s5p_tropomi_l2_no2',
                    group_by="solar_day",
                    **query)

ds_l2_no2
[24]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    NO2          (time, latitude, longitude) float32 320kB 2.239e-05 ... 0.00...
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 1 1 1 1 1
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[25]:
ds_l2_no2 = ds_l2_no2.where(ds_l2_no2.dataMask == 1)

Visualising the band

Nitrogen dioxide (NO₂) is a major air pollutant produced mainly by vehicles, power plants, and industrial combustion, contributing to smog and ozone formation. It is measured in mol/m², typically ranging from 0 to 0.0003 mol/m². Peak values over heavily polluted cities may reach two or three times the upper value.

[26]:
ds_l2_no2['NO2'].plot(robust=True, col="time")
[26]:
<xarray.plot.facetgrid.FacetGrid at 0x7f612414cce0>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_58_1.png

7. Sentinel-5P TROPOMI Level-2 Ozone (O₃) total column

Sentinel-5P TROPOMI Level-2 Ozone (O₃) Total Column is an atmospheric composition product from Sentinel-5 Precursor’s TROPOMI that measures the total amount of ozone throughout the atmosphere by analyzing ultraviolet and visible radiation absorbed by ozone molecules. It is used to monitor the ozone layer, assess exposure to harmful ultraviolet radiation, track long-term ozone trends, and support international environmental agreements related to atmospheric protection.

[27]:
ds_l2_o3 = dc.load(product='s5p_tropomi_l2_o3',
                   group_by="solar_day",
                   **query)

ds_l2_o3
[27]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    O3           (time, latitude, longitude) float32 320kB 0.1133 ... 0.1192
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 1 1 1 1 1
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

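Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.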

[28]:
ds_l2_o3 = ds_l2_o3.where(ds_l2_o3.dataMask == 1)

Visualising the band

Ozone (O₃) protects life in the stratosphere by absorbing ultraviolet radiation but becomes a harmful pollutant when present near the surface. Sentinel-5P measures ozone in mol/m², with typical column values ranging from 0 to 0.36 mol/m².
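Ozone columns are often quoted in Dobson Units (DU) elsewhere, so a conversion can help when comparing these mol/m² values with other sources. The sketch below derives the factor from the definition of one Dobson Unit (2.6867e20 molecules per m²) and the Avogadro constant. `mol_per_m2_to_dobson` is a hypothetical helper, and the derived factor may differ slightly from conversion factors quoted in product metadata.

```python
AVOGADRO = 6.02214076e23             # molecules per mole (exact SI value)
MOLECULES_PER_M2_PER_DU = 2.6867e20  # definition of one Dobson Unit

# Moles per square metre that correspond to one Dobson Unit
MOL_PER_M2_PER_DU = MOLECULES_PER_M2_PER_DU / AVOGADRO


def mol_per_m2_to_dobson(column_mol_m2):
    """Convert an ozone column from mol/m^2 to Dobson Units."""
    return column_mol_m2 / MOL_PER_M2_PER_DU


# First O3 value shown in the dataset output above
print(round(mol_per_m2_to_dobson(0.1133)))  # about 254 DU
```

One mol/m² corresponds to roughly 2241 DU, so the 0 to 0.36 mol/m² range quoted above spans roughly 0 to 800 DU.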

[29]:
ds_l2_o3['O3'].plot(robust=True, col="time")
[29]:
<xarray.plot.facetgrid.FacetGrid at 0x7f611ca7bd40>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_64_1.png

8. Sentinel-5P TROPOMI Level-2 Sulfur Dioxide (SO₂) total column

Sentinel-5P TROPOMI Level-2 Sulfur Dioxide (SO₂) Total Column is an atmospheric product from Sentinel-5 Precursor’s TROPOMI that measures the vertically integrated amount of sulfur dioxide in the atmosphere using ultraviolet spectral observations. It is used to detect emissions from power plants, oil refineries, and smelters, to monitor volcanic eruptions in near-real-time, and to assess the contribution of sulfur pollution to air quality degradation and acid rain formation.

[30]:
ds_l2_so2 = dc.load(product='s5p_tropomi_l2_so2',
                    group_by="solar_day",
                    **query)

ds_l2_so2
[30]:
<xarray.Dataset> Size: 403kB
Dimensions:      (time: 2, latitude: 200, longitude: 200)
Coordinates:
  * time         (time) datetime64[ns] 16B 2023-07-18T05:57:16.500000 2023-07...
  * latitude     (latitude) float64 2kB -24.61 -24.62 -24.62 ... -26.59 -26.59
  * longitude    (longitude) float64 2kB 28.3 28.32 28.32 ... 30.27 30.29 30.29
    spatial_ref  int32 4B 4326
Data variables:
    SO2          (time, latitude, longitude) float32 320kB -0.0003887 ... 0.0...
    dataMask     (time, latitude, longitude) uint8 80kB 1 1 1 1 1 ... 1 1 1 1 1
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Masking

Before visualisation, we use the dataMask band to mask values affected by cloud or other issues. The code below keeps data for pixels where the data mask value is 1.

[31]:
ds_l2_so2 = ds_l2_so2.where(ds_l2_so2.dataMask == 1)

Visualising the band

Sulfur dioxide (SO₂) is a pollutant released from volcanoes, coal-fired power plants, refineries, and smelting activities, and it contributes to acid rain and respiratory problems. Sentinel-5P reports SO₂ in mol/m², with typical values from 0 to 0.01 mol/m². Explosive volcanic eruptions can produce values exceeding 0.35 mol/m², and instrumental noise can produce negative values.

[32]:
ds_l2_so2['SO2'].plot(robust=True, col="time")
[32]:
<xarray.plot.facetgrid.FacetGrid at 0x7f611c90ca70>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_70_1.png

Calculating Total Atmospheric Gas Mass (Tonnes) from Sentinel-5P TROPOMI Data

Sentinel-5P TROPOMI Level-2 products provide most atmospheric trace-gas measurements as vertical column densities in units of mol/m², while methane is reported as a column-averaged mixing ratio in parts per billion (ppb). A column density represents the number of moles of a gas contained in the atmospheric column above each square metre of a satellite pixel.

Why Compute Tonnes?

Expressing gas amounts in tonnes provides a policy-relevant and intuitive metric for:

  • Evaluating emission sources (urban, industrial, biomass burning, wetlands)

  • Quantifying atmospheric loading during pollution episodes

  • Comparing different gases on the same scale

  • Reporting mass-based metrics for climate and air-quality frameworks

  • Supporting SDG and AU Agenda 2063 environmental assessments

Converting trace-gas columns into physically meaningful mass values allows researchers and decision-makers to translate satellite observations into actionable insights.


Molecular Weights Used in the Conversion

The molecular weight determines how many grams one mole of a gas weighs:

Gas                        Molecular Weight (g/mol)
CH₄ (Methane)              16.04
CO (Carbon Monoxide)       28.01
NO₂ (Nitrogen Dioxide)     46.005
SO₂ (Sulfur Dioxide)       64.066
HCHO (Formaldehyde)        30.026
O₃ (Ozone)                 48.00

These values are provided to the conversion function so that any Sentinel-5P gas product can be processed consistently.
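As a worked example of the arithmetic, the sketch below converts a single pixel from mol/m² to tonnes. The molecular weights are the ones listed above. The helper name `column_to_tonnes`, the CO column value, and the 1 km × 1 km pixel size are made up for illustration and are not taken from the loaded data.

```python
# Molecular weights (g/mol) as listed above.
MOLECULAR_WEIGHT = {
    "CH4": 16.04,
    "CO": 28.01,
    "NO2": 46.005,
    "SO2": 64.066,
    "HCHO": 30.026,
    "O3": 48.00,
}


def column_to_tonnes(column_mol_m2, area_m2, molecular_weight):
    """Convert a column density (mol/m²) over an area (m²) to mass in tonnes."""
    moles = column_mol_m2 * area_m2    # mol/m² × m² → mol
    grams = moles * molecular_weight   # mol × g/mol → g
    return grams / 1e6                 # g → tonnes


# Illustrative values: a CO column of 0.03 mol/m² over a 1 km × 1 km pixel.
tonnes = column_to_tonnes(0.03, 1_000 * 1_000, MOLECULAR_WEIGHT["CO"])
print(round(tonnes, 4))
```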


The function below performs the full conversion from Sentinel-5P TROPOMI Level-2 gas column data (provided in mol/m²) into total atmospheric mass expressed in tonnes for your area of interest (AOI).

To run the calculation, you only need to supply:

  1. The gas column variable from your dataset

    • Examples: no2_tropospheric_column, co_column, so2_column, hcho_column, ozone_total_vertical_column

    • Because ch4_column is in ppb, it must be converted to an appropriate column amount (e.g., mol/m²) before converting it to tonnes.

  2. The molecular weight of the gas (in g/mol)
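For the methane case, one possible approach (an assumption here, not necessarily the conversion the notebook authors intend) is to estimate the air column from surface pressure and multiply it by the mixing ratio. The sketch below treats the whole column as dry air, so it ignores water vapour and gives only a rough estimate. The function name, the surface pressure, and the methane value are hypothetical inputs, and surface pressure would have to be obtained from a separate source.

```python
G = 9.80665          # standard gravity, m/s²
M_DRY_AIR = 0.0289644  # molar mass of dry air, kg/mol


def ppb_to_mol_m2(xch4_ppb, surface_pressure_pa):
    """Rough column amount (mol/m²) from a column-averaged mixing ratio (ppb).

    Assumes the whole column is dry air: air column = p_s / (g * M_air).
    """
    air_column_mol_m2 = surface_pressure_pa / (G * M_DRY_AIR)
    return xch4_ppb * 1e-9 * air_column_mol_m2


# Illustrative inputs: 1900 ppb methane at a surface pressure of 1013.25 hPa.
ch4_column = ppb_to_mol_m2(1900.0, 101_325.0)
print(round(ch4_column, 3))
```

The resulting array in mol/m² can then be passed to the tonnes conversion with the methane molecular weight of 16.04 g/mol.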

The function automatically:

  • Computes pixel areas from the latitude/longitude grid

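  • Applies the dataMask when a Dataset that includes it is passed (cloud filtering is applied separately, before calling the function)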

  • Converts mol/m² → mol → grams → tonnes

  • Sums across all valid pixels

This allows you to transform satellite-observed column densities into a meaningful physical metric (tonnes) for environmental analysis, climate studies, emission tracking, and reporting.
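The pixel-area step can be sketched on its own. On a regular latitude/longitude grid, the area of a pixel is approximately R² · cos(latitude) · Δlat · Δlon, with the angles in radians and R the Earth radius, assuming a spherical Earth. The helper name `pixel_area_m2` is made up for this illustration. As a sanity check, 1° cells covering the whole globe should add up to roughly the surface area of a sphere, 4πR².

```python
import numpy as np

R = 6_371_000  # mean Earth radius in metres (spherical approximation)


def pixel_area_m2(lat_deg, dlat_deg, dlon_deg):
    """Approximate area (m²) of a lat/lon pixel centred at lat_deg."""
    return (R**2 * np.cos(np.deg2rad(lat_deg))
            * np.deg2rad(dlat_deg) * np.deg2rad(dlon_deg))


# Sanity check: 1° cells over the whole globe should sum to about 4πR².
lat_centres = np.arange(-89.5, 90.0, 1.0)              # 180 latitude bands
total = pixel_area_m2(lat_centres, 1.0, 1.0).sum() * 360  # 360 cells per band
sphere = 4 * np.pi * R**2
print(total / sphere)

# Pixels shrink towards the poles because of the cos(latitude) factor.
print(pixel_area_m2(0.0, 0.01, 0.01), pixel_area_m2(-60.0, 0.01, 0.01))
```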

[33]:
import numpy as np
import pandas as pd
import xarray as xr


def compute_tonnes_timeseries(ds, molecular_weight):
    """
    Computes total atmospheric mass (tonnes) for every timestep in a Sentinel-5P
    Dataset or DataArray. Automatically detects the gas variable if needed.
    Returns a DataFrame with 'timestep' and 'tonnes' columns.
    """

    # -------------------------------------------------------
    # Case 1: Input is a DataArray
    # -------------------------------------------------------
    if isinstance(ds, xr.DataArray):
        gas_da_all = ds

    # -------------------------------------------------------
    # Case 2: Input is a Dataset
    # -------------------------------------------------------
    elif isinstance(ds, xr.Dataset):
        non_dimension_vars = [
            v for v in ds.data_vars
            if v not in ["latitude", "longitude", "time", "dataMask"]
        ]

        if len(non_dimension_vars) != 1:
            raise ValueError(f"Could not detect gas variable. Found: {non_dimension_vars}")

        gas_var = non_dimension_vars[0]
        gas_da_all = ds[gas_var]

    else:
        raise TypeError("Input must be an xarray Dataset or DataArray")

    R = 6371000  # Mean Earth radius in metres
    results = []

    # -------------------------------------------------------
    # Loop over timesteps
    # -------------------------------------------------------
    for idx, t in enumerate(gas_da_all.time.values):
        da = gas_da_all.isel(time=idx)

        # Apply dataMask if available
        if isinstance(ds, xr.Dataset) and "dataMask" in ds:
            da = da.where(ds["dataMask"].isel(time=idx) == 1)

        lat = da.latitude
        lon = da.longitude

        # Compute resolution
        dlat = np.abs(lat.diff("latitude").mean().item())
        dlon = np.abs(lon.diff("longitude").mean().item())

        # Convert to radians
        lat_rad = np.deg2rad(lat)
        dlat_rad = np.deg2rad(dlat)
        dlon_rad = np.deg2rad(dlon)

        # Pixel area (1-D latitude)
        area_1d = R**2 * np.cos(lat_rad) * dlat_rad * dlon_rad
        area_da = xr.DataArray(area_1d, coords={"latitude": lat}, dims=["latitude"])

        # Expand to full grid
        area_full = area_da.broadcast_like(da)

        # mol/m² → mol → grams → tonnes
        mol_per_pixel = da * area_full
        grams_per_pixel = mol_per_pixel * molecular_weight
        tonnes_per_pixel = grams_per_pixel / 1e6

        total_tonnes = tonnes_per_pixel.sum(skipna=True).item()

        results.append([pd.to_datetime(t), total_tonnes])

    # -------------------------------------------------------
    # Return a DataFrame with headers
    # -------------------------------------------------------
    df = pd.DataFrame(results, columns=["timestep", "tonnes"])
    df["tonnes"] = df["tonnes"].apply(lambda x: f"{x:,.0f}")
    return df

In the cells below, we compute the total atmospheric mass of carbon monoxide (CO) over the selected area of interest (AOI), using the Sentinel-5P TROPOMI Level-2 CO dataset loaded above. The conversion function defined earlier turns the CO total column into total mass (tonnes) by combining pixel area and molecular weight. The same approach can be applied to other Sentinel-5P trace-gas products by supplying the relevant gas variable and molecular weight.

Note: Methane (CH₄) is reported in ppb, so an additional step is needed to convert it to mol/m² before computing tonnes.

The dataMask flags whether a pixel contains a valid gas retrieval, while the s5p_tropomi_l2_cloud product provides cloud information. We use its cloud fraction (the CLOUD_FRACTION band) to exclude pixels where cloud contamination would reduce the accuracy of the gas measurements. A cloud-fraction threshold of 0.3 is recommended because it removes pixels where cloud contamination begins to significantly degrade retrieval accuracy while still preserving enough clear-sky observations for reliable analysis.

[34]:
cloud_mask = ds_l2_cloud['CLOUD_FRACTION'] < 0.3
[35]:
mask_CO = ds_l2_co['CO'].where(cloud_mask)

Visualising the CO Total Column After Removing Cloud-Contaminated Pixels

The plot below shows the CO total column values over the Area of Interest (AOI) after applying a cloud-fraction filter to remove pixels with excessive cloud contamination. Sentinel-5P Carbon Monoxide retrievals can be strongly affected by cloud cover, so we retain only pixels where cloud_fraction < 0.3. This ensures that the visualisation reflects clearer and more reliable Carbon Monoxide observations for the selected 2-day time period.

[36]:
mask_CO.plot(robust=True, col="time")
[36]:
<xarray.plot.facetgrid.FacetGrid at 0x7f611c82cf50>
../../../_images/sandbox_notebooks_Datasets_Sentinel_5P_79_1.png
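Because the tonnes calculation sums only over pixels that remain after masking, the total for a time step depends on how many pixels are valid as well as on the gas column itself. A minimal sketch of how the valid-pixel fraction per time step could be checked is shown below, using a made-up 2×2 array with two time steps in place of the cloud-masked CO data.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a cloud-masked CO array (values are made up).
data = np.array([
    [[0.03, 0.03], [0.03, np.nan]],      # 3 of 4 pixels valid
    [[np.nan, np.nan], [0.03, np.nan]],  # 1 of 4 pixels valid
])
toy_co = xr.DataArray(
    data,
    dims=("time", "latitude", "longitude"),
    coords={
        "time": np.array(["2023-07-18", "2023-07-19"], dtype="datetime64[ns]"),
        "latitude": [-24.6, -24.7],
        "longitude": [28.3, 28.4],
    },
)

# Fraction of pixels per time step that survived masking.
valid_fraction = toy_co.notnull().mean(dim=["latitude", "longitude"])
print(valid_fraction.values)
```

Totals from time steps with very different valid fractions are not directly comparable, since they cover different parts of the AOI.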
[37]:
CO_tonnes_ts = compute_tonnes_timeseries(
    mask_CO,
    molecular_weight=28.01,
)

CO_tonnes_ts
[37]:
timestep tonnes
0 2023-07-18 05:57:16.500 12,055
1 2023-07-19 05:47:43.500 977
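Note that the function formats the tonnes column as text with thousands separators, which is convenient for display but not for further calculation. The sketch below rebuilds a small DataFrame from the values shown above and converts the column back to numbers. The column name `tonnes_numeric` is made up for this illustration.

```python
import pandas as pd

# Stand-in for the function's output, using the values shown above.
df = pd.DataFrame({
    "timestep": pd.to_datetime(["2023-07-18 05:57:16.500",
                                "2023-07-19 05:47:43.500"]),
    "tonnes": ["12,055", "977"],
})

# Strip the thousands separators and convert back to floats.
df["tonnes_numeric"] = (
    df["tonnes"].str.replace(",", "", regex=False).astype(float)
)
print(df["tonnes_numeric"].sum())
```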

Conclusion

Through Digital Earth Africa, data from the Sentinel-5 Precursor satellite and its TROPOMI instrument are transforming access to atmospheric information across Africa by making air-quality and climate datasets readily available and easy to use. DE Africa removes technical barriers to satellite data by offering open access, cloud processing, and analysis-ready products, enabling researchers and governments to monitor pollution, greenhouse gases, and ozone without specialised infrastructure. This directly supports the United Nations Sustainable Development Goals and the African Union Agenda 2063 by strengthening environmental governance, public-health planning, climate action, and regional cooperation, helping countries design informed policies for a healthier and more resilient Africa.


Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0.

Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the DE Africa Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here).

If you would like to report an issue with this notebook, you can file one on Github.

Compatible datacube version

[38]:
print(datacube.__version__)
1.9.13

Last Tested:

[39]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[39]:
'2026-02-25'