Landsat Surface Reflectance 8ec3e82b77bc4ea8bf1bb5132b7b905b

Keywords data used; landsat 8, data used; landsat 7, data used; landsat 5, datasets; landsat 8, datasets; landsat 7, datasets; landsat 5,

Background

The United States Geological Survey’s (USGS) Landsat satellite program has been capturing images of the African continent for more than 30 years. These data are highly useful for land and coastal mapping studies.

DE Africa’s Landsat data is ingested from the USGS Collection 2, Level 2 archive and forms a single, cohesive Analysis Ready Data (ARD) package, which allows you to analyse surface reflectance data as-is without the need to apply additional corrections.

Important details:

  • Surface reflectance product

    • Native scaling range: 1 65,455 (0 is no-data)

    • To achieve surface reflectance values, normalise values to 0 - 1 using ds = ds * 2.75e-5 - 0.2

    • Using dc.load will load data in the native scaling range 1 65,455, while using load_ard will scale the values

  • CFMask used as cloud mask

  • Native pixel alignment is centre

  • Date-range: 1984 – present

  • Spatial resolution: 30 x 30 m

For a detailed description of DE Africa’s Landsat archive, see the DE Africa’s Landsat surface reflectance technical specifications documentation.

Description

In this notebook we will load Landsat data using two methods. Firstly, we will use dc.load() to return a time series of satellite images from a single sensor.

Secondly, we will load a time series using the load_ard() function, which is a wrapper function around the dc.load module. This function will load all the images from Landsat 5,7 & 8, combine them, and then apply a cloud mask. The returned xarray.Dataset will contain analysis ready images with the cloudy and invalid pixels masked out.

Topics covered include: 1. Inspecting the Landsat products and measurements available in the datacube 2. Using the native dc.load() function to load in Landsat data from a single satellite 3. Using the load_ard() wrapper function to load in a concatenated, sorted, and cloud masked time series from Landsat 5, 7 & 8.


Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

[1]:
import datacube

from deafrica_tools.datahandling import load_ard
from deafrica_tools.plotting import rgb
/env/lib/python3.6/site-packages/geopandas/_compat.py:88: UserWarning: The Shapely GEOS version (3.7.2-CAPI-1.11.0 ) is incompatible with the GEOS version PyGEOS was compiled with (3.9.0-CAPI-1.16.2). Conversions between both will be slow.
  shapely_geos_version, geos_capi_version_string

Connect to the datacube

[2]:
dc = datacube.Datacube(app="Landsat_Surface_Reflectance")

Available products and measurements

List products

We can use datacube’s list_products functionality to inspect DE Africa’s Landsat products that are available in the datacube. The table below shows the product names that we will use to load the data, a brief description of the data, and the satellite instrument that acquired the data.

We can search for Landsat Collection 2 Surface Reflectance data by using the search term sr. sr stands for “surface reflectance”. The datacube is case-sensitive so this must be typed in lower case.

[3]:
# List Landsat products available in DE Africa
dc_products = dc.list_products()
display_columns = ['name', 'description']
dc_products[dc_products.name.str.contains(
    'sr').fillna(
        False)][display_columns].set_index('name')
[3]:
description
name
dem_srtm 1 second elevation model
ls5_sr USGS Landsat 5 Collection 2 Level-2 Surface Re...
ls7_sr USGS Landsat 7 Collection 2 Level-2 Surface Re...
ls8_sr USGS Landsat 8 Collection 2 Level-2 Surface Re...

List measurements

We can further inspect the data available for each Landsat product using datacube’s list_measurements functionality. The table below lists each of the measurements available in the data.

[4]:
dc_measurements = dc.list_measurements()
dc_measurements.loc['ls8_sr']
[4]:
name dtype units nodata aliases flags_definition
measurement
SR_B1 SR_B1 uint16 1 0 [band_1, coastal_aerosol] NaN
SR_B2 SR_B2 uint16 1 0 [band_2, blue] NaN
SR_B3 SR_B3 uint16 1 0 [band_3, green] NaN
SR_B4 SR_B4 uint16 1 0 [band_4, red] NaN
SR_B5 SR_B5 uint16 1 0 [band_5, nir] NaN
SR_B6 SR_B6 uint16 1 0 [band_6, swir_1] NaN
SR_B7 SR_B7 uint16 1 0 [band_7, swir_2] NaN
QA_PIXEL QA_PIXEL uint16 bit_index 1 [pq, pixel_quality] {'snow': {'bits': 5, 'values': {'0': 'not_high...
QA_RADSAT QA_RADSAT uint16 bit_index 0 [radsat, radiometric_saturation] {'nir_saturation': {'bits': 4, 'values': {'0':...
SR_QA_AEROSOL SR_QA_AEROSOL uint8 bit_index 1 [qa_aerosol, aerosol_qa] {'water': {'bits': 2, 'values': {'0': False, '...

Load Landsat using dc.load()

Now that we know what products and measurements are available for the products, we can load data from the datacube using dc.load.

In the example below, we will load data from Landsat 8 from Cape Town for South Africa in January 2018. We will load data from three spectral satellite bands, as well as cloud masking data ('qa_aerosol'). By specifying output_crs='EPSG:6933' and resolution=(-30, 30), we request that datacube reproject our data to the African Albers coordinate reference system (CRS), with 30 x 30 m pixels. Finally, group_by='solar_day' ensures that overlapping images taken within seconds of each other as the satellite passes over are combined into a single time step in the data.

Note: For a more general discussion of how to load data using the datacube, refer to the Introduction to loading data notebook.

[5]:
# load data
ds = dc.load(product="ls8_sr",
             measurements=['red', 'green', 'blue',
                           'qa_aerosol'],
             output_crs='EPSG:6933',
             y=(-34.31, -34.36),
             x=(18.44, 18.50),
             time=("2018-01", "2018-01"),
             resolution=(-30, 30),
             group_by="solar_day",
             )

print(ds)
<xarray.Dataset>
Dimensions:      (time: 2, x: 194, y: 177)
Coordinates:
  * time         (time) datetime64[ns] 2018-01-14T08:35:40.945747 2018-01-30T...
  * y            (y) float64 -4.126e+06 -4.126e+06 ... -4.131e+06 -4.131e+06
  * x            (x) float64 1.779e+06 1.779e+06 ... 1.785e+06 1.785e+06
    spatial_ref  int32 6933
Data variables:
    red          (time, y, x) uint16 15433 17277 18036 ... 27079 27221 27377
    green        (time, y, x) uint16 13942 16492 17130 ... 26999 27147 27336
    blue         (time, y, x) uint16 12139 14631 15385 ... 28116 28352 28136
    qa_aerosol   (time, y, x) uint8 224 194 224 224 128 ... 192 224 224 224 192
Attributes:
    crs:           EPSG:6933
    grid_mapping:  spatial_ref

Plotting Landsat data

We can plot the data we loaded using the rgb function. By default, the function will plot data as a true colour image using the ‘red’, ‘green’, and ‘blue’ bands.

[6]:
rgb(ds, index=0)
../../../_images/sandbox_notebooks_Datasets_Landsat_Surface_Reflectance_17_0.png

Load Landsat using load_ard

load_ard applies the linear scaling and offset which converts the native uint16 data to actual surface reflectance values. load_ard will additionally concatenate and sort the observations by time, and apply a cloud mask. The result is an analysis-ready dataset.

In the example below, we load Landsat 8 data for the same time and place as above. Note a cloud mask has now been applied and the data converted to decimal surface reflectance values.

This function will also load images from all the Landsat sensors if they are added to the products argument as a list.

You can find more information on this function from the Using load ard notebook.

[7]:
ds = load_ard(dc=dc,
              products=["ls8_sr"],
              measurements=['red', 'green', 'blue'],
              output_crs='EPSG:6933',
              y=(-34.31, -34.36),
              x=(18.44, 18.50),
              time=("2018-01", "2018-01"),
              resolution=(-30, 30),
              group_by="solar_day"
             )

print(ds)
Using pixel quality parameters for USGS Collection 2
Finding datasets
    ls8_sr
Applying pixel quality/cloud mask
Re-scaling Landsat C2 data
Loading 2 time steps
<xarray.Dataset>
Dimensions:      (time: 2, x: 194, y: 177)
Coordinates:
  * time         (time) datetime64[ns] 2018-01-14T08:35:40.945747 2018-01-30T...
  * y            (y) float64 -4.126e+06 -4.126e+06 ... -4.131e+06 -4.131e+06
  * x            (x) float64 1.779e+06 1.779e+06 ... 1.785e+06 1.785e+06
    spatial_ref  int32 6933
Data variables:
    red          (time, y, x) float32 nan nan nan nan nan ... nan nan nan nan
    green        (time, y, x) float32 nan nan nan nan nan ... nan nan nan nan
    blue         (time, y, x) float32 nan nan nan nan nan ... nan nan nan nan
Attributes:
    crs:           EPSG:6933
    grid_mapping:  spatial_ref

Plot the cloud masked landsat data:

[8]:
rgb(ds, index=0)
../../../_images/sandbox_notebooks_Datasets_Landsat_Surface_Reflectance_21_0.png

Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on Github.

Compatible datacube version:

[9]:
print(datacube.__version__)
1.8.4.dev52+g07bc51a5

Last Tested:

[10]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[10]:
'2021-05-20'