Identifying ships with Sentinel-1

**Tags**: :index:`data used; sentinel-1`, :index:`analysis; ship detection`


Being able to spot ships and shipping lanes from satellite imagery can be useful for gaining a holistic picture of maritime traffic. The properties of radar data can be utilised to detect where ships appear over time and to highlight the presence of shipping lanes.

When working with radar data, water commonly appears dark because its relatively smooth surface produces very low backscatter, so the satellite records low intensities in both polarisation bands. However, if a large ship is on the water, the backscatter at the ship’s location will be much higher than that of the surrounding water due to double-bounce scattering effects.

The ESA/EC Copernicus Sentinel-1 mission (Sentinel-1A and 1B) has a frequent revisit time of a few days. This helps to build a large catalogue of observations that can be leveraged to identify shipping lanes.

In this example, we attempt to identify ships around the Suez Canal in Egypt during March 2021. Significant changes in the number and distribution pattern of the ships are detected, showing the impact of a blockage. More about the event can be found in this thread on Twitter.


Ships are identified by taking advantage of the fact that ships on the water result in a large radar backscatter signal.

The steps demonstrated in this notebook include:

  1. Loading Sentinel-1 backscatter images for an area of interest.

  2. Extracting an open water mask using the Water Observations from Space (WOfS) annual summary.

  3. Counting the number of vessels before and after the event and saving the results.

  4. Visualising the maximum backscatter values from a time series to identify shipping lanes.

Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

Import Python packages that are used for the analysis.

%matplotlib inline

import xarray as xr
import numpy as np
import matplotlib.pyplot as plt

import datacube
from deafrica_tools.spatial import xr_vectorize, xr_rasterize
from deafrica_tools.plotting import display_map
from deafrica_tools.datahandling import load_ard, wofs_fuser, dilate

Connect to the datacube

Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.

dc = datacube.Datacube(app="Ship_detection")

Analysis parameters

The following cell sets the parameters, which define the area of interest and the length of time to conduct the analysis over. The parameters are:

  • lat: The central latitude to analyse (e.g. 10.338).

  • lon: The central longitude to analyse (e.g. -1.055).

  • buffer: The number of degrees to load on each side of the central latitude and longitude. For reasonable loading times, set this as 0.1 or lower.

  • timerange: The date range to analyse (e.g. ('2021-03-21', '2021-03-25')).

If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results. The example covers the Suez Canal in Egypt during March 2021.

# Define the area of interest
lat = 29.95
lon = 32.536
buffer = 0.1

# Compute the bounding box for the study area
lat_range = (lat-buffer, lat+buffer)
lon_range = (lon-buffer, lon+buffer)

# timeframe
timerange = ('2021-03-21', '2021-03-25')
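
As a rough check on how much data these settings request, the sketch below converts the bounding box into an approximate ground footprint and pixel count. The pixel size of 0.0002 degrees is an assumption (it is consistent with a 1000 x 1000 pixel grid over a 0.2 degree box), and 111.32 km per degree is the usual spherical approximation, so treat the numbers as indicative only.

```python
import math

def footprint(lat, buffer, pixel_deg=0.0002):
    """Approximate width/height (km) and pixel count per side of a lat/lon box.

    pixel_deg is an assumed pixel size in degrees, not a value read from the datacube.
    """
    km_per_deg = 111.32  # approximate length of one degree of latitude
    side_deg = 2 * buffer
    height_km = side_deg * km_per_deg
    # Degrees of longitude shrink with the cosine of the latitude
    width_km = side_deg * km_per_deg * math.cos(math.radians(lat))
    n_pixels = round(side_deg / pixel_deg)
    return width_km, height_km, n_pixels

width_km, height_km, n_pixels = footprint(lat=29.95, buffer=0.1)
print(f"{width_km:.1f} km x {height_km:.1f} km, {n_pixels} x {n_pixels} pixels")
```

Doubling the buffer quadruples the number of pixels per time step, which is why small buffers keep loading times reasonable.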

View the selected location

The next cell will display the selected area on an interactive map. Feel free to zoom in and out to get a better understanding of the area you’ll be analysing. Clicking on any point of the map will reveal the latitude and longitude coordinates of that point.

display_map(x=lon_range, y=lat_range)

Load and view Sentinel-1 data

The first step in the analysis is to load Sentinel-1 backscatter data for the specified area of interest.

# Load the Sentinel-1 data
S1 = load_ard(dc=dc,
              measurements=['vv', 'vh'],
Using pixel quality parameters for Sentinel 1
Finding datasets
Applying pixel quality/cloud mask
Loading 3 time steps
<xarray.Dataset>
Dimensions:      (latitude: 1000, longitude: 1000, time: 3)
Coordinates:
  * time         (time) datetime64[ns] 2021-03-21T03:44:49.596587 ... 2021-03...
  * latitude     (latitude) float64 30.05 30.05 30.05 ... 29.85 29.85 29.85
  * longitude    (longitude) float64 32.44 32.44 32.44 ... 32.64 32.64 32.64
    spatial_ref  int32 4326
Data variables:
    vv           (time, latitude, longitude) float32 0.045829795 ... 0.15558568
    vh           (time, latitude, longitude) float32 0.009783918 ... 0.004633242
Attributes:
    crs:           EPSG:4326
    grid_mapping:  spatial_ref

Optional speckle filtering

Speckle filtering is not applied in this example because there is a high contrast between water and ship signals. Applying speckle filtering with a small smoothing window may help improve sensitivity for smaller ships.

An example of how to apply a speckle filter can be found in the radar water detection notebook.
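
For readers who want to experiment, here is a minimal sketch of a basic Lee filter written with NumPy only. It is not necessarily the implementation used in the radar water detection notebook: the function name lee_filter, the 5 x 5 window and the use of the whole-image variance as the noise estimate are all assumptions made for illustration, and the input is synthetic.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lee_filter(img, size=5):
    """Basic Lee speckle filter for a 2-D array of linear backscatter (odd window size)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    local_mean = windows.mean(axis=(-2, -1))
    local_var = windows.var(axis=(-2, -1))
    # Assumption: the variance of the whole image stands in for the noise variance
    noise_var = img.var()
    # Weight is near 0 in homogeneous areas (strong smoothing) and near 1 at strong features
    weight = local_var / (local_var + noise_var)
    return local_mean + weight * (img - local_mean)

# Synthetic scene: speckled dark water with one bright "ship" pixel
rng = np.random.default_rng(0)
water = 0.01 * rng.gamma(shape=4.0, scale=0.25, size=(50, 50))
water[25, 25] = 2.0
smoothed = lee_filter(water, size=5)

print("water std before:", water[:15, :15].std(), "after:", smoothed[:15, :15].std())
print("ship pixel before:", water[25, 25], "after:", smoothed[25, 25])
```

A filter like this would be applied to the linear vv or vh values, one time step at a time, before converting to decibels.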

Convert data to decibels (dB)

Sentinel-1 backscatter data has two measurements, VV and VH, which correspond to the polarisation of the microwave signal sent and received by the satellite. VV refers to the satellite sending out a vertically-polarised signal and receiving a vertically-polarised signal back, whereas VH refers to the satellite sending out a vertically-polarised signal and receiving a horizontally-polarised signal back. These two measurement bands provide different information about the area we’re studying.

When working with radar backscatter, it is common to work with the data in units of decibels (dB), rather than linear intensity. To convert from recorded Digital Number (DN) to dB in Sentinel-1 imagery, we use the following formula:

\[\begin{aligned} \text{dB} = 10 \times \log_{10}(\text{DN}). \end{aligned}\]
# Scale to plot data in decibels
S1["vh_dB"] = 10 * np.log10(S1.vh)
S1["vv_dB"] = 10 * np.log10(S1.vv)
/env/lib/python3.6/site-packages/xarray/core/ RuntimeWarning: divide by zero encountered in log10
  result_data = func(*input_data)
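
Pixels with a value of exactly zero have no defined logarithm, so the conversion produces -inf for them. The sketch below shows one way to avoid this by converting only positive values and leaving everything else as NaN. The helper name to_db is hypothetical and the input values are made up for illustration.

```python
import numpy as np

def to_db(linear):
    """Convert linear backscatter to dB, returning NaN for zero, negative or NaN input."""
    linear = np.asarray(linear, dtype="float64")
    out = np.full(linear.shape, np.nan)
    valid = linear > 0  # NaN compares as False, so missing data stays NaN
    out[valid] = 10 * np.log10(linear[valid])
    return out

db = to_db([1.0, 0.1, 0.015, 0.0])
print(db)
```

With xarray, masking first with something like S1.vv.where(S1.vv > 0) before calling np.log10 should have a similar effect.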

Visualise data before and after the event

We focus on the first and the last observations within this period of time. The ship blockage incident started on 23 March 2021 and lasted almost a week, so we inspect one image from before the event, and one during.

The images below show a high contrast between the dark water surface and the bright ships.

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(18, 7))

# Visualise baseline image before the event
S1.vv_dB.isel(time=0).plot(robust=True, ax=ax1)

# Visualise the image during the event
S1.vv_dB.isel(time=2).plot(robust=True, ax=ax2);

Extract open water area

Surface water can be mapped by thresholding radar backscatter. A detailed example is provided in the radar water detection notebook. To eliminate the impact of ships and waves, both of which cause elevated backscatter, the minimum backscatter value detected over time for each pixel can be used, for example:

water_mask = S1.vv.min(dim="time")<0.015
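
To see why the per-pixel minimum helps, the following sketch builds a tiny synthetic stack of linear backscatter in which ships appear in different places on different dates. All values are invented; only the 0.015 threshold is taken from the line above.

```python
import numpy as np

# Synthetic stack with shape (time, y, x): dark water everywhere
stack = np.full((3, 4, 4), 0.005)
stack[:, :, 0] = 0.2   # a strip of land that is bright in every image
stack[0, 1, 2] = 1.5   # a ship present only in the first image
stack[2, 3, 3] = 1.2   # another ship present only in the last image

# Thresholding a single date wrongly excludes the pixel under the ship
single_date_mask = stack[0] < 0.015

# The per-pixel minimum ignores transient bright targets
water_mask = stack.min(axis=0) < 0.015

print(single_date_mask.astype(int))
print(water_mask.astype(int))
```

Permanent bright features such as land stay excluded, while pixels that are only occasionally occupied by a ship are still classified as water.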

For this notebook, however, we use another readily available product in DE Africa, namely the Water Observations from Space (WOfS) annual summary to extract the open water area.

The water detection frequency measurement of the WOfS annual summary from the latest available year is loaded onto the Sentinel-1 pixel grid using the like option of dc.load() with bilinear resampling.

# Load WOfS summary through the datacube
wofs = dc.load(product='ga_ls8c_wofs_2_annual_summary',

Open water surface is extracted where water has been detected more than 80% of the year. For an optimal result, the mask is further adjusted to remove gaps and small water bodies.

# Select pixels that are classified as water > 80 % of the year
water_mask = wofs.frequency > 0.8
# close small holes within the water mask and remove a few pixels on the edge for cleaner result
water_mask = xr.DataArray(dilate(~dilate(water_mask, 3, invert=False), 5, invert=True),
# optional step to select only the largest water body

water_bodies = xr_vectorize(water_mask,
                            crs=S1.crs,  # WOfS CRS is not recognised, so use the S1 CRS (they are the same)
                            mask=water_mask.values == 1)

largest = water_bodies[water_bodies.area == water_bodies.area.max()]

# create mask
water_mask = xr_rasterize(largest, S1)
/env/lib/python3.6/site-packages/ UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

# final water mask
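
The effect of the clean-up step can be illustrated on a small synthetic mask. The sketch below uses a simple 3 x 3 square neighbourhood written in NumPy; this is an assumption for illustration and differs from the dilate function in deafrica_tools, whose kernel shape and sizes are not reproduced here. The helper names dilate3 and erode3 are hypothetical.

```python
import numpy as np

def dilate3(mask):
    """Binary dilation with a 3 x 3 square neighbourhood."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode3(mask):
    """Binary erosion, written as dilation of the complement."""
    return ~dilate3(~mask)

# Synthetic water mask: a 5 x 5 water body with a one-pixel hole (e.g. a moored ship)
water = np.zeros((9, 9), dtype=bool)
water[2:7, 2:7] = True
water[4, 4] = False

closed = erode3(dilate3(water))  # dilation then erosion fills the hole
trimmed = erode3(closed)         # a further erosion pulls the mask back from the shore

print(water.sum(), closed.sum(), trimmed.sum())
```

Trimming the edge of the mask reduces the chance that bright shoreline pixels are later counted as ships.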

Apply water mask and threshold for ship detection

In this example, a threshold of 0 dB is chosen. Ships are detected where backscatter values are higher than this threshold. The binary image is vectorised so pixels from the same ship are grouped as one object.

Visual inspection confirms reasonable detection of large cargo ships. The threshold can be adjusted for different areas and vessel types. With known ship locations, the threshold can be optimised using training data.
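
As an illustration of what optimising the threshold with training data could look like, the sketch below sweeps a set of candidate thresholds and keeps the one with the highest F1 score. The labelled samples, the candidate range and the function name best_threshold are all hypothetical; real labels would come from known ship positions.

```python
import numpy as np

def best_threshold(values_db, is_ship, candidates):
    """Return the candidate threshold (dB) with the highest F1 score, and that score."""
    best, best_f1 = None, -1.0
    for thresh in candidates:
        predicted = values_db > thresh
        tp = np.sum(predicted & is_ship)    # ships correctly detected
        fp = np.sum(predicted & ~is_ship)   # water flagged as ship
        fn = np.sum(~predicted & is_ship)   # ships missed
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:  # ties keep the lowest threshold tried first
            best, best_f1 = thresh, f1
    return best, best_f1

# Hypothetical labelled pixels: backscatter in dB and whether each one is a ship
values_db = np.array([-22.0, -19.0, -15.0, -8.0, -3.0, 1.0, 4.0, 7.0])
is_ship = np.array([False, False, False, False, True, True, True, True])

thresh, f1 = best_threshold(values_db, is_ship, candidates=np.arange(-20, 6, 2))
print("best threshold (dB):", thresh, "F1:", f1)
```

F1 is only one possible criterion. If missing small vessels matters more than false alarms, a recall-weighted score may be a better fit.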

# set ship detection threshold in vv to 0 dB
ship_vv_db = 0
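def detect_ships(da, time_idx, thresh, crs=S1.crs, transform=S1.geobox.transform):
    # threshold one time step, then group connected bright pixels into polygons
    S1_ships = da.isel(time=time_idx) > thresh
    ships_vector = xr_vectorize(S1_ships.values,
                                crs=crs,
                                transform=transform,
                                mask=S1_ships.values == 1)
    return ships_vector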
time_idx = 0
ships_time0 = detect_ships(S1.vv_dB.where(water_mask), time_idx, ship_vv_db)
ships_time0.to_file(
    f'ships_{str(S1.time.values[time_idx])[0:10]}.geojson', driver='GeoJSON')
print("Number of ships detected at this time:", len(ships_time0))
Number of ships detected at this time: 55
time_idx = 2
ships_time2 = detect_ships(S1.vv_dB.where(water_mask), time_idx, ship_vv_db)
ships_time2.to_file(
    f'ships_{str(S1.time.values[time_idx])[0:10]}.geojson', driver='GeoJSON')
print("Number of ships detected at this time:", len(ships_time2))
Number of ships detected at this time: 71
# visualize the ship locations

fig, ax = plt.subplots(1, 2, figsize=(10, 10), sharex=True, sharey=True)
ax[0].set_title('vessels before the event')
ax[1].set_title('vessels after the event');
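
The counting step works because neighbouring bright pixels are grouped into a single object. The sketch below reproduces that idea on a synthetic array with a small flood fill instead of vectorisation. The function name count_objects is hypothetical, the data are invented, and the choice of 8-connectivity (diagonal neighbours count as touching) is an assumption that may differ from the grouping used by xr_vectorize.

```python
from collections import deque
import numpy as np

def count_objects(binary):
    """Count groups of 8-connected True pixels with a breadth-first flood fill."""
    binary = np.asarray(binary, dtype=bool)
    seen = np.zeros_like(binary)
    n_rows, n_cols = binary.shape
    count = 0
    for r, c in zip(*np.nonzero(binary)):
        if seen[r, c]:
            continue
        count += 1  # first unvisited pixel of a new object
        seen[r, c] = True
        queue = deque([(r, c)])
        while queue:
            y, x = queue.popleft()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and binary[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
    return count

# Synthetic VV backscatter in dB: water near -20 dB with two bright vessels
vv_db = np.full((8, 8), -20.0)
vv_db[1, 1:4] = 5.0   # a three-pixel ship
vv_db[5:7, 6] = 3.0   # a two-pixel ship
vv_db[0, 7] = np.nan  # a masked (land) pixel

ships = vv_db > 0     # NaN compares as False, so masked pixels are never ships
n_ships = count_objects(ships)
print("Number of ships detected:", n_ships)
```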

Caveats and possible improvements

We have applied a simple thresholding method to identify ships in the above sections. Only VV backscatter has been used and no speckle filtering has been done. This method is based on the assumption that ships produce very high backscatter signals and all bright objects within the water area are ships.

Additional analysis may help improve the method:

  • The threshold for ship pixels is chosen based on visual assessment. The threshold is relatively high, so smaller ships may be missed. It may be optimised with labelled training data for specific use cases.

  • It is not clear if all the bright objects near the piers are ships. The location and shape of the objects may be used to remove false positives.

  • Rigid structures onboard the ships may result in multiple disconnected bright spots over one ship. These smaller objects may be grouped to give a more reliable ship count.
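
One possible way to group disconnected bright spots is to merge detections whose centroids lie close together. The sketch below does this with a small union-find structure. The centroids are hypothetical values in metres (that is, in a projected CRS rather than degrees), and the 150 m merge distance and the function name group_detections are assumptions chosen only for illustration.

```python
import math

def group_detections(centroids, max_dist):
    """Merge detections whose centroids lie within max_dist of each other.

    Returns a list of groups, each a list of indices into centroids.
    Chains are merged transitively: if A is near B and B is near C, all three are grouped.
    """
    parent = list(range(len(centroids)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Hypothetical centroids in metres, not real detections
centroids = [(0, 0), (120, 10), (250, 30), (5000, 5000)]
groups = group_detections(centroids, max_dist=150)
print("detections:", len(centroids), "grouped ships:", len(groups))
```

The merge distance should be on the order of the length of the vessels of interest. Too large a value will merge ships that are moored side by side.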

Despite the above limitations, we demonstrate that with a few analysis steps, DE Africa’s Sentinel-1 backscatter can be used to detect and count large ships.

Identify shipping lanes

Ship locations detected across time can be used to map out popular routes.

In the cells below, we load all Sentinel-1 observations from 2020. Plotting maximum backscatter values over time allows clear identification of the shipping lanes.

Data is lazy-loaded using the dask_chunks option to reduce the memory requirement.

# Load the Sentinel-1 data
S1 = load_ard(dc=dc,
              measurements=['vv', 'vh'],
Using pixel quality parameters for Sentinel 1
Finding datasets
Applying pixel quality/cloud mask
Returning 222 time steps as a dask array
S1.vh.where(water_mask).max(dim='time').plot.imshow(robust=True, size=10);
/env/lib/python3.6/site-packages/dask/ RuntimeWarning: All-NaN slice encountered
  return func(*args, **kwargs)
/env/lib/python3.6/site-packages/toolz/ RuntimeWarning: All-NaN slice encountered
  ret = f(ret)
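
The reason a maximum composite reveals shipping lanes can be shown with synthetic data: any single image contains only a few ships, but the per-pixel maximum over many dates accumulates every position a ship has occupied. The array sizes, backscatter values and the diagonal "lane" below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_times, size = 50, 20

# Dark water with mild noise (linear backscatter)
stack = 0.002 + 0.001 * rng.random((n_times, size, size))

# In each image, place one bright ship at a random position along a diagonal lane
for t in range(n_times):
    i = rng.integers(0, size)
    stack[t, i, i] = 0.5

single = stack[0]
composite = stack.max(axis=0)  # per-pixel maximum over time

print("bright pixels in one image:", int((single > 0.1).sum()))
print("bright pixels in max composite:", int((composite > 0.1).sum()))
```

In real data the masked land pixels are NaN in every time step, which is what triggers the All-NaN slice warnings shown above; those pixels simply remain NaN in the composite.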

Next steps

When you are done, return to the “Analysis parameters” cell, modify some values (e.g. latitude and longitude) and rerun the analysis.

There are a number of key ports covered by Sentinel-1 data in Africa. The available data can be viewed on the DEAfrica Explorer, but we also list the latitude and longitude coordinates for a few key ports below.

Port of Durban in South Africa

latitude = -29.87
longitude = 31.03

Port of Dar Es Salaam in Tanzania

latitude = -6.83
longitude = 39.29

Port de Tanger Med in Morocco

latitude = 35.86
longitude = -5.53
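
To switch between these ports without retyping coordinates, the values above can be kept in a small lookup table and turned into the ranges expected by the analysis parameters. This is only a convenience sketch; the dictionary ports and the helper bounding_box are hypothetical names, and the 0.1 degree buffer is the notebook default.

```python
# Port coordinates as (latitude, longitude), copied from the list above
ports = {
    "Durban": (-29.87, 31.03),
    "Dar es Salaam": (-6.83, 39.29),
    "Tanger Med": (35.86, -5.53),
}

def bounding_box(lat, lon, buffer=0.1):
    """Return (lat_range, lon_range) tuples in the form used by the analysis parameters."""
    return (lat - buffer, lat + buffer), (lon - buffer, lon + buffer)

lat_range, lon_range = bounding_box(*ports["Durban"])
print(lat_range, lon_range)
```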

Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on GitHub.

Compatible datacube version:


Last Tested:

from datetime import datetime
datetime.today().strftime('%Y-%m-%d')