A DE Africa customized version of the tiledsegsingle.py module from the Python package RSGISLib. It has been adapted to run a tiled, parallel image segmentation across a specified number of CPUs.
NOTE: the only function that 99% of users will need to call is “performTiledSegmentation”
Documentation for the RSGISlib Image Segmentation Module can be found here: https://www.rsgislib.org/rsgislib_segmentation.html
The code in this notebook is licensed under the Apache License, Version 2.0 (https://www.apache.org/licenses/LICENSE-2.0). Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license (https://creativecommons.org/licenses/by/4.0/).
If you need assistance, please post a question on the Open Data Cube Slack channel (http://slack.opendatacube.org/) or on the GIS Stack Exchange (https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the open-data-cube tag (you can view previously asked questions here: https://gis.stackexchange.com/questions/tagged/open-data-cube).
If you would like to report an issue with this script, file one on GitHub: https://github.com/GeoscienceAustralia/dea-notebooks/issues/new
A class for running the tiled version of the Shepherd segmentation algorithm. This can process larger images than the single-scene version with a smaller memory footprint.
This version has been adapted to run over multiple CPUs.
This class is not intended to be used directly. Please use the function performTiledSegmentation to call this functionality.
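As a rough illustration of what "tiled" means here, the sketch below splits an image extent into fixed-size windows. It is not the module's own code: the function name tile_windows and the example image size are hypothetical, and the real module does more than this (for example, merging tiles that contain too little valid data, as described for validDataThreshold).

```python
def tile_windows(image_width, image_height, tile_width=2000, tile_height=2000):
    """Yield (x_offset, y_offset, width, height) windows that cover the image.

    Hypothetical helper for illustration only; edge tiles are clipped
    to the image extent.
    """
    for y_off in range(0, image_height, tile_height):
        for x_off in range(0, image_width, tile_width):
            width = min(tile_width, image_width - x_off)
            height = min(tile_height, image_height - y_off)
            yield (x_off, y_off, width, height)


# Example: a 5000 x 3000 pixel image with the default 2000 x 2000 tiles
# gives 3 columns x 2 rows of windows.
windows = list(tile_windows(5000, 3000))
for window in windows:
    print(window)
```

Each window could then be segmented independently, which is what makes it possible to spread the work over several CPUs.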
performTiledSegmentation(inputImage, clumpsImage, tmpDIR='segtmp', tileWidth=2000, tileHeight=2000, validDataThreshold=0.3, numClusters=60, minPxls=100, distThres=100, bands=None, sampling=100, kmMaxIter=200, ncpus=1)
Utility function to call the segmentation algorithm of Shepherd et al. (2014) using the tiled process outlined in Clewley et al. (2015). Adapted here to run tiles across multiple CPUs. Use this function to conduct image segmentation on very large GeoTIFFs.
inputImage (str) – is a string containing the name of the input file.
clumpsImage (str) – is a string containing the name of the output clump file.
tmpDIR (str) – is a file path for intermediate files (default is to create a directory ‘segtmp’). If the path does not currently exist then it will be created and deleted afterwards.
tileWidth (int) – is an int specifying the width of the tiles used for processing (Default 2000)
tileHeight (int) – is an int specifying the height of the tiles used for processing (Default 2000)
validDataThreshold (float) – is a float (value between 0 and 1) specifying the proportion of valid image pixels (i.e., not the no data value of zero) that a tile must contain. Tiles failing to meet this threshold are merged with ones which do (Default 0.3).
numClusters (int) – is an int which specifies the number of clusters within the KMeans clustering (default = 60).
minPxls (int) – is an int which specifies the minimum number of pixels within a segment (default = 100).
distThres (int) – specifies the distance threshold for joining the segments (default = 100, set to a large number to turn off this option).
bands (array-like) – is an array providing a subset of image bands to use (default is None to use all bands).
sampling – specifies the subsampling of the image for the data used within the KMeans (default = 100; 1 == no subsampling).
kmMaxIter – maximum iterations for KMeans (Default 200).
ncpus (int) – is an int specifying the number of CPUs to run the tiles across (default = 1).
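To make the validDataThreshold parameter more concrete, the sketch below computes the fraction of valid pixels in a small single-band tile and compares it with the threshold. The helper names are hypothetical, and the use of ">=" and of a single band are assumptions made for illustration; the module's own check may differ in detail.

```python
def valid_fraction(tile, nodata=0):
    """Return the fraction of pixels in a 2D tile that are not the no data value."""
    total = sum(len(row) for row in tile)
    if total == 0:
        return 0.0
    valid = sum(1 for row in tile for pixel in row if pixel != nodata)
    return valid / total


def meets_threshold(tile, valid_data_threshold=0.3, nodata=0):
    """Assumed rule: a tile passes when its valid fraction reaches the threshold."""
    return valid_fraction(tile, nodata) >= valid_data_threshold


# A 3 x 4 tile with only 2 valid pixels out of 12 (about 0.17),
# which falls below the default threshold of 0.3.
sparse_tile = [
    [0, 0, 0, 0],
    [0, 5, 7, 0],
    [0, 0, 0, 0],
]
print(valid_fraction(sparse_tile), meets_threshold(sparse_tile))
```

Under the behaviour described above, a tile like sparse_tile would be merged with a neighbouring tile that does meet the threshold rather than being segmented on its own.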
Return type
Segmented .kea file stored on disk at the location of ‘clumpsImage’
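A call might look like the sketch below. The keyword names and defaults are taken from the signature above; the wrapper run_segmentation, the file names, and the import path in the comments are hypothetical and will depend on how this module is installed in your environment.

```python
def run_segmentation(segment_fn, input_image, clumps_image, ncpus=1):
    """Call a performTiledSegmentation-style function with explicit settings.

    segment_fn is expected to be performTiledSegmentation; it is passed in
    as an argument so this sketch does not assume a particular import path.
    """
    return segment_fn(
        input_image,
        clumps_image,
        tmpDIR="segtmp",          # directory for intermediate files
        tileWidth=2000,
        tileHeight=2000,
        validDataThreshold=0.3,
        numClusters=60,
        minPxls=100,
        distThres=100,
        bands=None,               # None uses all bands
        sampling=100,
        kmMaxIter=200,
        ncpus=ncpus,
    )


# Hypothetical usage (adjust the import and file names to your setup):
# from your_module import performTiledSegmentation
# run_segmentation(performTiledSegmentation, "input.tif", "output_clumps.kea", ncpus=4)
```

Spelling out every keyword is optional, since each has a default, but it keeps the settings used for a given run visible in one place.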