PANGAEA.
Data Publisher for Earth & Environmental Science

Tait, Alexander M; Brumby, Steven P; Hyde, Samantha Brooks; Mazzariello, Joseph; Corcoran, Melanie (2021): Dynamic World training dataset for global land use and land cover categorization of satellite imagery [dataset]. PANGAEA, https://doi.org/10.1594/PANGAEA.933475

Always cite the reference above when using these data.

Abstract:
The Dynamic World Training Data is a dataset of over 5 billion pixels of human-labeled ESA Sentinel-2 satellite imagery, distributed over 24,000 tiles collected from all over the world. The dataset is designed to train and validate automated land use and land cover mapping algorithms. The 10 m resolution, 5.1 km by 5.1 km tiles are densely labeled using a ten-category classification schema indicating general land use and land cover categories. The dataset was created between 2019-08-01 and 2020-02-28, using satellite imagery observations from 2019, with approximately 10% of observations extending back to 2017 in very cloudy regions of the world. This dataset is a component of the National Geographic Society - Google - World Resources Institute Dynamic World project.
The dataset consists of two file types: GeoTIFF files containing the markup provided by human labelers for 510x510 pixel, 10 m resolution satellite image tiles, and Excel (.xlsx) tables of metadata and class statistics for those GeoTIFF files. The data is organized into three main folders. One folder contains training data labeled by a team of 25 expert human labelers recruited by National Geographic Society specifically for this project. A second folder contains training data labeled by a larger group of commissioned labelers provided by a commercial crowd-labeler service. The data in these two folders is organized by hemisphere and by biome number from the RESOLVE Ecoregions2017 biome categories (https://ecoregions2017.appspot.com/). A third folder contains a validation dataset: a holdout set for assessing model accuracy, none of which is intended to be used in the formulation of the model. Each validation tile was independently labeled by three experts. The validation set is provided in two versions: the individual markup from each expert labeler, and image composites of the individual markups.
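As a quick consistency check on the tile geometry described above, the following minimal sketch recomputes the tile side length and pixel counts from the quoted figures. The constant names are illustrative and not part of the dataset, and the tile count of 24,000 is treated as approximate.

```python
# Figures quoted in the abstract; the tile count is treated as approximate.
TILE_SIDE_PX = 510          # pixels per tile side
PIXEL_SIZE_M = 10           # ground resolution in metres
APPROX_TILE_COUNT = 24_000  # number of tiles quoted in the abstract

# Side length of one tile on the ground, in kilometres.
tile_side_km = TILE_SIDE_PX * PIXEL_SIZE_M / 1000

# Pixels in one tile, and the most pixels that many tiles could hold.
pixels_per_tile = TILE_SIDE_PX * TILE_SIDE_PX
max_total_pixels = pixels_per_tile * APPROX_TILE_COUNT

print(f"tile side: {tile_side_km} km")
print(f"pixels per tile: {pixels_per_tile:,}")
print(f"upper bound on pixels: {max_total_pixels:,}")
```

The tile side matches the 5.1 km stated in the abstract. The upper bound is somewhat above the "over 5 billion" labeled pixels, which would be consistent with some pixels in each tile being left unmarked.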
Each GeoTIFF file encodes information on the location of landscape feature classes as determined by a given labeler. Classes were labeled by visual examination of true color (RGB) composites of Sentinel-2 MultiSpectral Level-2A scenes. The Tier 1 class values used in this phase of the project are as follows: 0 No data (left unmarked), 1 Water, 2 Trees, 3 Grass, 4 Flooded Vegetation, 5 Crops, 6 Scrub, 7 Built Area, 8 Bare Ground, 9 Snow/Ice, 10 Cloud. This dataset does not include the original Sentinel-2 imagery tiles, but metadata on the exact image ID and date is provided. The original Sentinel-2 imagery was obtained via Google Earth Engine.
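The Tier 1 class values listed above can be expressed as a lookup table. The sketch below is illustrative only: it assumes the class values appear as integer pixel values once a label tile has been read into rows of numbers (reading the GeoTIFF itself would need a third-party raster library and is not shown), and the names `TIER1_CLASSES` and `class_fractions` are hypothetical, not part of the dataset.

```python
from collections import Counter

# Tier 1 class values as listed in the abstract.
TIER1_CLASSES = {
    0: "No data (left unmarked)",
    1: "Water",
    2: "Trees",
    3: "Grass",
    4: "Flooded Vegetation",
    5: "Crops",
    6: "Scrub",
    7: "Built Area",
    8: "Bare Ground",
    9: "Snow/Ice",
    10: "Cloud",
}


def class_fractions(label_rows):
    """Return the fraction of pixels per class name for one label tile.

    `label_rows` is an iterable of rows of integer class values
    (assumption: one integer per pixel, as decoded from a label tile).
    """
    counts = Counter(value for row in label_rows for value in row)
    unknown = set(counts) - set(TIER1_CLASSES)
    if unknown:
        raise ValueError(f"unexpected class values: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {TIER1_CLASSES[v]: n / total for v, n in sorted(counts.items())}


# Tiny made-up 2x3 grid standing in for a 510x510 label tile.
example_tile = [
    [1, 1, 2],
    [0, 5, 5],
]
example_fractions = class_fractions(example_tile)
for name, fraction in example_fractions.items():
    print(f"{name}: {fraction:.2f}")
```

Rejecting values outside 0 to 10 is a design choice of this sketch, so that a file read with the wrong band or data type fails loudly rather than producing misleading statistics.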
This data is available under a Creative Commons BY-4.0 license and requires the following attribution: This dataset is produced for the Dynamic World Project by National Geographic Society in partnership with Google and the World Resources Institute. Development of the Dynamic World training data was funded in part by the Gordon and Betty Moore Foundation.
Keyword(s):
land use and land cover; satellite image analysis
Coverage:
Median Latitude: 12.671000 * Median Longitude: -177.328000 * South-bound Latitude: -55.508000 * West-bound Longitude: 179.626000 * North-bound Latitude: 80.850000 * East-bound Longitude: -174.282000
Date/Time Start: 2017-03-28T00:00:00 * Date/Time End: 2019-12-12T00:00:00
Event(s):
LandUseCover_2017_2019 * Latitude Start: -55.508000 * Longitude Start: -174.282000 * Latitude End: 80.850000 * Longitude End: 179.626000 * Date/Time Start: 2017-03-28T00:00:00 * Date/Time End: 2019-12-12T00:00:00 * Method/Device: Satellite imagery (SATI)
Parameter(s):
# | Name | Short Name | Unit | Principal Investigator | Method/Device | Comment
1 | File content | Content | | Tait, Alexander M | |
2 | Binary Object (File Size) | Binary (Size) | Bytes | Tait, Alexander M | |
3 | Binary Object | Binary | | Tait, Alexander M | |
Status:
Curation Level: Basic curation (CurationLevelB)
Size:
10 data points
