Maus, Victor; Giljum, Stefan; Gutschlhofer, Jakob; da Silva, Dieison M; Probst, Michael; Gass, Sidnei L B; Luckeneder, Sebastian; Lieber, Mirko; McCallum, Ian (2020): Global-scale mining polygons (Version 1). PANGAEA,

Always quote the citation above when using the data!

This data set provides spatially explicit estimates of the area directly used for surface mining on a global scale. It contains more than 21,000 polygons of activities related to mining, mainly of coal and metal ores. Several data sources were compiled to identify the approximate location of mines active at any time between 2000 and 2017. This data set does not cover all existing mining locations across the globe.

The polygons were delineated by experts using Sentinel-2 cloudless (by EOX IT Services GmbH; contains modified Copernicus Sentinel data 2017 & 2018) and very high-resolution satellite images available from Google Satellite and Bing Imagery. The derived polygons cover the land directly used by mining activities, including open cuts, tailings dams, waste rock dumps, water ponds, and processing infrastructure.

The main data set consists of a GeoPackage (GPKG) file with the following variables: ISO3_CODE<string>, COUNTRY_NAME<string>, AREA<double> in square kilometres, FID<integer> with the feature ID, and geom<polygon> in geographical coordinates WGS84. A summary of the mining area per country is available as a comma-separated values (CSV) file with the following variables: ISO3_CODE<string>, COUNTRY_NAME<string>, AREA<double> in square kilometres, and N_FEATURES<integer> with the number of mapped features.

Grid data sets with the mining area per cell were derived from the polygons. The grid data are available at 30 arc-second resolution (approximately 1x1 km at the equator), 5 arc-minute resolution (approximately 10x10 km at the equator), and 30 arc-minute resolution (approximately 55x55 km at the equator).

We performed an independent validation of the mining data set using control points. For that, we drew 1,000 random samples stratified between two classes: mine and no-mine. The control points are also available as a GPKG file with the following variables: MAPPED<string>, REFERENCE<string>, FID<integer> with the feature ID, and geom<point> in geographical coordinates WGS84. The overall accuracy calculated from the control points was 88.4%; other accuracy metrics are shown below.
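
As a rough check on the grid resolutions quoted in the description, the following minimal Python sketch converts an angular cell size to kilometres. It is not part of the data set: the function name `cell_size_km` is hypothetical, and the calculation assumes a spherical Earth with the WGS84 equatorial radius, so the results are approximations only.

```python
import math

# Assumption: spherical Earth using the WGS84 equatorial radius (km).
EQUATORIAL_RADIUS_KM = 6378.137


def cell_size_km(arc_seconds: float, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Return the approximate (east-west, north-south) cell size in km.

    Hypothetical helper for illustration; not shipped with the data set.
    """
    km_per_degree = 2 * math.pi * EQUATORIAL_RADIUS_KM / 360.0
    degrees = arc_seconds / 3600.0
    north_south = km_per_degree * degrees
    # East-west extent shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


if __name__ == "__main__":
    resolutions = [("30 arc-second", 30), ("5 arc-minute", 300), ("30 arc-minute", 1800)]
    for label, seconds in resolutions:
        ew, ns = cell_size_km(seconds)
        print(f"{label}: {ew:.2f} x {ns:.2f} km at the equator")
```

Under these assumptions the three resolutions come out at roughly 0.93 km, 9.28 km, and 55.66 km at the equator, consistent with the approximate figures in the description. Passing a non-zero `latitude_deg` shows how the east-west extent of a cell narrows away from the equator.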
Confusion Matrix and Statistics

              Reference
Prediction    Mine    No-mine
Mine           394        106
No-mine         10        490

Accuracy                 : 0.884
95% CI                   : (0.8625, 0.9032)
No Information Rate      : 0.596
P-Value [Acc > NIR]      : < 2.2e-16
Kappa                    : 0.768
Mcnemar's Test P-Value   : < 2.2e-16

Sensitivity              : 0.9752
Specificity              : 0.8221
Pos Pred Value           : 0.7880
Neg Pred Value           : 0.9800
Precision                : 0.7880
Recall                   : 0.9752
F1                       : 0.8717
Prevalence               : 0.4040
Detection Rate           : 0.3940
Detection Prevalence     : 0.5000
Balanced Accuracy        : 0.8987
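
Most of the statistics above follow directly from the four counts in the confusion matrix. The sketch below recomputes them in plain Python as a cross-check. It assumes that rows are the mapped (predicted) class, columns are the reference class, and "mine" is the positive class, which is the reading consistent with the reported sensitivity and specificity. The function name `accuracy_metrics` is hypothetical, and the confidence interval and p-values are not reproduced here.

```python
def accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common accuracy metrics from a 2x2 confusion matrix.

    Hypothetical helper for illustration; "mine" is the positive class.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Agreement expected by chance, used for Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "no_information_rate": max(tp + fn, fp + tn) / n,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "neg_pred_value": npv,
        "f1": f1,
        "prevalence": (tp + fn) / n,
        "detection_rate": tp / n,
        "detection_prevalence": (tp + fp) / n,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Counts from the confusion matrix: mapped mine = (394, 106), mapped no-mine = (10, 490).
metrics = accuracy_metrics(tp=394, fp=106, fn=10, tn=490)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name:22s}: {value:.4f}")
```

The high sensitivity combined with a lower precision reflects the asymmetry in the matrix: 106 control points mapped as mine were no-mine in the reference, while only 10 reference mine points were mapped as no-mine.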
This work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, grant number 725525 (FINEPRINT project).
coal; land-use; metal ores; minerals; raw material extraction
Supplement to:
18.4 MBytes
