Machine Learning-Based Processing Proof-of-Concept Pipeline for Semi-Automatic Sentinel-2 Imagery Download, Cloudiness Filtering, Classifications, and Updates of Open Land Use/Land Cover Datasets
Authors | |
---|---|
Year of publication | 2021 |
Type | Article in Periodical |
Magazine / Source | ISPRS International Journal of Geo-Information |
MU Faculty or unit | |
Citation | |
web | https://doi.org/10.3390/ijgi10020102 |
Doi | http://dx.doi.org/10.3390/ijgi10020102 |
Keywords | machine learning; land use; land cover; satellite imagery; Sentinel 2; image classification; cloud masking; LightGBM estimator |
Description | Land use and land cover are continuously changing in today's world. Both domains, therefore, have to rely on updates of external information sources from which the relevant land use/land cover (classification) is extracted. Satellite images are frequent candidates due to their temporal and spatial resolution. On the contrary, the extraction of relevant land use/land cover information is demanding in terms of knowledge base and time. The presented approach offers a proof-of-concept machine-learning pipeline that takes care of the entire complex process in the following manner. The relevant Sentinel-2 images are obtained through the pipeline. Later, cloud masking is performed, including the linear interpolation of merged-feature time frames. Subsequently, four-dimensional arrays are created with all potential training data to become a basis for estimators from the scikit-learn library; the LightGBM estimator is then used. Finally, the classified content is applied to the open land use and open land cover databases. The verification of the provided experiment was conducted against detailed cadastral data, to which Shannon's entropy was applied since the number of cadaster information classes was naturally consistent. The experiment showed a good overall accuracy (OA) of 85.9%. It yielded a classified land use/land cover map of the study area consisting of 7188 km2 in the southern part of the South Moravian Region in the Czech Republic. The developed proof-of-concept machine-learning pipeline is replicable to any other area of interest so far as the requirements for input data are met. |
Related projects: |