
Electronic data

  • RSE-D-23-00469_R3_manuscript

    Accepted author manuscript, 6.02 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License



Very fine spatial resolution urban land cover mapping using an explicable sub-pixel mapping network based on learnable spatial correlation

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Article number: 113884
Journal publication date: 15/12/2023
Journal: Remote Sensing of Environment
Publication Status: Published
Early online date: 1/11/23
Original language: English


Sub-pixel mapping is the prevailing approach for dealing with the mixed pixel effect in urban land use/land cover classification: it reconstructs the sub-pixel-scale distribution inside each mixed pixel based on spatial autocorrelation. However, 1) traditional spatial autocorrelation is limited to a local window, so it cannot model the teleconnection between two locations or objects that are far apart, and 2) autocorrelation is based on the idea of “the more proximate, the more similar”, which relies on a distance-weight decay parameter and cannot characterize the rich variety of mutual information in spatially heterogeneous urban areas. In this research, we develop and demonstrate a learnable correlation-based sub-pixel mapping (LECOS) method. 1) We use the “mutual retrieval” mechanism of the self-attention operation to model teleconnections, enabling more distant locations or objects to be mutually correlated, and 2) we design a parameter-free “self-attention in self-attention” operation to adaptively learn the diverse global correlation patterns between pixel and sub-pixel. The learned spatial correlations are then used to infer the sub-pixel-scale distribution of each class. We validated our method on the most challenging public datasets of urban scenes, which exhibit considerable spatial heterogeneity with complex structures and broken objects. The learned building-tree, building-road and road-tree correlation patterns contributed most to the sub-pixel reconstruction result of the urban scenes, consistent with in-situ reference data.
We further explored the model's explicability over large areas of several metropolises in China by mapping land cover in these cities at a very fine spatial resolution of 2 m from 10 m Sentinel-2 input images. The derived result not only revealed rich urban spatial heterogeneity, but the learned correlation was also indicative of urban pattern dynamics, suggesting the potential for greater understanding of issues such as urban fairness, accessibility, human exposure and sustainability.
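
The contrast the abstract draws between distance-decay autocorrelation and attention-based correlation can be illustrated with a minimal sketch. This is not the LECOS network: it uses plain scaled dot-product attention without learned projections, it does not reproduce the “self-attention in self-attention” operation, and the function names, toy coordinates and toy feature vectors are all hypothetical. It shows only that a local distance-decay window gives zero weight to a distant location, whereas attention weights depend on feature similarity and can link locations that are far apart.

```python
import math


def distance_decay_weights(coords, center, radius, decay):
    """Local-window weights: exp(-d / decay) inside the window, 0 outside.

    Illustrative only; `radius` and `decay` are hypothetical parameters.
    """
    weights = []
    for r, c in coords:
        d = math.hypot(r - center[0], c - center[1])
        weights.append(math.exp(-d / decay) if d <= radius else 0.0)
    total = sum(weights)
    return [w / total for w in weights]


def attention_weights(features, query_index):
    """Softmax of scaled dot products between one query and every location.

    No learned projections are applied, so this is only the bare
    attention weighting, not a trained model.
    """
    q = features[query_index]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scene (hypothetical): locations 0 and 3 are far apart but have
# identical features; locations 1 and 2 are near location 0 but differ.
coords = [(0, 0), (0, 1), (0, 2), (0, 9)]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

local = distance_decay_weights(coords, center=(0, 0), radius=3.0, decay=1.0)
attn = attention_weights(features, query_index=0)

print("distance-decay weights:", [round(w, 3) for w in local])
print("attention weights:     ", [round(w, 3) for w in attn])
```

In this toy case the distant location falls outside the window and receives no distance-decay weight, while attention gives it the same weight as the query location itself because their features match.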