Spatio-temporal fusion is a technique used to produce images with both fine spatial and temporal resolution. Generally, the principle of existing spatio-temporal fusion methods can be characterized by a unified framework of prediction based on two parts: (i) the known fine spatial resolution images (e.g., Landsat images), and (ii) the fine spatial resolution increment predicted from the available coarse spatial resolution increment (i.e., a downscaling process), that is, the difference between the coarse spatial resolution images (e.g., MODIS images) acquired at the known and prediction times. Owing to seasonal changes and land cover changes, there always exist large differences between images acquired at different times, resulting in a large increment and, further, great uncertainty in downscaling. In this paper, a virtual image pair-based spatio-temporal fusion (VIPSTF) approach was proposed to deal with this problem. VIPSTF is based on the concept of a virtual image pair (VIP), which is produced based on the available, known MODIS-Landsat image pairs. We demonstrate theoretically that compared to the known image pairs, the VIP is closer to the data at the prediction time. The VIP can capture more fine spatial resolution information directly from known images and reduce the challenge in downscaling. VIPSTF is a flexible framework suitable for existing spatial weighting- and spatial unmixing-based methods, and two versions VIPSTF-SW and VIPSTF-SU are, thus, developed. Experimental results on a heterogeneous site and a site experiencing land cover type changes show that both spatial weighting- and spatial unmixing-based methods can be enhanced by VIPSTF, and the advantage is particularly noticeable when the observed image pairs are temporally far from the prediction time. Moreover, VIPSTF is free of the need for image pair selection and robust to the use of multiple image pairs. VIPSTF is also computationally faster than the original methods when using multiple image pairs. The concept of VIP provides a new insight to enhance spatio-temporal fusion by making fuller use of the observed image pairs and reducing the uncertainty of estimating the fine spatial resolution increment. © 2020 Elsevier Inc.
This is the author’s version of a work that was accepted for publication in Remote Sensing of Environment. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Remote Sensing of Environment, 249, 2020 DOI: 10.1016/j.rse.2020.112009