Extensive cloud contamination severely hinders the interpretation of optical remote sensing images. Existing cloud removal methods focus primarily on the reconstruction of individual cloudy images, and few studies address the reconstruction of cloudy time-series images. Furthermore, current methods tend to prioritize cloud-free auxiliary images while overlooking valuable information in cloudy auxiliary images that are temporally closer to the target cloudy image. In this article, we propose a deep network called Res-cLSTM to reconstruct cloudy time-series images. Res-cLSTM processes the time-series images sequentially using a convolutional LSTM, synthesizing long- and short-term memory streams to capture the complex temporal relationships among them. Res-cLSTM then decodes the feature maps using a refined residual module with skip connections to produce the final output. Simulated and real cloud removal experiments on Landsat 8 OLI time-series data across five different regions demonstrate that Res-cLSTM is an effective cloud removal method that produces more accurate predictions than three benchmark approaches. For example, in the reconstruction of the cloudy time series over three simulated cloudy areas, the average correlation coefficient (CC) of the Res-cLSTM predictions is about 0.01, 0.04, and 0.04 larger than that of the second most accurate method [i.e., autoencoder (AE)]. As a lightweight network, Res-cLSTM does not require global sampling of training data and can fully exploit the valuable information in the noncloud regions of cloudy time-series images to facilitate cloud removal. Moreover, Res-cLSTM is robust to thin cloud omission and exhibits a faster convergence rate, and thus holds great potential for practical applications requiring real-time processing.
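To make the two building blocks named above concrete, the following minimal PyTorch sketch shows a generic convolutional LSTM cell (which maintains the long-term cell memory and short-term hidden memory streams) followed by a residual decoding block with a skip connection. This is an illustrative sketch only: the layer widths, kernel sizes, band count, and the names ConvLSTMCell, ResidualRefineBlock, and head are assumptions for exposition, not the actual Res-cLSTM implementation.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Generic convolutional LSTM cell: all four gates come from one convolution."""
    def __init__(self, in_ch, hidden_ch, kernel_size=3):
        super().__init__()
        self.hidden_ch = hidden_ch
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.split(self.gates(torch.cat([x, h], dim=1)),
                                 self.hidden_ch, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)   # long-term (cell) memory stream
        h = o * torch.tanh(c)           # short-term (hidden) memory stream
        return h, c

class ResidualRefineBlock(nn.Module):
    """Residual block with a skip connection, used here to decode ConvLSTM features."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)         # skip connection

# Toy usage on synthetic data (batch, time, bands, H, W); band count of 6 is assumed.
cell, refine = ConvLSTMCell(6, 32), ResidualRefineBlock(32)
head = nn.Conv2d(32, 6, 1)              # maps decoded features back to image bands
x_seq = torch.rand(4, 5, 6, 64, 64)
h = torch.zeros(4, 32, 64, 64)
c = torch.zeros(4, 32, 64, 64)
for t in range(x_seq.shape[1]):         # process the time series sequentially
    h, c = cell(x_seq[:, t], (h, c))
recon = head(refine(h))                 # reconstructed image for the final date
```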