Continual Semantic Segmentation (CSS) extends static semantic segmentation by introducing new classes incrementally during training. To alleviate catastrophic forgetting in this task, replay methods can be adopted: they construct a memory buffer that stores a small number of samples from previous classes for later replay. However, existing replay approaches in CSS have not thoroughly explored two critical questions: how to find the most suitable memory samples, and how to use them for replay more effectively. Common strategies either select samples randomly or rely on hand-crafted, single-factor criteria that are unlikely to be optimal, and they often replay with conventional training techniques that do not account for the class imbalance caused by limited memory capacity. In this work, we tackle these challenges by introducing a novel memory sample selection method that uses a reinforcement learning framework, with innovative state representations and a dual-stage action scheme, to automatically learn a selection policy. Additionally, we propose an expert mechanism and a dual-phase training method to address the class imbalance issue, making better use of memory samples and thereby improving the effectiveness of replay training. Combining the proposed automatic sample selection and memory utilization methods, we develop a replay-based pipeline for CSS. Extensive experiments on the Pascal VOC 2012 and ADE20K datasets demonstrate the effectiveness of our approach, which achieves state-of-the-art (SOTA) performance and significantly outperforms previous methods.