Flow meters are among the most essential sensors in industrial development, energy measurement, and environmental protection. Monitoring flow meter performance helps detect anomalies early and enables timely corrective action for critical industrial equipment in harsh operating environments. However, flow meter diagnostic models are prone to overfitting and low accuracy when trained on class-imbalanced, small-sample data. To address these problems, this paper proposes a reinforcement learning Mahalanobis Taguchi system (RLMTS) model consisting of three modules: Mahalanobis space (MS) construction, threshold determination, and sample classification. In the MS construction module, an initial MS is built by selecting variables through orthogonal array design and signal-to-noise ratio analysis; reinforcement learning is then introduced to refine the MS adaptively, and the refined MS is verified using the Mahalanobis distance. In the threshold determination module, a neural network algorithm replaces the traditional quality loss function to determine the optimal threshold. In the sample classification module, unknown samples are diagnosed using the validated MS and their calculated Mahalanobis distances. Experimental results show that the proposed RLMTS is suitable for flow meter fault diagnosis under different class-imbalance ratios and different small sample sizes, and that it achieves better diagnostic performance, stronger robustness, and broader applicability than 19 benchmark diagnosis models. The use of RLMTS therefore helps ensure stable operation of flow meters, contributing to energy savings and environmental protection.
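
For orientation, the following is a minimal sketch of the classical Mahalanobis distance step that the MS construction and sample classification modules build on. It is not the RLMTS model itself: it omits the orthogonal array and signal-to-noise variable selection, the reinforcement learning refinement, and the neural network threshold. The function names, the synthetic data, and the fixed threshold value are all assumptions made for illustration.

```python
import numpy as np

def build_mahalanobis_space(normal):
    """Summarize the normal (healthy) group: mean, std, inverse correlation matrix."""
    normal = np.asarray(normal, dtype=float)
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    z = (normal - mean) / std                # standardize each variable
    corr = np.corrcoef(z, rowvar=False)      # correlation matrix of the normal group
    return mean, std, np.linalg.pinv(corr)   # pseudo-inverse guards against singularity

def mahalanobis_distance(samples, space):
    """Scaled squared Mahalanobis distance, divided by the number of variables k."""
    mean, std, corr_inv = space
    z = (np.atleast_2d(np.asarray(samples, dtype=float)) - mean) / std
    k = z.shape[1]
    # Row-wise quadratic form z_i^T C^{-1} z_i, then scale by k
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k

def classify(samples, space, threshold):
    """Flag a sample as faulty when its distance exceeds the threshold."""
    return mahalanobis_distance(samples, space) > threshold

# Synthetic stand-in data (assumption): 200 normal samples, 5 shifted "faulty" samples
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 4))
faulty = rng.normal(6.0, 1.0, size=(5, 4))

space = build_mahalanobis_space(normal)
threshold = 3.0  # hypothetical fixed value; the paper determines it with a neural network

print("mean distance of normal group:", mahalanobis_distance(normal, space).mean())
print("faulty flags:", classify(faulty, space, threshold))
```

With this scaling, the normal group has an average distance close to 1, so the threshold can be read as a multiple of typical normal behaviour. Only the way the variables and the threshold are chosen would change in a fuller implementation.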