This paper proposes a fault classification and feature extraction approach for wind turbines that combines a multivariate statistical technique with a machine learning algorithm. Because the probability density distributions (PDDs) of the monitored variables can reveal the inner correlations among them, the dominant factors behind a failure are identified by comparing the PDDs of each variable under healthy and unhealthy scenarios. The selected variables are then used for fault feature extraction with a kernel support vector machine (KSVM). The presented algorithms are implemented and assessed on supervisory control and data acquisition (SCADA) data acquired from an operational wind farm. The results show that features relating specifically to the faults are extracted, allowing different wind turbine faults to be identified and analysed.
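As an illustration of the variable selection step, the following is a minimal sketch of how PDDs under healthy and unhealthy scenarios might be compared. It is not the procedure used in this paper: the histogram density estimate, the total variation distance, the 0.5 threshold, the variable names and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def pdd_distance(healthy, faulty, bins=30):
    """Total variation distance between two histogram density estimates.

    Assumption: the abstract does not specify a comparison metric, so the
    total variation distance is used here purely for illustration.
    """
    lo = min(healthy.min(), faulty.min())
    hi = max(healthy.max(), faulty.max())
    edges = np.linspace(lo, hi, bins + 1)  # shared bins for both scenarios
    p, _ = np.histogram(healthy, bins=edges)
    q, _ = np.histogram(faulty, bins=edges)
    p = p / p.sum()  # normalise counts to probabilities
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()  # 0 = identical, 1 = disjoint


def rank_variables(X_healthy, X_faulty, names, bins=30):
    """Rank variables by how much their PDD differs between scenarios."""
    scores = {
        name: pdd_distance(X_healthy[:, j], X_faulty[:, j], bins)
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic stand-in for SCADA data (hypothetical variable names and values).
rng = np.random.default_rng(0)
names = ["gearbox_temp", "rotor_speed", "wind_speed"]
X_healthy = rng.normal(loc=[60.0, 15.0, 8.0], scale=[3.0, 1.0, 2.0], size=(2000, 3))
# In this toy fault, only the gearbox temperature distribution shifts.
X_faulty = rng.normal(loc=[75.0, 15.0, 8.0], scale=[3.0, 1.0, 2.0], size=(2000, 3))

ranking = rank_variables(X_healthy, X_faulty, names)
# Keep variables whose PDD differs strongly (threshold is an assumption).
selected = [name for name, score in ranking if score > 0.5]
print(ranking)
print(selected)
```

In a pipeline of this shape, the columns named in `selected` would then be passed to a kernel SVM, for instance scikit-learn's `SVC(kernel="rbf")`, to separate the fault classes.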