Distance metric learning is a fundamental task in data mining, and is known to enhance the performance of various distance-based algorithms. Here, we consider stochastic training data in which repeated feature vectors can belong to different classes. Our primary motivation for this arises from the field of stochastic simulation. Storing the dynamic trajectory of the system state within a simulation model can support real-time predictions of stochastic performance measures. However, the inherent randomness within the system combined with the recurring nature of the system state leads to data of the type considered, on which existing methods of metric learning are known to struggle. We present a stochastic version of the popular Neighbourhood Components Analysis. We demonstrate its behaviour using simulation examples, and reveal improvements over Neighbourhood Components Analysis when used for nearest neighbour classification of stochastic data.
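As a rough illustration of the kind of data described above, and not of the authors' stochastic method, the following minimal Python sketch evaluates the objective of standard Neighbourhood Components Analysis on a hypothetical toy set in which the same feature vector appears with different class labels. The function name `nca_objective`, the toy data, and the two linear maps are assumptions made purely for illustration.

```python
import math

def nca_objective(A, X, y):
    """Standard NCA objective: expected number of points classified correctly
    when each point picks a neighbour at random via a softmax over distances."""
    # Project every point with the linear map A (given as a list of rows).
    Z = [[sum(a * x for a, x in zip(row, xi)) for row in A] for xi in X]
    total = 0.0
    for i, zi in enumerate(Z):
        # Softmax weights over all other points; a point never picks itself.
        w = [0.0 if j == i
             else math.exp(-sum((p - q) ** 2 for p, q in zip(zi, zj)))
             for j, zj in enumerate(Z)]
        denom = sum(w)
        if denom > 0:
            # Probability that point i picks a neighbour of its own class.
            total += sum(wj for wj, yj in zip(w, y) if yj == y[i]) / denom
    return total

# Hypothetical toy data: two recurring states, each observed four times,
# with one observation per state carrying the minority label.
X = [[0.0]] * 4 + [[1.0]] * 4
y = [0, 0, 0, 1, 1, 1, 1, 0]

collapsed = nca_objective([[0.0]], X, y)   # map that merges both states
separated = nca_objective([[10.0]], X, y)  # map that pushes the states apart

print(f"collapsed map: {collapsed:.3f} of {len(X)}")
print(f"separated map: {separated:.3f} of {len(X)}")
```

On this toy set the separated map scores about 4 of 8, against about 3.43 for the collapsed map. Points that share a feature vector stay at distance zero under any linear map, so their conflicting labels limit how far the standard objective can rise here, even though the two states are perfectly separated.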
| Field | Value |
|---|---|
| Date made available | 2023 |
| Publisher | Code Ocean |