This paper addresses object tracking in surveillance video sequences using a recently proposed image distance measure based on structural similarity. Multimodal surveillance videos pose specific challenges to tracking algorithms, for example low or variable light conditions and the presence of spurious or camouflaged objects. These factors often cause undesired luminance and contrast variations in videos produced by infrared sensors (due to varying thermal conditions) and by visible sensors (e.g., when the object enters shadowy areas). Commonly used colour and edge histogram-based trackers often fail in such conditions. In contrast, the structural similarity measure reflects the distance between two video frames by jointly comparing their luminance, contrast and spatial characteristics, and it is sensitive to relative rather than absolute changes in the video frame. In this work, we show that the performance of a particle filter tracker improves significantly when the structural similarity-based distance is used instead of the conventional Bhattacharyya histogram-based distance. An extensive evaluation of the proposed algorithm is presented, together with comparisons with colour, edge and mean-shift trackers, using real-world surveillance video sequences from multimodal (infrared and visible) cameras.
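To make the two distances contrasted above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single global structural similarity index computed over a whole grey-level patch with the commonly used stabilising constants (k1 = 0.01, k2 = 0.03), and a Bhattacharyya distance between normalised histograms. The function names and the mapping from similarity to distance, (1 - SSIM) / 2, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Global structural similarity index between two equally sized patches.

    Compares luminance (means), contrast (variances) and structure
    (covariance) jointly. Returns a value in [-1, 1]; 1 means identical.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Small constants that stabilise the ratio when means/variances are near zero.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_distance(x, y):
    """Assumed mapping from similarity to a distance in [0, 1] (illustrative)."""
    return (1.0 - ssim_global(x, y)) / 2.0


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms with non-negative bins."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    p = p / p.sum()  # normalise to unit mass
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))  # guard against rounding below zero
```

In a particle filter, either distance could be converted into a particle weight, for instance through a Gaussian-shaped likelihood of the distance between the reference patch and the candidate patch; that weighting step is omitted here because its exact form is not given in this section.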