Visual saliency estimation through manifold learning

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Publication date: 07/2012
Host publication: The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12
Number of pages: 7
Original language: English


Saliency detection is an attractive way for robotic vision systems to find the most noticeable objects in a scene. In this paper, a robust manifold-based saliency estimation method is developed to help capture the most salient objects in front of robotic eyes, namely cameras. In the proposed approach, an image is considered as a manifold of visual signals (stimuli) spread over a connected grid, and local visual stimuli are compared against the global image variation to model visual saliency. With this model, manifold learning is applied to minimize the local variation while preserving the global contrast, turning the RGB image into a multi-channel image. After this projection, histogram-based contrast is computed to model saliency in each channel of the projected image, and mutual information is used to evaluate each single-channel saliency map against prior knowledge, providing cues for fusing the channels. In the last step, the fusion procedure combines all single-channel saliency maps according to their mutual information scores and generates the final saliency map. In our experiments, the proposed method is evaluated on one of the largest publicly available image datasets. The results show that our algorithm consistently outperforms state-of-the-art unsupervised saliency detection methods, yielding higher precision and better recall. Furthermore, the proposed method is tested on a video dataset in which a moving camera tries to keep up with a walking person, the salient object in the video sequence. Our experimental results show that the proposed approach can successfully accomplish this task, revealing its potential for similar robotic applications.
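The last two stages described in the abstract (per-channel histogram-based contrast, then fusion weighted by mutual information against a prior) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the manifold-learning projection is skipped entirely and the input is simply assumed to be a list of already-projected channels. The function names, the bin count, and the centre-biased Gaussian used as "prior knowledge" are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def histogram_contrast(channel, bins=16):
    """Per-pixel saliency from global histogram contrast (illustrative only).

    A pixel's saliency is the frequency-weighted distance between its
    intensity bin and every other bin, so rare, distinct values score high.
    """
    channel = np.asarray(channel, dtype=float)
    lo, hi = channel.min(), channel.max()
    if hi == lo:  # flat channel carries no contrast
        return np.zeros(channel.shape, dtype=float)
    # Quantize intensities into bin indices 0..bins-1.
    idx = np.minimum(((channel - lo) / (hi - lo) * bins).astype(int), bins - 1)
    freq = np.bincount(idx.ravel(), minlength=bins) / idx.size
    centers = np.arange(bins, dtype=float)
    # Contrast of bin b = sum_k freq[k] * |b - k|.
    bin_saliency = np.abs(centers[:, None] - centers[None, :]) @ freq
    saliency = bin_saliency[idx]
    return saliency / saliency.max()


def mutual_information(a, b, bins=16):
    """Mutual information (in nats) between two maps via a joint histogram."""
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def center_prior(shape, sigma_frac=0.25):
    """Hypothetical prior: a centre-biased Gaussian (an assumption, not from the paper)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = sigma_frac * min(h, w)
    return np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))


def fuse_saliency(channels, prior, bins=16):
    """Combine single-channel saliency maps, weighted by their MI with the prior."""
    maps = [histogram_contrast(c, bins) for c in channels]
    scores = np.array([mutual_information(m, prior, bins) for m in maps])
    if scores.sum() > 0:
        weights = scores / scores.sum()
    else:  # no channel agrees with the prior: fall back to equal weights
        weights = np.full(len(maps), 1.0 / len(maps))
    fused = sum(w * m for w, m in zip(weights, maps))
    return fused, weights
```

Under these assumptions, a channel whose saliency map agrees with the prior (for example, one containing a distinct central object) receives a larger fusion weight than a channel of unstructured noise, which is the role the abstract assigns to the mutual information score.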