
Visual saliency estimation through manifold learning

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Visual saliency estimation through manifold learning. / Jiang, Richard; Crookes, Danny.
The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12. ACM, 2012. p. 2003-2009.

Harvard

Jiang, R & Crookes, D 2012, Visual saliency estimation through manifold learning. in The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12. ACM, pp. 2003-2009. <https://dl.acm.org/citation.cfm?id=2901011>

APA

Jiang, R., & Crookes, D. (2012). Visual saliency estimation through manifold learning. In The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12 (pp. 2003-2009). ACM. https://dl.acm.org/citation.cfm?id=2901011

Vancouver

Jiang R, Crookes D. Visual saliency estimation through manifold learning. In The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12. ACM. 2012. p. 2003-2009

Author

Jiang, Richard ; Crookes, Danny. / Visual saliency estimation through manifold learning. The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12. ACM, 2012. pp. 2003-2009

Bibtex

@inproceedings{9a33529c7aab40d2945a147b5c2a15dd,
title = "Visual saliency estimation through manifold learning",
abstract = "Saliency detection has been a desirable way for robotic vision to find the most noticeable objects in a scene. In this paper, a robust manifold-based saliency estimation method has been developed to help capture the most salient objects in front of robotic eyes, namely cameras. In the proposed approach, an image is considered as a manifold of visual signals (stimuli) spreading over a connected grid, and local visual stimuli are compared against the global image variation to model the visual saliency. With this model, manifold learning is then applied to minimize the local variation while keeping the global contrast, and turns the RGB image into a multi-channel image. After the projection through manifold learning, histogram-based contrast is then computed for saliency modeling of all channels of the projected images, and mutual information is introduced to evaluate each single-channel saliency map against prior knowledge to provide cues for the fusion of multiple channels. In the last step, the fusion procedure combines all single-channel saliency maps according to their mutual information score, and generates the final saliency map. In our experiment, the proposed method is evaluated using one of the largest publicly available image datasets. The experimental results demonstrate that our algorithm consistently outperforms the state-of-the-art unsupervised saliency detection methods, yielding higher precision and better recall rates. Furthermore, the proposed method is tested on a video-type test dataset where a moving camera is trying to catch up with the walking person---a salient object in the video sequence. Our experimental results show that the proposed approach can successfully accomplish this task, revealing its potential use for similar robotic applications.",
author = "Richard Jiang and Danny Crookes",
year = "2012",
month = jul,
language = "English",
pages = "2003--2009",
booktitle = "The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12",
publisher = "ACM",
url = "https://dl.acm.org/citation.cfm?id=2901011",
}

RIS

TY - GEN

T1 - Visual saliency estimation through manifold learning

AU - Jiang, Richard

AU - Crookes, Danny

PY - 2012/7

Y1 - 2012/7

N2 - Saliency detection has been a desirable way for robotic vision to find the most noticeable objects in a scene. In this paper, a robust manifold-based saliency estimation method has been developed to help capture the most salient objects in front of robotic eyes, namely cameras. In the proposed approach, an image is considered as a manifold of visual signals (stimuli) spreading over a connected grid, and local visual stimuli are compared against the global image variation to model the visual saliency. With this model, manifold learning is then applied to minimize the local variation while keeping the global contrast, and turns the RGB image into a multi-channel image. After the projection through manifold learning, histogram-based contrast is then computed for saliency modeling of all channels of the projected images, and mutual information is introduced to evaluate each single-channel saliency map against prior knowledge to provide cues for the fusion of multiple channels. In the last step, the fusion procedure combines all single-channel saliency maps according to their mutual information score, and generates the final saliency map. In our experiment, the proposed method is evaluated using one of the largest publicly available image datasets. The experimental results demonstrate that our algorithm consistently outperforms the state-of-the-art unsupervised saliency detection methods, yielding higher precision and better recall rates. Furthermore, the proposed method is tested on a video-type test dataset where a moving camera is trying to catch up with the walking person---a salient object in the video sequence. Our experimental results show that the proposed approach can successfully accomplish this task, revealing its potential use for similar robotic applications.

AB - Saliency detection has been a desirable way for robotic vision to find the most noticeable objects in a scene. In this paper, a robust manifold-based saliency estimation method has been developed to help capture the most salient objects in front of robotic eyes, namely cameras. In the proposed approach, an image is considered as a manifold of visual signals (stimuli) spreading over a connected grid, and local visual stimuli are compared against the global image variation to model the visual saliency. With this model, manifold learning is then applied to minimize the local variation while keeping the global contrast, and turns the RGB image into a multi-channel image. After the projection through manifold learning, histogram-based contrast is then computed for saliency modeling of all channels of the projected images, and mutual information is introduced to evaluate each single-channel saliency map against prior knowledge to provide cues for the fusion of multiple channels. In the last step, the fusion procedure combines all single-channel saliency maps according to their mutual information score, and generates the final saliency map. In our experiment, the proposed method is evaluated using one of the largest publicly available image datasets. The experimental results demonstrate that our algorithm consistently outperforms the state-of-the-art unsupervised saliency detection methods, yielding higher precision and better recall rates. Furthermore, the proposed method is tested on a video-type test dataset where a moving camera is trying to catch up with the walking person---a salient object in the video sequence. Our experimental results show that the proposed approach can successfully accomplish this task, revealing its potential use for similar robotic applications.

M3 - Conference contribution/Paper

SP - 2003

EP - 2009

BT - The 26th AAAI Conference on Artificial Intelligence (AAAI-12), 1/07/12

PB - ACM

UR - https://dl.acm.org/citation.cfm?id=2901011

ER -
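
The fusion pipeline described in the abstract (histogram-based contrast per channel, then mutual-information-weighted fusion against a prior) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the manifold-learning projection step is omitted (the sketch operates directly on the input channels), and all function names and parameters (`channel_saliency`, `fuse_saliency`, `bins=16`) are assumptions for the example.

```python
import numpy as np

def channel_saliency(channel, bins=16):
    """Histogram-based global contrast: a pixel's saliency is the sum of
    distances from its bin to every other bin, weighted by how often the
    other bins occur. Rare intensities therefore score high."""
    edges = np.linspace(channel.min(), channel.max(), bins + 1)[1:-1]
    q = np.digitize(channel, edges)                      # bin index per pixel
    hist = np.bincount(q.ravel(), minlength=bins) / q.size
    centers = np.arange(bins)
    contrast = np.abs(centers[:, None] - centers[None, :]) @ hist
    sal = contrast[q]
    return (sal - sal.min()) / (np.ptp(sal) + 1e-12)     # normalize to [0, 1]

def mutual_information(a, b, bins=16):
    """MI between two maps, estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum())

def fuse_saliency(image, prior):
    """Weight each channel's saliency map by its MI with a prior map,
    then combine the maps into a single fused saliency map."""
    maps = [channel_saliency(image[..., c]) for c in range(image.shape[-1])]
    weights = np.array([mutual_information(m, prior) for m in maps])
    weights = weights / (weights.sum() + 1e-12)
    return sum(w * m for w, m in zip(weights, maps))
```

On a synthetic image with a bright square on a dark background, the fused map assigns higher saliency inside the square than outside, which is the qualitative behavior the abstract describes.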