Class-Agnostic Object Counting with Text-to-Image Diffusion Model

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Class-Agnostic Object Counting with Text-to-Image Diffusion Model. / Hui, Xiaofei; Wu, Qian; Rahmani, Hossein et al.
Computer Vision -- ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX. ed. / Aleš Leonardis; Elisa Ricci; Stefan Roth; Olga Russakovsky; Torsten Sattler; Gül Varol. Cham: Springer, 2024. p. 1-18 (Lecture Notes in Computer Science ; Vol. 15127).

Harvard

Hui, X, Wu, Q, Rahmani, H & Liu, J 2024, Class-Agnostic Object Counting with Text-to-Image Diffusion Model. in A Leonardis, E Ricci, S Roth, O Russakovsky, T Sattler & G Varol (eds), Computer Vision -- ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX. Lecture Notes in Computer Science, vol. 15127, Springer, Cham, pp. 1-18. https://doi.org/10.1007/978-3-031-72890-7_1

APA

Hui, X., Wu, Q., Rahmani, H., & Liu, J. (2024). Class-Agnostic Object Counting with Text-to-Image Diffusion Model. In A. Leonardis, E. Ricci, S. Roth, O. Russakovsky, T. Sattler, & G. Varol (Eds.), Computer Vision -- ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX (pp. 1-18). (Lecture Notes in Computer Science ; Vol. 15127). Springer. https://doi.org/10.1007/978-3-031-72890-7_1

Vancouver

Hui X, Wu Q, Rahmani H, Liu J. Class-Agnostic Object Counting with Text-to-Image Diffusion Model. In Leonardis A, Ricci E, Roth S, Russakovsky O, Sattler T, Varol G, editors, Computer Vision -- ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX. Cham: Springer. 2024. p. 1-18. (Lecture Notes in Computer Science). Epub 2024 Nov 3. doi: 10.1007/978-3-031-72890-7_1

Author

Hui, Xiaofei ; Wu, Qian ; Rahmani, Hossein et al. / Class-Agnostic Object Counting with Text-to-Image Diffusion Model. Computer Vision -- ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX. editor / Aleš Leonardis ; Elisa Ricci ; Stefan Roth ; Olga Russakovsky ; Torsten Sattler ; Gül Varol. Cham : Springer, 2024. pp. 1-18 (Lecture Notes in Computer Science).

Bibtex

@inproceedings{93c9bfc01ccb46508fc5a6348767f5b2,
title = "Class-Agnostic Object Counting with Text-to-Image Diffusion Model",
abstract = "Class-agnostic object counting aims to count objects of arbitrary classes with limited information (e.g., a few exemplars or the class names) provided. It requires the model to effectively acquire the characteristics of the target objects and accurately perform counting, which can be challenging. In this work, inspired by that text-to-image diffusion models hold rich knowledge and comprehensive understanding of real-world objects, we propose to leverage the pre-trained text-to-image diffusion model to facilitate class-agnostic object counting. Specifically, we propose a novel framework named CountDiff with careful designs, leveraging the pre-trained diffusion model{\textquoteright}s comprehensive understanding of image contents to perform class-agnostic object counting. The experiments show the effectiveness of CountDiff on both few-shot setting with exemplars provided and zero-shot setting with class names provided.",
author = "Xiaofei Hui and Qian Wu and Hossein Rahmani and Jun Liu",
year = "2024",
month = dec,
day = "7",
doi = "10.1007/978-3-031-72890-7_1",
language = "English",
isbn = "9783031728891",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "1--18",
editor = "Ale{\v s} Leonardis and Elisa Ricci and Stefan Roth and Olga Russakovsky and Torsten Sattler and G{\"u}l Varol",
booktitle = "Computer Vision -- ECCV 2024",

}

RIS

TY - GEN

T1 - Class-Agnostic Object Counting with Text-to-Image Diffusion Model

AU - Hui, Xiaofei

AU - Wu, Qian

AU - Rahmani, Hossein

AU - Liu, Jun

PY - 2024/12/7

Y1 - 2024/12/7

AB - Class-agnostic object counting aims to count objects of arbitrary classes with limited information (e.g., a few exemplars or the class names) provided. It requires the model to effectively acquire the characteristics of the target objects and accurately perform counting, which can be challenging. In this work, inspired by that text-to-image diffusion models hold rich knowledge and comprehensive understanding of real-world objects, we propose to leverage the pre-trained text-to-image diffusion model to facilitate class-agnostic object counting. Specifically, we propose a novel framework named CountDiff with careful designs, leveraging the pre-trained diffusion model’s comprehensive understanding of image contents to perform class-agnostic object counting. The experiments show the effectiveness of CountDiff on both few-shot setting with exemplars provided and zero-shot setting with class names provided.

U2 - 10.1007/978-3-031-72890-7_1

DO - 10.1007/978-3-031-72890-7_1

M3 - Conference contribution/Paper

SN - 9783031728891

T3 - Lecture Notes in Computer Science

SP - 1

EP - 18

BT - Computer Vision -- ECCV 2024

A2 - Leonardis, Aleš

A2 - Ricci, Elisa

A2 - Roth, Stefan

A2 - Russakovsky, Olga

A2 - Sattler, Torsten

A2 - Varol, Gül

PB - Springer

CY - Cham

ER -