Class-Agnostic Object Counting with Text-to-Image Diffusion Model

Computing and Communications

Associated organisational unit

Artificial Intelligence

Text available via DOI:

https://doi.org/10.1007/978-3-031-72890-7_1
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Publication date	7/12/2024
Host publication	Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXIX
Editors	Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
Place of Publication	Cham
Publisher	Springer
Pages	1-18
Number of pages	18
ISBN (electronic)	9783031728907
ISBN (print)	9783031728891
Original language	English

Publication series

Name	Lecture Notes in Computer Science
Publisher	Springer
Volume	15127
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

Class-agnostic object counting aims to count objects of arbitrary classes given only limited information (e.g., a few exemplars or the class names). It requires the model to effectively acquire the characteristics of the target objects and accurately perform counting, which can be challenging. In this work, inspired by the observation that text-to-image diffusion models hold rich knowledge and a comprehensive understanding of real-world objects, we propose to leverage a pre-trained text-to-image diffusion model to facilitate class-agnostic object counting. Specifically, we propose a novel, carefully designed framework named CountDiff, which exploits the pre-trained diffusion model’s comprehensive understanding of image contents to perform class-agnostic object counting. Experiments show the effectiveness of CountDiff in both the few-shot setting, with exemplars provided, and the zero-shot setting, with class names provided.
