LLaFS++ - Research Portal | Lancaster University

Computing and Communications

Associated organisational unit

Insight

Electronic data

llafs_plus_plus
Accepted author manuscript, 6.86 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Text available via DOI:

https://doi.org/10.1109/tpami.2025.3573609
Final published version

LLaFS++: Few-Shot Image Segmentation With Large Language Models

Research output: Contribution to Journal/Magazine › Journal article › peer-review

E-pub ahead of print

Standard

LLaFS++: Few-Shot Image Segmentation With Large Language Models. / Zhu, Lanyun; Chen, Tianrun; Ji, Deyi et al.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 47, No. 9, 30.09.2025, p. 7715-7732.

Harvard

Zhu, L, Chen, T, Ji, D, Xu, P, Ye, J & Liu, J 2025, 'LLaFS++: Few-Shot Image Segmentation With Large Language Models', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 9, pp. 7715-7732. https://doi.org/10.1109/tpami.2025.3573609

APA

Zhu, L., Chen, T., Ji, D., Xu, P., Ye, J., & Liu, J. (2025). LLaFS++: Few-Shot Image Segmentation With Large Language Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(9), 7715-7732. Advance online publication. https://doi.org/10.1109/tpami.2025.3573609

Vancouver

Zhu L, Chen T, Ji D, Xu P, Ye J, Liu J. LLaFS++: Few-Shot Image Segmentation With Large Language Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2025 Sept 30;47(9):7715-7732. Epub 2025 May 26. doi: 10.1109/tpami.2025.3573609

Author

Zhu, Lanyun ; Chen, Tianrun ; Ji, Deyi et al. / LLaFS++ : Few-Shot Image Segmentation With Large Language Models. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. 2025 ; Vol. 47, No. 9. pp. 7715-7732.

Bibtex

@article{680e13ff2cf7420b8a6aa173e38002d6,

title = "LLaFS++: Few-Shot Image Segmentation With Large Language Models",

abstract = "Despite the rapid advancements in few-shot segmentation (FSS), most of existing methods in this domain are hampered by their reliance on the limited and biased information from only a small number of labeled samples. This limitation inherently restricts their capability to achieve sufficiently high levels of performance. To address this issue, this paper proposes a pioneering framework named LLaFS++, which, for the first time, applies large language models (LLMs) into FSS and achieves notable success. LLaFS++ leverages the extensive prior knowledge embedded by LLMs to guide the segmentation process, effectively compensating for the limited information contained in the few-shot labeled samples and thereby achieving superior results. To enhance the effectiveness of the text-based LLMs in FSS scenarios, we present several innovative and task-specific designs within the LLaFS++ framework. Specifically, we introduce an input instruction that allows the LLM to directly produce segmentation results represented as polygons, and propose a region-attribute corresponding table to simulate the human visual system and provide multi-modal guidance. We also synthesize pseudo samples and use curriculum learning for pretraining to augment data and achieve better optimization, and propose a novel inference method to mitigate potential oversegmentation hallucinations caused by the regional guidance information. Incorporating these designs, LLaFS++ constitutes an effective framework that achieves state-of-the-art results on multiple datasets including PASCAL-5$^i$, COCO-20$^i$, and FSS-1000. Our superior performance showcases the remarkable potential of applying LLMs to process few-shot vision tasks.",

author = "Lanyun Zhu and Tianrun Chen and Deyi Ji and Peng Xu and Jieping Ye and Jun Liu",

year = "2025",

month = may,

day = "26",

doi = "10.1109/tpami.2025.3573609",

language = "English",

volume = "47",

pages = "7715--7732",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE Computer Society",

number = "9",

}

RIS

TY - JOUR

T1 - LLaFS++

T2 - Few-Shot Image Segmentation With Large Language Models

AU - Zhu, Lanyun

AU - Chen, Tianrun

AU - Ji, Deyi

AU - Xu, Peng

AU - Ye, Jieping

AU - Liu, Jun

PY - 2025/5/26

Y1 - 2025/5/26

N2 - Despite the rapid advancements in few-shot segmentation (FSS), most of existing methods in this domain are hampered by their reliance on the limited and biased information from only a small number of labeled samples. This limitation inherently restricts their capability to achieve sufficiently high levels of performance. To address this issue, this paper proposes a pioneering framework named LLaFS++, which, for the first time, applies large language models (LLMs) into FSS and achieves notable success. LLaFS++ leverages the extensive prior knowledge embedded by LLMs to guide the segmentation process, effectively compensating for the limited information contained in the few-shot labeled samples and thereby achieving superior results. To enhance the effectiveness of the text-based LLMs in FSS scenarios, we present several innovative and task-specific designs within the LLaFS++ framework. Specifically, we introduce an input instruction that allows the LLM to directly produce segmentation results represented as polygons, and propose a region-attribute corresponding table to simulate the human visual system and provide multi-modal guidance. We also synthesize pseudo samples and use curriculum learning for pretraining to augment data and achieve better optimization, and propose a novel inference method to mitigate potential oversegmentation hallucinations caused by the regional guidance information. Incorporating these designs, LLaFS++ constitutes an effective framework that achieves state-of-the-art results on multiple datasets including PASCAL-5^i, COCO-20^i, and FSS-1000. Our superior performance showcases the remarkable potential of applying LLMs to process few-shot vision tasks.

AB - Despite the rapid advancements in few-shot segmentation (FSS), most of existing methods in this domain are hampered by their reliance on the limited and biased information from only a small number of labeled samples. This limitation inherently restricts their capability to achieve sufficiently high levels of performance. To address this issue, this paper proposes a pioneering framework named LLaFS++, which, for the first time, applies large language models (LLMs) into FSS and achieves notable success. LLaFS++ leverages the extensive prior knowledge embedded by LLMs to guide the segmentation process, effectively compensating for the limited information contained in the few-shot labeled samples and thereby achieving superior results. To enhance the effectiveness of the text-based LLMs in FSS scenarios, we present several innovative and task-specific designs within the LLaFS++ framework. Specifically, we introduce an input instruction that allows the LLM to directly produce segmentation results represented as polygons, and propose a region-attribute corresponding table to simulate the human visual system and provide multi-modal guidance. We also synthesize pseudo samples and use curriculum learning for pretraining to augment data and achieve better optimization, and propose a novel inference method to mitigate potential oversegmentation hallucinations caused by the regional guidance information. Incorporating these designs, LLaFS++ constitutes an effective framework that achieves state-of-the-art results on multiple datasets including PASCAL-5^i, COCO-20^i, and FSS-1000. Our superior performance showcases the remarkable potential of applying LLMs to process few-shot vision tasks.

U2 - 10.1109/tpami.2025.3573609

DO - 10.1109/tpami.2025.3573609

M3 - Journal article

VL - 47

SP - 7715

EP - 7732

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 9

ER -
