Rights statement: ©2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 4.19 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Attribute-Guided Network for Cross-Modal Zero-Shot Hashing
AU - Ji, Zhong
AU - Sun, Yuxin
AU - Yu, Yunlong
AU - Pang, Yanwei
AU - Han, Jungong
N1 - ©2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Zero-shot hashing (ZSH) aims at learning a hashing model that is trained only on instances from seen categories but can generalize well to those of unseen categories. Typically, this is achieved by utilizing a semantic embedding space to transfer knowledge from the seen domain to the unseen domain. Existing efforts mainly focus on single-modal retrieval tasks, especially image-based image retrieval (IBIR). However, as a highlighted research topic in the field of hashing, cross-modal retrieval is more common in real-world applications. To address the cross-modal ZSH (CMZSH) retrieval task, we propose a novel attribute-guided network (AgNet), which can perform not only IBIR but also text-based image retrieval (TBIR). In particular, AgNet aligns data from different modalities into a semantically rich attribute space, which bridges the gap caused by modality heterogeneity and the zero-shot setting. We also design an effective strategy that exploits the attributes to guide the generation of hash codes for images and text within the same network. Extensive experimental results on three benchmark data sets (AwA, SUN, and ImageNet) demonstrate the superiority of AgNet on both cross-modal and single-modal zero-shot image retrieval tasks.
AB - Zero-shot hashing (ZSH) aims at learning a hashing model that is trained only on instances from seen categories but can generalize well to those of unseen categories. Typically, this is achieved by utilizing a semantic embedding space to transfer knowledge from the seen domain to the unseen domain. Existing efforts mainly focus on single-modal retrieval tasks, especially image-based image retrieval (IBIR). However, as a highlighted research topic in the field of hashing, cross-modal retrieval is more common in real-world applications. To address the cross-modal ZSH (CMZSH) retrieval task, we propose a novel attribute-guided network (AgNet), which can perform not only IBIR but also text-based image retrieval (TBIR). In particular, AgNet aligns data from different modalities into a semantically rich attribute space, which bridges the gap caused by modality heterogeneity and the zero-shot setting. We also design an effective strategy that exploits the attributes to guide the generation of hash codes for images and text within the same network. Extensive experimental results on three benchmark data sets (AwA, SUN, and ImageNet) demonstrate the superiority of AgNet on both cross-modal and single-modal zero-shot image retrieval tasks.
U2 - 10.1109/TNNLS.2019.2904991
DO - 10.1109/TNNLS.2019.2904991
M3 - Journal article
VL - 31
SP - 321
EP - 330
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
SN - 2162-237X
IS - 1
ER -