Robust Quantization for General Similarity Search

Computing and Communications

Associated organisational unit

Data Science

Electronic data

TIP-GUO-FINAL
Rights statement: ©2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 570 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Text available via DOI:

https://doi.org/10.1109/TIP.2017.2766445
Final published version

View graph of relations

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Robust Quantization for General Similarity Search. / Guo, Yuchen; Ding, Guiguang; Han, Jungong.
In: IEEE Transactions on Image Processing, Vol. 27, No. 2, 01.02.2018, p. 949-963.

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Harvard

Guo, Y, Ding, G & Han, J 2018, 'Robust Quantization for General Similarity Search', IEEE Transactions on Image Processing, vol. 27, no. 2, pp. 949-963. https://doi.org/10.1109/TIP.2017.2766445

APA

Guo, Y., Ding, G., & Han, J. (2018). Robust Quantization for General Similarity Search. IEEE Transactions on Image Processing, 27(2), 949-963. https://doi.org/10.1109/TIP.2017.2766445

Vancouver

Guo Y, Ding G, Han J. Robust Quantization for General Similarity Search. IEEE Transactions on Image Processing. 2018 Feb 1;27(2):949-963. Epub 2017 Oct 25. doi: 10.1109/TIP.2017.2766445

Author

Guo, Yuchen ; Ding, Guiguang ; Han, Jungong. / Robust Quantization for General Similarity Search. In: IEEE Transactions on Image Processing. 2018 ; Vol. 27, No. 2. pp. 949-963.

Bibtex

@article{c757ae8d067349cabfc88e138cacbba7,

title = "Robust Quantization for General Similarity Search",

abstract = "The recent years have witnessed the emerging of vector quantization (VQ) techniques for efficient similarity search. VQ partitions the feature space into a set of codewords and encodes data points as integer indices using the codewords. Then the distance between data points can be efficiently approximated by simple memory lookup operations. By the compact quantization, the storage cost and searching complexity are significantly reduced, thereby facilitating efficient large-scale similarity search. However, the performance of several celebrated VQ approaches degrades significantly when dealing with noisy data. Additionally, it can barely facilitate a wide range of applications as the distortion measurement only limits to ℓ2 norm. To address the shortcomings of the squared Euclidean (ℓ2,2 norm) loss function employed by the VQ approaches, in this paper, we propose a novel robust and general VQ framework, named RGVQ, to enhance both robustness and generalization of VQ approaches. Specifically, a ℓp,q-norm loss function is proposed to conduct the ℓp-norm similarity search, rather than the ℓ2 norm search, and the q-th order loss is used to enhance the robustness. Despite the fact that changing the loss function to ℓp,q norm makes VQ approaches more robust and generic, it brings us a challenge that a non-smooth and non-convex orthogonality constrained ℓp,q- norm function has to be minimized. To solve this problem, we propose a novel and efficient optimization scheme and specify it to VQ approaches and theoretically prove its convergence. Extensive experiments on benchmark datasets demonstrate that the proposed RGVQ is better than the original VQ for several approaches, especially when searching similarity in noisy data.",

author = "Yuchen Guo and Guiguang Ding and Jungong Han",

note = "{\textcopyright}2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.",

year = "2018",

month = feb,

day = "1",

doi = "10.1109/TIP.2017.2766445",

language = "English",

volume = "27",

pages = "949--963",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "2",

}

RIS

TY - JOUR

T1 - Robust Quantization for General Similarity Search

AU - Guo, Yuchen

AU - Ding, Guiguang

AU - Han, Jungong

N1 - ©2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PY - 2018/2/1

Y1 - 2018/2/1

N2 - The recent years have witnessed the emerging of vector quantization (VQ) techniques for efficient similarity search. VQ partitions the feature space into a set of codewords and encodes data points as integer indices using the codewords. Then the distance between data points can be efficiently approximated by simple memory lookup operations. By the compact quantization, the storage cost and searching complexity are significantly reduced, thereby facilitating efficient large-scale similarity search. However, the performance of several celebrated VQ approaches degrades significantly when dealing with noisy data. Additionally, it can barely facilitate a wide range of applications as the distortion measurement only limits to ℓ2 norm. To address the shortcomings of the squared Euclidean (ℓ2,2 norm) loss function employed by the VQ approaches, in this paper, we propose a novel robust and general VQ framework, named RGVQ, to enhance both robustness and generalization of VQ approaches. Specifically, a ℓp,q-norm loss function is proposed to conduct the ℓp-norm similarity search, rather than the ℓ2 norm search, and the q-th order loss is used to enhance the robustness. Despite the fact that changing the loss function to ℓp,q norm makes VQ approaches more robust and generic, it brings us a challenge that a non-smooth and non-convex orthogonality constrained ℓp,q- norm function has to be minimized. To solve this problem, we propose a novel and efficient optimization scheme and specify it to VQ approaches and theoretically prove its convergence. Extensive experiments on benchmark datasets demonstrate that the proposed RGVQ is better than the original VQ for several approaches, especially when searching similarity in noisy data.

AB - The recent years have witnessed the emerging of vector quantization (VQ) techniques for efficient similarity search. VQ partitions the feature space into a set of codewords and encodes data points as integer indices using the codewords. Then the distance between data points can be efficiently approximated by simple memory lookup operations. By the compact quantization, the storage cost and searching complexity are significantly reduced, thereby facilitating efficient large-scale similarity search. However, the performance of several celebrated VQ approaches degrades significantly when dealing with noisy data. Additionally, it can barely facilitate a wide range of applications as the distortion measurement only limits to ℓ2 norm. To address the shortcomings of the squared Euclidean (ℓ2,2 norm) loss function employed by the VQ approaches, in this paper, we propose a novel robust and general VQ framework, named RGVQ, to enhance both robustness and generalization of VQ approaches. Specifically, a ℓp,q-norm loss function is proposed to conduct the ℓp-norm similarity search, rather than the ℓ2 norm search, and the q-th order loss is used to enhance the robustness. Despite the fact that changing the loss function to ℓp,q norm makes VQ approaches more robust and generic, it brings us a challenge that a non-smooth and non-convex orthogonality constrained ℓp,q- norm function has to be minimized. To solve this problem, we propose a novel and efficient optimization scheme and specify it to VQ approaches and theoretically prove its convergence. Extensive experiments on benchmark datasets demonstrate that the proposed RGVQ is better than the original VQ for several approaches, especially when searching similarity in noisy data.

U2 - 10.1109/TIP.2017.2766445

DO - 10.1109/TIP.2017.2766445

M3 - Journal article

VL - 27

SP - 949

EP - 963

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 2

ER -

Research

Associated organisational unit

Electronic data

Links

Text available via DOI: