Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks. / Zhang, Wei Emma; Sheng, Quan Z.; Tang, Zhejun et al.
SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. New York: Association for Computing Machinery, Inc, 2018. p. 1153-1156.

Harvard

Zhang, WE, Sheng, QZ, Tang, Z & Ruan, W 2018, Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks. in SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. Association for Computing Machinery, Inc, New York, pp. 1153-1156, 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018, Ann Arbor, United States, 8/07/18. https://doi.org/10.1145/3209978.3210110

APA

Zhang, W. E., Sheng, Q. Z., Tang, Z., & Ruan, W. (2018). Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks. In SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (pp. 1153-1156). Association for Computing Machinery, Inc. https://doi.org/10.1145/3209978.3210110

Vancouver

Zhang WE, Sheng QZ, Tang Z, Ruan W. Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks. In SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. New York: Association for Computing Machinery, Inc. 2018. p. 1153-1156 doi: 10.1145/3209978.3210110

Author

Zhang, Wei Emma ; Sheng, Quan Z. ; Tang, Zhejun et al. / Related or duplicate : Distinguishing similar CQA questions via convolutional neural networks. SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. New York : Association for Computing Machinery, Inc, 2018. pp. 1153-1156

Bibtex

@inproceedings{b12954303a5e4fc4a8b427b3f6554e25,
title = "Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks",
abstract = "Plenty of research attempts target the automatic duplicate detection in Community Question Answering (CQA) systems and frame the task as a supervised learning problem on the question pairs. However, these methods rely on handcrafted features, leading to the difficulty of distinguishing related and duplicate questions as they are often textually similar. To tackle this issue, we propose to leverage neural network architecture to extract {"}deep{"} features to identify whether a question pair is duplicate or related. In particular, we construct question correlation matrices, which capture the word-wise similarities between questions. The constructed matrices are input to our proposed convolutional neural network (CNN), in which the convolutional operation moves through the two dimensions of the matrices. Empirical studies on a range of real-world CQA datasets confirm the effectiveness of our proposed correlation matrices and the CNN. Our method outperforms the state-of-the-art methods and achieves better classification performance.",
keywords = "Convolutional neural networks, Question answering, Search quality",
author = "Zhang, {Wei Emma} and Sheng, {Quan Z.} and Zhejun Tang and Wenjie Ruan",
year = "2018",
month = jun,
day = "27",
doi = "10.1145/3209978.3210110",
language = "English",
pages = "1153--1156",
booktitle = "SIGIR '18 The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval",
publisher = "Association for Computing Machinery, Inc",
note = "41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018 ; Conference date: 08-07-2018 Through 12-07-2018",

}

RIS

TY - GEN

T1 - Related or duplicate: Distinguishing similar CQA questions via convolutional neural networks

T2 - 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018

AU - Zhang, Wei Emma

AU - Sheng, Quan Z.

AU - Tang, Zhejun

AU - Ruan, Wenjie

PY - 2018/6/27

Y1 - 2018/6/27

AB - Plenty of research attempts target the automatic duplicate detection in Community Question Answering (CQA) systems and frame the task as a supervised learning problem on the question pairs. However, these methods rely on handcrafted features, leading to the difficulty of distinguishing related and duplicate questions as they are often textually similar. To tackle this issue, we propose to leverage neural network architecture to extract "deep" features to identify whether a question pair is duplicate or related. In particular, we construct question correlation matrices, which capture the word-wise similarities between questions. The constructed matrices are input to our proposed convolutional neural network (CNN), in which the convolutional operation moves through the two dimensions of the matrices. Empirical studies on a range of real-world CQA datasets confirm the effectiveness of our proposed correlation matrices and the CNN. Our method outperforms the state-of-the-art methods and achieves better classification performance.

KW - Convolutional neural networks

KW - Question answering

KW - Search quality

U2 - 10.1145/3209978.3210110

DO - 10.1145/3209978.3210110

M3 - Conference contribution/Paper

AN - SCOPUS:85051505258

SP - 1153

EP - 1156

BT - SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

PB - Association for Computing Machinery, Inc

CY - New York

Y2 - 8 July 2018 through 12 July 2018

ER -
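
Illustrative sketch

The abstract describes building a question correlation matrix of word-wise similarities and passing it through a convolution that moves across both dimensions of the matrix. The following is a minimal sketch of that idea, not the authors' implementation. It assumes cosine similarity as the word-wise measure (the abstract does not name one), uses made-up two-dimensional word vectors (`TOY_EMBEDDINGS`), and applies a fixed averaging kernel where a real CNN would learn its filters. All function names are hypothetical.

```python
import math

# Hypothetical 2-d word vectors, for illustration only.
# A real system would use trained word embeddings.
TOY_EMBEDDINGS = {
    "install": [1.0, 0.0],
    "setup": [0.8, 0.6],
    "python": [0.0, 1.0],
    "package": [0.6, 0.8],
    "library": [0.6, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def correlation_matrix(q1, q2, emb):
    """Entry [i][j] is the similarity of word i in q1 and word j in q2."""
    return [[cosine(emb[w1], emb[w2]) for w2 in q2] for w1 in q1]


def conv2d_valid(matrix, kernel):
    """Slide the kernel over both dimensions without padding.

    Like most CNN libraries, this computes cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_rows = len(matrix) - kh + 1
    out_cols = len(matrix[0]) - kw + 1
    return [
        [
            sum(
                matrix[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


q1 = ["install", "python", "package"]
q2 = ["setup", "python", "library"]

matrix = correlation_matrix(q1, q2, TOY_EMBEDDINGS)

# Assumption: a fixed 2x2 averaging kernel stands in for a learned filter.
kernel = [[0.25, 0.25], [0.25, 0.25]]
features = conv2d_valid(matrix, kernel)

for row in features:
    print([round(x, 3) for x in row])
```

In a full model the resulting feature maps would feed pooling and classification layers that label the pair as duplicate or related; those stages are omitted here.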