UPPC - Urdu Paraphrase Plagiarism Corpus

Electronic data

uppc-urdu-paraphrase
Accepted author manuscript, 333 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

Paraphrase Plagiarism, Corpus Generation, Urdu Plagiarism Detection, Natural Language Processing

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

UPPC - Urdu Paraphrase Plagiarism Corpus. / Muhammad, Sharjeel; Rayson, Paul Edward; Nawab, Rao Muhammad Adeel.
Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation. ed. / Nicoletta Calzolari; Khalid Choukri; Thierry Declerck; Marko Grobelnik; Bente Maegaard; Joseph Mariani; Asuncion Moreno; Jan Odijk; Stelios Piperidis. European Language Resources Association (ELRA), 2016. p. 1832-1836.

Harvard

Muhammad, S, Rayson, PE & Nawab, RMA 2016, UPPC - Urdu Paraphrase Plagiarism Corpus. in N Calzolari, K Choukri, T Declerck, M Grobelnik, B Maegaard, J Mariani, A Moreno, J Odijk & S Piperidis (eds), Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), pp. 1832-1836. <http://www.lrec-conf.org/proceedings/lrec2016/pdf/364_Paper.pdf>

APA

Muhammad, S., Rayson, P. E., & Nawab, R. M. A. (2016). UPPC - Urdu Paraphrase Plagiarism Corpus. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation (pp. 1832-1836). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2016/pdf/364_Paper.pdf

Vancouver

Muhammad S, Rayson PE, Nawab RMA. UPPC - Urdu Paraphrase Plagiarism Corpus. In Calzolari N, Choukri K, Declerck T, Grobelnik M, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors, Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA). 2016. p. 1832-1836. Epub 2016 Mar 2.

Author

Muhammad, Sharjeel ; Rayson, Paul Edward ; Nawab, Rao Muhammad Adeel. / UPPC - Urdu Paraphrase Plagiarism Corpus. Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation. editor / Nicoletta Calzolari ; Khalid Choukri ; Thierry Declerck ; Marko Grobelnik ; Bente Maegaard ; Joseph Mariani ; Asuncion Moreno ; Jan Odijk ; Stelios Piperidis. European Language Resources Association (ELRA), 2016. pp. 1832-1836

Bibtex

@inproceedings{8fcf03b750ee4ed2b3f87a579a5e9d5a,
  title = "UPPC - Urdu Paraphrase Plagiarism Corpus",
  abstract = "Paraphrase plagiarism is a significant and widespread problem and research shows that it is hard to detect. Several methods and automatic systems have been proposed to deal with it. However, evaluation and comparison of such solutions is not possible because of the unavailability of benchmark corpora with manual examples of paraphrase plagiarism. To deal with this issue, we present the novel development of a paraphrase plagiarism corpus containing simulated (manually created) examples in the Urdu language - a language widely spoken around the world. This resource is the first of its kind developed for the Urdu language and we believe that it will be a valuable contribution to the evaluation of paraphrase plagiarism detection systems.",
  keywords = "Paraphrase Plagiarism, Corpus Generation, Urdu Plagiarism Detection, Natural Language Processing",
  author = "Sharjeel Muhammad and Rayson, {Paul Edward} and Nawab, {Rao Muhammad Adeel}",
  year = "2016",
  month = may,
  day = "23",
  language = "English",
  isbn = "9782951740891",
  pages = "1832--1836",
  editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis",
  booktitle = "Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation",
  publisher = "European Language Resources Association (ELRA)",
}

RIS

TY - GEN

T1 - UPPC - Urdu Paraphrase Plagiarism Corpus

AU - Muhammad, Sharjeel

AU - Rayson, Paul Edward

AU - Nawab, Rao Muhammad Adeel

PY - 2016/5/23

Y1 - 2016/5/23

AB - Paraphrase plagiarism is a significant and widespread problem and research shows that it is hard to detect. Several methods and automatic systems have been proposed to deal with it. However, evaluation and comparison of such solutions is not possible because of the unavailability of benchmark corpora with manual examples of paraphrase plagiarism. To deal with this issue, we present the novel development of a paraphrase plagiarism corpus containing simulated (manually created) examples in the Urdu language - a language widely spoken around the world. This resource is the first of its kind developed for the Urdu language and we believe that it will be a valuable contribution to the evaluation of paraphrase plagiarism detection systems.

KW - Paraphrase Plagiarism

KW - Corpus Generation

KW - Urdu Plagiarism Detection

KW - Natural Language Processing

M3 - Conference contribution/Paper

SN - 9782951740891

SP - 1832

EP - 1836

BT - Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Declerck, Thierry

A2 - Grobelnik, Marko

A2 - Maegaard, Bente

A2 - Mariani, Joseph

A2 - Moreno, Asuncion

A2 - Odijk, Jan

A2 - Piperidis, Stelios

PB - European Language Resources Association (ELRA)

ER -
