Exploring clustering for multi-document Arabic summarisation

Computing and Communications

Associated organisational unit

UCREL - University Centre for Computer Corpus Research on Language

Text available via DOI:

https://doi.org/10.1007/978-3-642-25631-8_50
Final published version


Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Mahmoud El-Haj
Udo Kruschwitz
Chris Fox


Publication date	2011
Host publication	Information Retrieval Technology: 7th Asia Information Retrieval Societies Conference, AIRS 2011, Dubai, United Arab Emirates, December 18-20, 2011. Proceedings
Editors	Mohamed Vall Mohamed Salem, Khaled Shaalan, Farhad Oroumchian, Azadeh Shakery, Halim Khelalfa
Place of Publication	Berlin
Publisher	Springer
Pages	550-561
Number of pages	12
ISBN (electronic)	9783642256318
ISBN (print)	9783642256301
Original language	English

Publication series

Name	Lecture Notes in Computer Science
Publisher	Springer
Volume	7097
ISSN (Print)	0302-9743
ISSN (electronic)	1611-9743

Abstract

In this paper we explore clustering for multi-document Arabic summarisation. For our evaluation we use an Arabic version of the DUC-2002 dataset that we previously generated using Google Translate. We explore how clustering (at the sentence level) can be applied to multi-document summarisation and to redundancy elimination within this process. We use different parameter settings including the cluster size and the selection model applied in the extractive summarisation process. The automatically generated summaries are evaluated using the ROUGE metric, as well as precision and recall. The results we achieve are compared with the top five systems in the DUC-2002 multi-document summarisation task.
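
The abstract does not name a clustering algorithm or a selection model, so the following is only a minimal sketch of the general idea of sentence-level clustering used for redundancy elimination. Everything in it is an assumption made for illustration: the greedy single-pass clustering, the bag-of-words cosine similarity, the whitespace tokenisation (real Arabic text would need proper tokenisation and normalisation), the function names, and the `threshold` and `max_sentences` parameters. It should not be read as the method used in the paper.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    # Placeholder tokenisation (assumption): lowercase and split on whitespace.
    return Counter(sentence.lower().split())


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering (illustrative choice, not from the paper).

    A sentence joins the first cluster whose seed sentence is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of (sentence, vector) pairs
    for sentence in sentences:
        vector = bag_of_words(sentence)
        for cluster in clusters:
            if cosine(vector, cluster[0][1]) >= threshold:
                cluster.append((sentence, vector))
                break
        else:
            clusters.append([(sentence, vector)])
    return [[s for s, _ in cluster] for cluster in clusters]


def summarise(sentences, threshold=0.5, max_sentences=2):
    """Hypothetical selection model: take the seed sentence of each of the
    largest clusters, so near-duplicate sentences contribute only once."""
    clusters = cluster_sentences(sentences, threshold)
    ranked = sorted(clusters, key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:max_sentences]]


if __name__ == "__main__":
    # Toy English sentences standing in for sentences pooled from several documents.
    pooled = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "stocks fell sharply today",
    ]
    for sentence in summarise(pooled):
        print(sentence)
```

With the toy input, the two near-identical sentences fall into one cluster and only the first of them is selected, which is the redundancy-elimination effect the abstract refers to. Changing `threshold` or `max_sentences` loosely mirrors the kind of parameter settings (cluster size, selection model) the abstract mentions.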
