OSMAN: a novel Arabic readability metric

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN, Conference contribution/Paper, peer-review

Published

Standard

OSMAN: a novel Arabic readability metric. / El-Haj, Mahmoud; Rayson, Paul Edward.
Proceedings of the Language Resources and Evaluation Conference 2016. ed. / Nicoletta Calzolari; Khalid Choukri; Thierry Declerck; Marko Grobelnik; Bente Maegaard; Joseph Mariani; Asuncion Moreno; Jan Odijk; Stelios Piperidis. 10. ed. Slovenia: European Language Resources Association (ELRA), 2016. p. 250-255 77.

Harvard

El-Haj, M & Rayson, PE 2016, OSMAN: a novel Arabic readability metric. in N Calzolari, K Choukri, T Declerck, M Grobelnik, B Maegaard, J Mariani, A Moreno, J Odijk & S Piperidis (eds), Proceedings of the Language Resources and Evaluation Conference 2016. 10 edn, 77, European Language Resources Association (ELRA), Slovenia, pp. 250-255. <http://www.lrec-conf.org/proceedings/lrec2016/pdf/77_Paper.pdf>

APA

El-Haj, M., & Rayson, P. E. (2016). OSMAN: a novel Arabic readability metric. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of the Language Resources and Evaluation Conference 2016 (10 ed., pp. 250-255). Article 77. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2016/pdf/77_Paper.pdf

Vancouver

El-Haj M, Rayson PE. OSMAN: a novel Arabic readability metric. In Calzolari N, Choukri K, Declerck T, Grobelnik M, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors, Proceedings of the Language Resources and Evaluation Conference 2016. 10 ed. Slovenia: European Language Resources Association (ELRA). 2016. p. 250-255. 77

Author

El-Haj, Mahmoud ; Rayson, Paul Edward. / OSMAN : a novel Arabic readability metric. Proceedings of the Language Resources and Evaluation Conference 2016. editor / Nicoletta Calzolari ; Khalid Choukri ; Thierry Declerck ; Marko Grobelnik ; Bente Maegaard ; Joseph Mariani ; Asuncion Moreno ; Jan Odijk ; Stelios Piperidis. 10. ed. Slovenia : European Language Resources Association (ELRA), 2016. pp. 250-255

Bibtex

@inproceedings{cc1d00ff0d904f9f8be3d41e57788227,
title = "OSMAN: a novel Arabic readability metric",
abstract = "We present OSMAN (Open Source Metric for Measuring Arabic Narratives) - a novel open source Arabic readability metric and tool. It allows researchers to calculate readability for Arabic text with and without diacritics. OSMAN is a modified version of the conventional readability formulas such as Flesch and Fog. In our work we introduce a novel approach towards counting short, long and stress syllables in Arabic which is essential for judging readability of Arabic narratives. We also introduce an additional factor called “Faseeh” which considers aspects of script usually dropped in informal Arabic writing. To evaluate our methods we used Spearman{\textquoteright}s correlation metric to compare text readability for 73,000 parallel sentences from English and Arabic UN documents. The Arabic sentences were written with the absence of diacritics and in order to count the number of syllables we added the diacritics in using an open source tool called Mishkal. The results show that OSMAN readability formula correlates well with the English ones making it a useful tool for researchers and educators working with Arabic text.",
keywords = "Arabic, readability, NLP, diacritics, OSMAN, flesch, fog, parallel, corpus, corpus linguistics",
author = "Mahmoud El-Haj and Rayson, {Paul Edward}",
year = "2016",
month = may,
day = "23",
language = "English",
isbn = "9782951740891",
pages = "250--255",
editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Mariani, {Joseph} and Asuncion Moreno and Jan Odijk and Stelios Piperidis",
booktitle = "Proceedings of the Language Resources and Evaluation Conference 2016",
publisher = "European Language Resources Association (ELRA)",
edition = "10",
}

RIS

TY - GEN

T1 - OSMAN

T2 - a novel Arabic readability metric

AU - El-Haj, Mahmoud

AU - Rayson, Paul Edward

PY - 2016/5/23

Y1 - 2016/5/23

AB - We present OSMAN (Open Source Metric for Measuring Arabic Narratives) - a novel open source Arabic readability metric and tool. It allows researchers to calculate readability for Arabic text with and without diacritics. OSMAN is a modified version of the conventional readability formulas such as Flesch and Fog. In our work we introduce a novel approach towards counting short, long and stress syllables in Arabic which is essential for judging readability of Arabic narratives. We also introduce an additional factor called “Faseeh” which considers aspects of script usually dropped in informal Arabic writing. To evaluate our methods we used Spearman’s correlation metric to compare text readability for 73,000 parallel sentences from English and Arabic UN documents. The Arabic sentences were written with the absence of diacritics and in order to count the number of syllables we added the diacritics in using an open source tool called Mishkal. The results show that OSMAN readability formula correlates well with the English ones making it a useful tool for researchers and educators working with Arabic text.

KW - Arabic

KW - readability

KW - NLP

KW - diacritics

KW - OSMAN

KW - flesch

KW - fog

KW - parallel

KW - corpus

KW - corpus linguistics

M3 - Conference contribution/Paper

SN - 9782951740891

SP - 250

EP - 255

BT - Proceedings of the Language Resources and Evaluation Conference 2016

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Declerck, Thierry

A2 - Grobelnik, Marko

A2 - Maegaard, Bente

A2 - Mariani, Joseph

A2 - Moreno, Asuncion

A2 - Odijk, Jan

A2 - Piperidis, Stelios

PB - European Language Resources Association (ELRA)

CY - Slovenia

ER -