Electronic data

  • COUNTER.zip

    1.29 MB, multipart/x-zip


    Available under license: CC BY-NC-SA

COrpus of Urdu News TExt Reuse (COUNTER)


  • Sharjeel Muhammad (Creator)
  • Rao Muhammad Adeel Nawab (Creator)
  • Paul Rayson (Creator)


The corpus contains 600 source-derived document pairs collected from the field of journalism. We believe these documents will be useful for evaluating monolingual text reuse detection systems in general, and those for the Urdu language in particular.
Date made available: 2016
Publisher: Lancaster University
