Electronic data

  • COUNTER.zip

    1 MB, multipart/x-zip

    Text

    Available under license: CC BY-NC-SA

COrpus of Urdu News TExt Reuse (COUNTER)

Dataset

  • Sharjeel Muhammad (Creator)
  • Rao Muhammad Adeel Nawab (Creator)
  • Paul Rayson (Creator)

Description

The corpus contains 600 source-derived document pairs collected from the field of journalism. We believe these documents will be useful for evaluating monolingual text reuse detection systems in general, and for the Urdu language in particular.

Date made available: 2016
Publisher: Lancaster University
