Professor Paul Rayson

Professor of Natural Language Processing

  1. Urdu Short Text Reuse Corpus (USTRC)

    Sameen, S. (Creator), Muhammad, S. (Creator), Nawab, R. M. A. (Creator), Rayson, P. (Creator), Muneer, I. (Creator), Lancaster University, 2017, 10.17635/lancaster/researchdata/192

    Dataset

  2. Urdu Paraphrase Plagiarism Corpus (UPPC)

    Muhammad, S. (Creator), Rayson, P. (Creator), Nawab, R. M. A. (Creator), Lancaster University, 2016, 10.17635/lancaster/researchdata/67

    Dataset

  3. UNLT: Urdu Natural Language Toolkit

    Shafi, J. (Creator), Nawab, R. M. A. (Creator), Rayson, P. (Creator), Iqbal, R. (Creator), Lancaster University, 2021, 10.17635/lancaster/researchdata/494

    Dataset

  4. UK Annual Reports Key Sections

    El-Haj, M. (Creator), Young, S. (Creator), Rayson, P. (Creator), Lancaster University, 28/02/2019, 10.17635/lancaster/researchdata/262

    Dataset

  5. N-gram list for the StratScore metric

    Athanasakou, V. (Creator), El-Haj, M. (Creator), Rayson, P. (Creator), Walker, M. (Creator), Young, S. (Creator), Lancaster University, 2018, 10.17635/lancaster/researchdata/232

    Dataset

  6. Igbo-English Machine Translation: An Evaluation Benchmark

    Ezeani, I. (Creator), Onyenwe, I. E. (Creator), Chinedu, U. (Creator), Rayson, P. (Creator), Hepple, M. (Creator), GitHub, 1/04/2020

    Dataset

  7. Human Judgements of Sentiment Values

    Pak, I. (Creator), Teh, P. L. (Creator), Rayson, P. (Creator), Piao, S. (Creator), Ho, J. S. Y. (Creator), Moore, A. (Creator), Cheah, Y. (Creator), Lancaster University, 2020, 10.17635/lancaster/researchdata/368

    Dataset

  8. Data and scripts for extracting plant names and collocates from historical texts

    Smail, R. (Creator), Donaldson, C. (Creator), Stevens, C. (Creator), Rayson, P. (Creator), Govaerts, R. (Creator), Lancaster University, 2020, 10.17635/lancaster/researchdata/385

    Dataset

  9. Cross-Language English-Urdu Corpus (CLEU)

    Muneer, I. (Creator), Muhammad, S. (Creator), Iqbal, M. (Creator), Nawab, R. M. A. (Creator), Rayson, P. (Creator), Lancaster University, 2017, 10.17635/lancaster/researchdata/176

    Dataset

  10. COVID-19 Arabic tweets

    Alsudias, L. (Creator), Rayson, P. (Creator), Lancaster University, 2020, 10.17635/lancaster/researchdata/394

    Dataset

  11. COVID-19 Arabic tweets

    Alsudias, L. (Creator), Rayson, P. (Creator), Lancaster University, 7/07/2020, 10.17635/lancaster/researchdata/375

    Dataset

  12. COrpus of Urdu News TExt Reuse (COUNTER)

    Muhammad, S. (Creator), Nawab, R. M. A. (Creator), Rayson, P. (Creator), Lancaster University, 2016, 10.17635/lancaster/researchdata/96

    Dataset

  13. CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – the National Corpus of Contemporary Welsh

    Knight, D. (Creator), Morris, S. (Creator), Fitzpatrick, T. (Creator), Rayson, P. (Creator), Spasić, I. (Creator), Thomas, E. M. (Creator), Lovell, A. (Creator), Morris, J. (Creator), Evas, J. (Creator), Stonelake, M. (Creator), Arman, L. (Creator), Davies, J. (Creator), Ezeani, I. (Creator), Neale, S. (Creator), Needs, J. (Creator), Piao, S. (Creator), Rees, M. (Creator), Watkins, G. (Creator), Williams, L. (Creator), Muralidaran, V. (Creator), Tovey-Walsh, B. (Creator), Anthony, L. (Creator), Cobb, T. M. (Creator), Deuchar, M. (Creator), Donnelly, K. (Creator), McCarthy, M. (Creator), Scannell, K. (Creator), Cardiff University, 2020, 10.17035/d.2020.0119878310

    Dataset

  14. Arabic tweets about infectious diseases

    Alsudias, L. (Creator), Rayson, P. (Creator), Lancaster University, 21/06/2019, 10.17635/lancaster/researchdata/303

    Dataset

  15. Arabic Infectious Disease Ontology

    Alsudias, L. (Creator), Rayson, P. (Creator), Lancaster University, 25/02/2020, 10.17635/lancaster/researchdata/350

    Dataset

  16. Annual Reports Key Sections Corpora 2003 to 2017

    El-Haj, M. (Creator), Young, S. (Creator), Rayson, P. (Creator), Lancaster University, 13/03/2019, 10.17635/lancaster/researchdata/271

    Dataset
