A dataset of April Fools hoaxes and fake news articles, used in the following paper:

@conference{3fb534946b3a4f219205d525e87fa080,
  title = "Fool{\textquoteright}s Errand: Looking at April Fools Hoaxes as Disinformation through the Lens of Deception and Humour",
  abstract = "Every year on April 1st, people play practical jokes on one another and news websites fabricate false stories with the goal of making fools of their audience. In an age of disinformation, with Facebook under fire for allowing “Fake News” to spread on their platform, every day can feel like April Fools{\textquoteright} day. We create a dataset of April Fools{\textquoteright} hoax news articles and build a set of features based on past research examining deception, humour, and satire. Analysis of our dataset and features suggests that looking at the structural complexity and levels of detail in a text are the most important types of feature in characterising April Fools{\textquoteright}. We propose that these features are also very useful for understanding Fake News, and disinformation more widely.",
  author = "Edward Dearden and Alistair Baron",
  year = "2019",
  month = apr,
  day = "7",
  language = "English",
  note = "20th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2019; Conference date: 07-04-2019 Through 13-04-2019",
}

The fake news part of the corpus was originally created by Horne and Adali for the following paper: https://arxiv.org/abs/1703.09398

For each article body and headline, we provide the raw text together with PoS tags (produced with CLAWS, https://ucrel.lancs.ac.uk/claws/) and semantic tags (produced with USAS, https://ucrel.lancs.ac.uk/usas/).

See the paper for more information: https://www.research.lancs.ac.uk/portal/en/publications/fools-errand(3fb53494-6b3a-4f21-9205-d525e87fa080).html

Code available at: https://github.com/dearden/april-fools