TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments

Computing and Communications

Text available via DOI:

https://doi.org/10.3390/data7070090
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments. / Hettiarachchi, Hansi; Al-Turkey, Doaa; Adedoyin-Olowe, Mariam et al.
In: Data, Vol. 7, No. 7, 90, 30.06.2022.

Harvard

Hettiarachchi, H, Al-Turkey, D, Adedoyin-Olowe, M, Bhogal, J & Gaber, MM 2022, 'TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments', Data, vol. 7, no. 7, 90. https://doi.org/10.3390/data7070090

APA

Hettiarachchi, H., Al-Turkey, D., Adedoyin-Olowe, M., Bhogal, J., & Gaber, M. M. (2022). TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments. Data, 7(7), Article 90. https://doi.org/10.3390/data7070090

Vancouver

Hettiarachchi H, Al-Turkey D, Adedoyin-Olowe M, Bhogal J, Gaber MM. TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments. Data. 2022 Jun 30;7(7):90. doi: 10.3390/data7070090

Author

Hettiarachchi, Hansi ; Al-Turkey, Doaa ; Adedoyin-Olowe, Mariam et al. / TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments. In: Data. 2022 ; Vol. 7, No. 7.

Bibtex

@article{07ab89dd3ceb479f8dea42751027528c,
title = "TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments",
abstract = "Even though social media contain rich information on events and public opinions, it is impractical to manually filter this information due to data{\textquoteright}s vast generation and dynamicity. Thus, automated extraction mechanisms are invaluable to the community. We need real data with ground truth labels to build/evaluate such systems. Still, to the best of our knowledge, no available social media dataset covers continuous periods with event and sentiment labels together except for events or sentiments. Datasets without time gaps are huge due to high data generation and require extensive effort for manual labelling. Different approaches, ranging from unsupervised to supervised, have been proposed by previous research targeting such datasets. However, their generic nature mainly fails to capture event-specific sentiment expressions, making them inappropriate for labelling event sentiments. Filling this gap, we propose a novel data annotation approach in this paper involving several neural networks. Our approach outperforms the commonly used sentiment annotation models such as VADER and TextBlob. Also, it generates probability values for all sentiment categories besides providing a single category per tweet, supporting aggregated sentiment analyses. Using this approach, we annotate and release a dataset named TED-S, covering two diverse domains, sports and politics. TED-S has complete subsets of Twitter data streams with both sub-event and sentiment labels, providing the ability to support event sentiment-based research.",

author = "Hansi Hettiarachchi and Doaa Al-Turkey and Mariam Adedoyin-Olowe and Jagdev Bhogal and Gaber, {Mohamed Medhat}",
year = "2022",
month = jun,
day = "30",
doi = "10.3390/data7070090",
language = "English",
volume = "7",
journal = "Data",
number = "7",
}

RIS

TY - JOUR
T1 - TED-S: Twitter Event Data in Sports and Politics with Aggregated Sentiments
AU - Hettiarachchi, Hansi
AU - Al-Turkey, Doaa
AU - Adedoyin-Olowe, Mariam
AU - Bhogal, Jagdev
AU - Gaber, Mohamed Medhat
PY - 2022/6/30
Y1 - 2022/6/30
N2 - Even though social media contain rich information on events and public opinions, it is impractical to manually filter this information due to data’s vast generation and dynamicity. Thus, automated extraction mechanisms are invaluable to the community. We need real data with ground truth labels to build/evaluate such systems. Still, to the best of our knowledge, no available social media dataset covers continuous periods with event and sentiment labels together except for events or sentiments. Datasets without time gaps are huge due to high data generation and require extensive effort for manual labelling. Different approaches, ranging from unsupervised to supervised, have been proposed by previous research targeting such datasets. However, their generic nature mainly fails to capture event-specific sentiment expressions, making them inappropriate for labelling event sentiments. Filling this gap, we propose a novel data annotation approach in this paper involving several neural networks. Our approach outperforms the commonly used sentiment annotation models such as VADER and TextBlob. Also, it generates probability values for all sentiment categories besides providing a single category per tweet, supporting aggregated sentiment analyses. Using this approach, we annotate and release a dataset named TED-S, covering two diverse domains, sports and politics. TED-S has complete subsets of Twitter data streams with both sub-event and sentiment labels, providing the ability to support event sentiment-based research.

U2 - 10.3390/data7070090
DO - 10.3390/data7070090
M3 - Journal article
VL - 7
JO - Data
JF - Data
IS - 7
M1 - 90
ER -
