Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-66429-3_29
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - Detection of stance and sentiment modifiers in political blogs
AU - Skeppstedt, Maria
AU - Simaki, Vasiliki
AU - Paradis, Carita
AU - Kerren, Andreas
N1 - The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-66429-3_29
PY - 2017
Y1 - 2017
N2 - The automatic detection of seven types of modifiers was studied: Certainty, Uncertainty, Hypotheticality, Prediction, Recommendation, Concession/Contrast and Source. A classifier aimed at detecting local cue words that signal the categories was the most successful method for five of the categories. For Prediction and Hypotheticality, however, better results were obtained with a classifier trained on tokens and bigrams present in the entire sentence. Unsupervised cluster features were shown useful for the categories Source and Uncertainty, when a subset of the training data available was used. However, when all of the 2,095 sentences that had been actively selected and manually annotated were used as training data, the cluster features had a very limited effect. Some of the classification errors made by the models would be possible to avoid by extending the training data set, while other features and feature representations, as well as the incorporation of pragmatic knowledge, would be required for other error types.
AB - The automatic detection of seven types of modifiers was studied: Certainty, Uncertainty, Hypotheticality, Prediction, Recommendation, Concession/Contrast and Source. A classifier aimed at detecting local cue words that signal the categories was the most successful method for five of the categories. For Prediction and Hypotheticality, however, better results were obtained with a classifier trained on tokens and bigrams present in the entire sentence. Unsupervised cluster features were shown useful for the categories Source and Uncertainty, when a subset of the training data available was used. However, when all of the 2,095 sentences that had been actively selected and manually annotated were used as training data, the cluster features had a very limited effect. Some of the classification errors made by the models would be possible to avoid by extending the training data set, while other features and feature representations, as well as the incorporation of pragmatic knowledge, would be required for other error types.
KW - Stance modifiers
KW - Sentiment modifiers
KW - Active learning
KW - Unsupervised features
KW - Resource-aware natural language processing
U2 - 10.1007/978-3-319-66429-3_29
DO - 10.1007/978-3-319-66429-3_29
M3 - Conference contribution/Paper
SN - 9783319665286
T3 - Lecture Notes in Computer Science
SP - 302
EP - 311
BT - SPECOM 2017
A2 - Karpov, A.
A2 - Potapova, R.
A2 - Mporas, I.
PB - Springer
CY - Cham
ER -