Rights statement: © 2016 Vasiliki Simaki, Iosif Mporas and Vasileios Megalooikonomou. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Final published version, 199 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Evaluation and Sociolinguistic Analysis of Text Features for Gender and Age Identification
AU - Simaki, Vasiliki
AU - Mporas, Iosif
AU - Megalooikonomou, Vasileios
N1 - © 2016 Vasiliki Simaki, Iosif Mporas and Vasileios Megalooikonomou. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2016/9/25
Y1 - 2016/9/25
N2 - The paper presents an interdisciplinary study in the field of automatic gender and age identification, informed by sociolinguistic knowledge of the gender- and age-related linguistic choices that social media users make. The authors investigated and gathered standard and novel text features used in text mining approaches for author demographic information and profiling, and they examined their efficacy in gender and age detection tasks on a corpus consisting of social media texts. The most informative features are analyzed according to the nature of each feature, and the information derived from the features' importance scores is discussed.
AB - The paper presents an interdisciplinary study in the field of automatic gender and age identification, informed by sociolinguistic knowledge of the gender- and age-related linguistic choices that social media users make. The authors investigated and gathered standard and novel text features used in text mining approaches for author demographic information and profiling, and they examined their efficacy in gender and age detection tasks on a corpus consisting of social media texts. The most informative features are analyzed according to the nature of each feature, and the information derived from the features' importance scores is discussed.
KW - Sociolinguistics
KW - Text Mining
KW - Feature Ranking
KW - ReliefF Algorithm
KW - Gender Detection
KW - Age Identification
U2 - 10.3844/ajeassp.2016.868.876
DO - 10.3844/ajeassp.2016.868.876
M3 - Journal article
VL - 9
SP - 868
EP - 876
JO - American Journal of Engineering and Applied Sciences
JF - American Journal of Engineering and Applied Sciences
IS - 4
ER -