Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - A Feasibility Study on Extracting Twitter Users' Interests Using NLP Tools for Serendipitous Connections
AU - Piao, S.
AU - Whittle, J.
PY - 2011
Y1 - 2011
N2 - This paper presents our research on the feasibility of extracting Twitter users' interests for suggesting serendipitous connections using natural language processing (NLP) technology. Defined by Andel [1] as the art of making an unsought finding, serendipity has a positive role in scientific research and people's daily lives. Applications that facilitate serendipity would bring various benefits to us. In this work, we focus on the mining of users' interests from Twitter messages (tweets hereafter) to support the detection of serendipitous connections. To address the challenge, we explore a set of NLP tools to develop a real-time system for automatically extracting the users' interests in the form of named entities and core terms. We also examine the different contributions of three different information sources with regard to the user's interests. Furthermore, we examine the issue of determining the additional attribute of surprisingness/unexpectedness of the terms and entities of interest which we deem critical for detecting serendipitous connections. Our prototype system was tested with a group of Twitter users involving approximately 2,300 tweets. Our algorithm achieved varying degrees of success on each of the users, demonstrating feasibility of identifying serendipitous interest terms and entities. For example, 27.5% of terms extracted for one of the users were judged to be serendipitous.
AB - This paper presents our research on the feasibility of extracting Twitter users' interests for suggesting serendipitous connections using natural language processing (NLP) technology. Defined by Andel [1] as the art of making an unsought finding, serendipity has a positive role in scientific research and people's daily lives. Applications that facilitate serendipity would bring various benefits to us. In this work, we focus on the mining of users' interests from Twitter messages (tweets hereafter) to support the detection of serendipitous connections. To address the challenge, we explore a set of NLP tools to develop a real-time system for automatically extracting the users' interests in the form of named entities and core terms. We also examine the different contributions of three different information sources with regard to the user's interests. Furthermore, we examine the issue of determining the additional attribute of surprisingness/unexpectedness of the terms and entities of interest which we deem critical for detecting serendipitous connections. Our prototype system was tested with a group of Twitter users involving approximately 2,300 tweets. Our algorithm achieved varying degrees of success on each of the users, demonstrating feasibility of identifying serendipitous interest terms and entities. For example, 27.5% of terms extracted for one of the users were judged to be serendipitous.
KW - interest extraction
KW - named entity
KW - natural language processing
KW - serendipity
KW - social computing
KW - twitter analysis
U2 - 10.1109/PASSAT/SocialCom.2011.164
DO - 10.1109/PASSAT/SocialCom.2011.164
M3 - Conference contribution/Paper
SN - 978-1-4577-1931-8
SP - 910
EP - 915
BT - Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on and 2011 IEEE Third International Conference on Social Computing (SocialCom)
PB - IEEE
ER -