City Research Online

Mining Newsworthy Topics from Social Media

Martin, C., Corney, D., Goker, A. S. & MacFarlane, A. (2013). Mining Newsworthy Topics from Social Media. Paper presented at the BCS SGAI Workshop on Social Media Analysis 2013, 10-12-2013, Cambridge, UK.


Newsworthy stories are increasingly being shared through social networking platforms such as Twitter and Reddit, and journalists now use them to rapidly discover stories and eye-witness accounts. We present a technique, designed for a real-time topic-detection system, that detects “bursts” of phrases on Twitter. We describe a time-dependent variant of the classic tf-idf approach and group together bursty phrases that often appear in the same messages in order to identify emerging topics. We demonstrate our methods by analysing tweets corresponding to events drawn from the worlds of politics and sport. We created a user-centred “ground truth” to evaluate our methods, based on mainstream media accounts of the events. This helps ensure our methods remain practical. We compare several clustering and topic ranking methods to discover the characteristics of news-related collections, and show that different strategies are needed to detect emerging topics within them. We show that our methods successfully detect a range of different topics for each event and can retrieve messages (for example, tweets) that represent each topic for the user.
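
The idea of a time-dependent tf-idf variant can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `burst_scores`, the input layout, and the scoring formula (current document frequency discounted by the log of the mean document frequency over earlier time slots) are all assumptions made for illustration only.

```python
import math
from collections import Counter


def burst_scores(current_docs, past_slots):
    """Score each term in the current time slot against earlier slots.

    current_docs: list of token lists, one per message in the current slot.
    past_slots:   list of earlier slots, each a list of token lists.
    The formula below is an assumed, illustrative scoring rule.
    """
    # Document frequency now: number of current messages containing the term.
    df_now = Counter(term for doc in current_docs for term in set(doc))

    # Document frequency of each term in every earlier slot.
    df_past = [
        Counter(term for doc in slot for term in set(doc))
        for slot in past_slots
    ]

    scores = {}
    for term, df in df_now.items():
        # Mean document frequency over earlier slots (0 if there is no history).
        if df_past:
            mean_past = sum(c[term] for c in df_past) / len(df_past)
        else:
            mean_past = 0.0
        # Frequent now and rare before gives a high score.
        scores[term] = (df + 1) / (math.log(mean_past + 1) + 1)
    return scores


# Hypothetical toy data: "goal" is new in the current slot, "match" is not.
past = [[["match", "today"], ["match"]], [["match", "tonight"]]]
current = [["goal", "match"], ["goal"], ["goal", "penalty"]]
scores = burst_scores(current, past)
```

In this toy data, a term that is frequent now but absent from earlier slots ("goal") scores higher than one that was already common ("match"). Grouping the highest-scoring phrases by how often they appear in the same messages would be a separate step that this sketch does not cover.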

Publication Type: Conference or Workshop Item (Paper)
Additional Information: Copyright 2013, the authors. For private and academic purposes only.
Publisher Keywords: topic detection, Twitter, temporal analysis
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Departments: School of Science & Technology > Computer Science > Human Computer Interaction Design
PDF - Accepted Version