Summarising a Twitter Feed Using Weighted Frequency

Data is growing exponentially every day, with 500 million tweets sent on Twitter alone (Desjardins 2021). Twitter feeds are long, time-consuming to read, multilingual, and contain multimedia. This makes them difficult to analyse in raw form, so the data must be extracted, cleaned, and structured before it can be used in research. This paper proposes summarising Twitter feeds as a way of structuring them. Our objectives were to: (1) retrieve tweets successfully using the Twitter API; (2) efficiently detect the language of the text and tokenize it, so that the content is analysed in its own language; (3) use live tweets as input instead of a database of tweets; and (4) build the interface as a plugin, making it accessible to computer scientists and others alike. We also aimed to test whether weighted frequency could be used to construct successful summaries of tweets. A survey conducted to evaluate our results found that the program is seen as useful, accessible, and efficient at summarising Twitter accounts. Weighted frequency also proved effective at summarising input text in any language.
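To make the idea of weighted frequency concrete, the following is a minimal sketch of one common form of the technique, not the authors' implementation. It assumes that each word's weight is its count divided by the count of the most frequent word, and that a tweet's score is the sum of its word weights. The function names, the tiny English stop-word list, and the regex tokenizer are all illustrative assumptions; the Twitter API retrieval and language detection steps are omitted.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption). A real system would pick
# a per-language list after detecting the language of the text.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and",
              "in", "on", "for", "it", "this", "that", "with"}


def tokenize(text):
    """Split text into lowercase word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def weighted_frequencies(tweets):
    """Map each non-stop word to count / count of the most frequent word."""
    counts = Counter(
        word
        for tweet in tweets
        for word in tokenize(tweet)
        if word not in STOP_WORDS
    )
    if not counts:
        return {}
    highest = max(counts.values())
    return {word: count / highest for word, count in counts.items()}


def summarise(tweets, k=2):
    """Return the k highest-scoring tweets, kept in their original order."""
    weights = weighted_frequencies(tweets)
    # Score each tweet by summing the weights of its words; stop words and
    # unseen words contribute nothing.
    scored = [
        (sum(weights.get(word, 0.0) for word in tokenize(tweet)), index)
        for index, tweet in enumerate(tweets)
    ]
    # Highest score first; ties are broken by earlier position in the feed.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [tweets[index] for _, index in sorted(best, key=lambda item: item[1])]


if __name__ == "__main__":
    feed = [
        "Python makes text analysis easy",
        "Lunch was great today",
        "Text analysis with Python is fun",
        "Python text tools",
    ]
    for line in summarise(feed, k=2):
        print(line)
```

Because scores are plain sums, longer tweets tend to score higher in this sketch; dividing each score by the number of tokens is one possible adjustment, though whether the paper does so is not stated here.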
This open access book presents contributions on a wide range of scientific areas originating from the BUiD Doctoral Research Conference (BDRC 2022).
Natural Language Processing (NLP), Twitter, summarization, weighted frequency
Abohaia, Z.A., Hassan, Y.M. (2023). Summarising a Twitter Feed Using Weighted Frequency. In: Al Marri, K., Mir, F., David, S., Aljuboori, A. (eds) BUiD Doctoral Research Conference 2022. Lecture Notes in Civil Engineering, vol 320. Springer, Cham.