Identification of Targets in Disinformation News Articles Using Supervised Machine Learning

Hussain, S.; Khattak, A. S.; Abbasi, R. A.; Russell-Rose, T.; Chinthalapati, V. L. R.

Hussain, S., Khattak, A. S., Abbasi, R. A., Russell-Rose, T. ORCID: 0000-0003-4394-9876 & Chinthalapati, V. L. R. (2024). Identification of Targets in Disinformation News Articles Using Supervised Machine Learning. In: Sheng, Q. Z., Dobbie, G., Jiang, J., Zhang, X., Zhang, W. E., Manolopoulos, Y., Wu, J., Mansoor, W. & Ma, C. (Eds.), Advanced Data Mining and Applications. 20th International Conference, ADMA 2024, 3-5 Dec 2024, Sydney, NSW, Australia. doi: 10.1007/978-981-96-0847-8_15

Abstract

Fake news, or disinformation, spreads widely among communities worldwide because social media platforms such as Facebook, Twitter, and Instagram have become part of daily life. Disinformation news is designed to mislead the public and turn it against particular entities, such as countries, the public, or religions. It is frequently disseminated by individuals, organizations, covert agencies, or nations to target particular governments and organizations and damage their international standing. Previous research has concentrated on classification problems, fake news identification, and fake profile detection. Within this field, locating hidden targets is a popular topic of investigation. In this work, we use the EU Disinfo Lab dataset to identify the targets within disinformation news articles. Targets are identified using content features: unigram, bigram, unigram with bigram, and unigram with trigram. The proposed model is trained using supervised machine learning techniques such as the Linear Support Vector Classifier (LSVC) and Logistic Regression (LR), as well as three ensemble methods: Random Forest (RF), Passive Aggressive (PA), and eXtreme Gradient Boosting Classifier (XGB). For nine classes, the LSVC performed best on all four N-gram feature sets. For three classes, it also performed best, except on the unigram with bigram and unigram with trigram features, where it ranked second after LR. Using content features, targets were identified most accurately with unigram with bigram and unigram with trigram, each reaching an accuracy of 77%.
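
The four N-gram feature sets named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the function names (`tokenize`, `ngram_counts`), and the use of raw counts are assumptions made for illustration only. In particular, "unigram with trigram" is read here as orders 1 and 3 without bigrams; the paper may instead mean the full range from 1 to 3.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    Hypothetical tokenizer; the paper's preprocessing is not described here.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(text, orders):
    """Count the n-grams of every order listed in `orders`, e.g. (1, 2)."""
    tokens = tokenize(text)
    counts = Counter()
    for n in orders:
        # Slide a window of n tokens across the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# The four feature sets from the abstract, expressed as n-gram orders.
FEATURE_SETS = {
    "unigram": (1,),
    "bigram": (2,),
    "unigram with bigram": (1, 2),
    "unigram with trigram": (1, 3),  # assumption: orders 1 and 3, no bigrams
}

if __name__ == "__main__":
    sample = "The agency spreads false claims about the election"
    for name, orders in FEATURE_SETS.items():
        features = ngram_counts(sample, orders)
        print(name, "->", len(features), "distinct features")
```

Each resulting count dictionary would then be turned into a feature vector and passed to a classifier such as LSVC or LR. Combining orders enlarges the vocabulary, which is one plausible reason the combined feature sets behave differently from unigrams or bigrams alone.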

Publication Type:	Conference or Workshop Item (Paper)
Additional Information:	© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments:	School of Science & Technology; School of Science & Technology > Department of Computer Science
SWORD Depositor:	Symplectic Administrator

Text - Accepted Version

Official URL: https://doi.org/10.1007/978-981-96-0847-8_15

Creators:	Hussain, S.; Khattak, A. S.; Abbasi, R. A.; Russell-Rose, T. (ORCID: 0000-0003-4394-9876); Chinthalapati, V. L. R.
Event Title:	20th International Conference, ADMA 2024
Event Type:	Conference
Event Location:	Sydney, NSW, Australia
Event Dates:	3-5 Dec 2024
Status:	Published
Refereed:	Yes
Journal or Publication Title:	Advanced Data Mining and Applications
Publisher:	Springer Nature Singapore
ISBN:	978-981-96-0847-8
ISSN:	0302-9743
e-ISSN:	1611-3349
URI:	https://openaccess.city.ac.uk/id/eprint/34317
Date available in CRO:	23 Dec 2024 09:26
Date deposited:	18 December 2024
Dates:	14 December 2024 (Published); 14 December 2024 (Published Online)