Multirelational Twitter Bot Detection using Graph Neural Networks

Chris Pollett
CS Dept, SJSU
Joint Work with:
Ketan Jadhav (Graduate), Petros Potikas, and Katerina Potika

July 22, 2025

Outline

Introduction

Social media and 'X' aka Twitter

What are Bots?

An example of a bot-generated tweet vs. that of a real user.

Example Real and Bot Tweets

Motivation

Usage of Bot Accounts

Results of the Botometer [5] application when given a few user IDs as input

Screenshots of Botometer in action

Related Work

Previous Work

Prior Work Summary

Title | Year | Approach | Datasets
BotFinder: a novel framework for social bots detection in online social networks based on graph embedding and community detection [6] | 2022 | User profile data and relations classified using ML algorithms | ByteDance Security AI Challenge
MRLBot: Multi-Dimensional Representation Learning for Social Media Bot Detection [7] | 2023 | User relation with neural networks | Cresci-15
Detect Me If You Can: Spam Bot Detection Using Inductive Representation Learning [8] | 2019 | Profile properties with neighborhood relations with representational learning | Cresci-15
SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detection [9] | 2021 | Tweet, metadata and neighborhood relations with self-supervised learning | TwiBot-20, Cresci-17, PAN-19
Heterogeneity-Aware Twitter Bot Detection with Relational Graph Transformers [10] | 2021 | Focus on neighborhood relation graphs with GNNs | TwiBot-20

Techniques

Graph Neural Networks

Different GNN algorithms

All GNN algorithms make use of a feature vector to process node and edge information.

GCNs and RGCNs

RGCNs are extended versions of GCNs, meaning they can process different edge types along with nodes. Edge-type information is also carried along during message passing.
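To make the idea of carrying edge types through message passing concrete, here is a minimal NumPy sketch of one R-GCN-style layer in the spirit of [12]. It is an illustration under stated assumptions, not the implementation used in this work: the function name `rgcn_layer`, the per-relation mean normalization, and the toy graph are all chosen for the example.

```python
import numpy as np

def rgcn_layer(h, edge_index, edge_type, w_rel, w_self):
    """One R-GCN-style layer (illustrative sketch, not the authors' code).

    h          : (num_nodes, in_dim) node features
    edge_index : (2, num_edges) int array, row 0 = source, row 1 = target
    edge_type  : (num_edges,) int array, relation id of each edge
    w_rel      : (num_relations, in_dim, out_dim), one weight matrix per relation
    w_self     : (in_dim, out_dim) weight for the node's own features
    """
    num_nodes = h.shape[0]
    out = h @ w_self  # self-connection term
    for r in range(w_rel.shape[0]):
        mask = edge_type == r
        src, dst = edge_index[0, mask], edge_index[1, mask]
        # Sum the features of relation-r neighbors into each target node.
        msg_sum = np.zeros((num_nodes, h.shape[1]))
        np.add.at(msg_sum, dst, h[src])
        # Normalize by the number of relation-r in-neighbors (guard against zero).
        deg = np.bincount(dst, minlength=num_nodes).astype(float)
        deg[deg == 0] = 1.0
        # Each relation has its own weight matrix.
        out += (msg_sum / deg[:, None]) @ w_rel[r]
    return np.maximum(out, 0.0)  # ReLU

# Toy graph: 3 nodes, 2 relation types (ids are arbitrary for this example).
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edge_index = np.array([[1, 2, 2], [0, 0, 1]])  # edges 1->0, 2->0, 2->1
edge_type = np.array([0, 0, 1])
w_rel = np.stack([np.eye(2), 2.0 * np.eye(2)])
w_self = np.eye(2)
out = rgcn_layer(h, edge_index, edge_type, w_rel, w_self)
```

With a single relation and one shared weight matrix, the loop reduces to an ordinary mean-aggregating graph convolution, which is the sense in which the relational version extends the basic one.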

GraphSAGE and GAT

GNN Approaches Summary

Aspect | GCN | R-GCN | GraphSAGE | GAT
Use Case | Basic convolutional graph-based learning | Handles graphs with multiple edge types | Supports scalable inductive learning for large graphs | When an attention mechanism is needed to weight neighbors by importance
How It Works | Captures local graph structure | Captures multi-relational graph structure | Captures local structure and supports inductive learning | Captures complex relationships with attention
Aggregation | Weighted sum of neighbor features | Weighted sum of neighbor features for each relation type | Sample and aggregate (mean, LSTM, pooling) | Attention-weighted sum of neighbor features
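The aggregation entries in the summary above can be contrasted with a small NumPy sketch of two of them: a GraphSAGE-style mean aggregator [14] and a GAT-style attention-weighted sum [13]. The function names, the single attention head, and the toy inputs are assumptions made for illustration only.

```python
import numpy as np

def mean_aggregate(h, neighbors):
    """GraphSAGE-style mean aggregator over a (possibly sampled) neighbor list."""
    return h[neighbors].mean(axis=0)

def attention_aggregate(h, i, neighbors, w, a, slope=0.2):
    """GAT-style aggregation for node i with one attention head (sketch)."""
    z = h @ w  # shared linear transform of all node features
    # Score each neighbor j from the concatenation [z_i || z_j].
    pairs = np.hstack([np.tile(z[i], (len(neighbors), 1)), z[neighbors]])
    e = pairs @ a
    e = np.where(e > 0, e, slope * e)  # LeakyReLU
    alpha = np.exp(e - e.max())        # numerically stable softmax
    alpha /= alpha.sum()
    return alpha @ z[neighbors], alpha

h = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
neighbors = [1, 2]  # neighbors of node 0

mean_out = mean_aggregate(h, neighbors)
# With a zero attention vector every neighbor gets the same weight,
# so attention falls back to the plain mean.
uniform_out, uniform_alpha = attention_aggregate(h, 0, neighbors, np.eye(2), np.zeros(4))
# A non-zero attention vector lets some neighbors count more than others.
att_out, alpha = attention_aggregate(h, 0, neighbors, np.eye(2), np.array([0.0, 0.0, 1.0, 0.0]))
```

The zero-attention case is included to show that attention generalizes uniform averaging: the learned vector `a` is what allows neighbor importance to vary.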

Natural Language Processing

Methodology

Process Flowchart: Multirelational Twitter Bot Detection using Graph Neural Networks

GitHub:

QR Code for GitHub Repo

Dataset

Dataset

Dataset | Users | Bots | Tweets | Edges
TwiBot-22 [16] | 860057 | 139943 | 86764167 | 170185937
Subset Used | 139943 | 139943 | N/A | 2349098

Relation | Edges
Following | 2626979
Follower | 1116655
Post | 40887365
Like | 595794

Feature Vector

Experiment Set-Ups

Features Extracted from Dataset

  • Categorical Properties:
    • is the user account protected
    • is the user account verified
    • does the user have a default profile image or a custom one
  • Edge Index and Edge type:
    • From user relations
  • Numerical Properties:
    • followers count
    • active days
    • length of the user screen name
    • following count
    • number of tweets posted
  • Description:
    • User bio
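The feature groups listed above could be assembled roughly as follows. This is a hedged sketch: the dictionary keys (`protected`, `followers_count`, and so on), the `RELATION_IDS` mapping, and both function names are hypothetical, and the description embedding is passed in as a ready-made vector rather than computed by a language model. Normalization of the numerical properties is omitted.

```python
import numpy as np

# Hypothetical mapping from relation name to integer edge type.
RELATION_IDS = {"following": 0, "follower": 1}

def user_feature_vector(user, description_embedding):
    """Concatenate categorical, numerical, and description features for one user."""
    categorical = np.array([
        float(user["protected"]),
        float(user["verified"]),
        float(user["default_profile_image"]),
    ])
    numerical = np.array([
        user["followers_count"],
        user["active_days"],
        len(user["screen_name"]),
        user["following_count"],
        user["tweet_count"],
    ], dtype=float)
    return np.concatenate([categorical, numerical, description_embedding])

def build_edges(triples, node_index):
    """Turn (source, relation, target) triples into edge index and edge type arrays."""
    src, dst, etype = [], [], []
    for source, relation, target in triples:
        # Keep only edges between known user nodes with a known relation.
        if source in node_index and target in node_index and relation in RELATION_IDS:
            src.append(node_index[source])
            dst.append(node_index[target])
            etype.append(RELATION_IDS[relation])
    edge_index = np.array([src, dst], dtype=np.int64)
    edge_type = np.array(etype, dtype=np.int64)
    return edge_index, edge_type

alice = {
    "protected": False, "verified": True, "default_profile_image": False,
    "followers_count": 10, "active_days": 100, "screen_name": "alice",
    "following_count": 20, "tweet_count": 7,
}
# Placeholder for an embedding of the user bio (4 dimensions for the example).
vec = user_feature_vector(alice, np.zeros(4))

node_index = {"u1": 0, "u2": 1}
triples = [("u1", "following", "u2"), ("u2", "follower", "u1"), ("u1", "like", "t9")]
edge_index, edge_type = build_edges(triples, node_index)
```

The resulting node feature matrix, edge index, and edge type arrays are the three inputs a relational GNN layer typically expects; the last triple is dropped here because its target is not a user node in this toy index.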

GNN models

Results

Evaluation Metrics
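
The four metrics reported in the following slides (accuracy, precision, recall, and F1-score) have standard definitions in terms of true and false positives and negatives. The sketch below computes them for a binary bot/human labeling; treating "bot" as the positive class (label 1) and the function name `classification_metrics` are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for binary labels (sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels: 1 = bot, 0 = human (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Precision penalizes humans flagged as bots, while recall penalizes bots that go undetected, which is why the takeaways for the two can differ across models and relation sets.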

Accuracy Results

Figures: GraphSAGE Accuracy Results; GAT Accuracy Results; GCN and RGCN Accuracy Results

Takeaways:
  • GraphSAGE performs similarly on all relations.
  • GAT on the interaction relation is the best, followed by RGCN on follower + following.

Precision Results

Figures: GraphSAGE Precision Results; GAT Precision Results; GCN and RGCN Precision Results

Takeaways:
  • GraphSAGE performs similarly on all relations.
  • GAT on the interaction relation is the best, followed by RGCN on follower + following + interaction.

Recall Results

Figures: GraphSAGE Recall Results; GAT Recall Results; GCN and RGCN Recall Results

Takeaways:
  • GAT on the follower + following + interaction relations is the best, followed by RGCN on follower + following + interaction.

F1-Score Results

Figures: GraphSAGE F1-Score Results; GAT F1-Score Results; GCN and RGCN F1-Score Results

Takeaways:
  • GraphSAGE performs similarly on all relations.
  • GAT on the interaction relation is the best, followed by RGCN on follower + following.

Conclusion and Future Work

Conclusion

Future Work

References

References

[1] "Estimating Twitter's bot-free monetizable daily active users," https://www.similarweb.com/blog/insights/social-media-news/twitter-bot-research/, 2022.

[2] A. G. Johansen, "What's a Twitter bot and how to spot one," https://us.norton.com/blog/emerging-threats/what-are-twitter-bots-and-how-to-spot-them

[3] Luceri, et al., "Evolution of bot and human behavior during elections," 2019.

[4] Nizzoli, et al., "Charting the Landscape of Online Cryptocurrency Manipulation," 2020.

[5] K.-C. Yang, E. Ferrara, and F. Menczer, "Botometer 101: Social bot practicum for computational social scientists," Journal of Computational Social Science, vol. 5, no. 2, pp. 1511-1528, 2022.

[6] S. Li, C. Zhao, Q. Li, J. Huang, D. Zhao, and P. Zhu, "BotFinder: A Novel Framework for Social Bots Detection in Online Social Networks Based on Graph Embedding and Community Detection," 10.21203/rs.3.rs-1871702/v1, 2022.

[7] F. Zeng, Y. Sun, and Y. Li, "MRLBot: Multi-Dimensional Representation Learning for Social Media Bot Detection," Electronics 2023, 12, 2298. https://doi.org/10.3390/electronics12102298

[8] S. Alhosseini, R. Bin Tareaf, P. Najafi, and C. Meinel, "Detect Me If You Can: Spam Bot Detection Using Inductive Representation Learning," 10.1145/3308560.3316504, 2019.

[9] S. Feng, H. Wan, N. Wang, J. Li, and M. Luo, "SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detection," in Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM '21), Association for Computing Machinery, New York, NY, USA, pp. 3808-3817, 2021. https://doi.org/10.1145/3459637.3481949

[10] S. Feng, Z. Tan, R. Li, and M. Luo, "Heterogeneity-aware Twitter Bot Detection with Relational Graph Transformers," https://doi.org/10.48550/arXiv.2109.02927

[11] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," CoRR, vol. abs/1609.02907, 2016.

[12] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling, "Modeling relational data with graph convolutional networks," 2017.

[13] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, "Graph attention networks," 2018.

[14] W. L. Hamilton, R. Ying, and J. Leskovec, "Inductive representation learning on large graphs," CoRR, vol. abs/1706.02216, 2017.

[15] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, "RoBERTa: A robustly optimized BERT pretraining approach," CoRR, vol. abs/1907.11692, 2019.

[16] S. Feng, et al., "TwiBot-22: Towards graph-based Twitter bot detection," 2023.

Thank You

Q & A