Event-Based Detection of Provocative Political Discourse on Indonesian Twitter: A Comparative Study of SVM and IndoBERT

Evril Fadrekha Cahyani; Ali Nur Ikhsan; Deuis Nur Astrida

doi:10.63158/journalisi.v8i1.1409

Authors

Evril Fadrekha Cahyani (Indonesia)
Ali Nur Ikhsan (Indonesia)
Deuis Nur Astrida (Indonesia)

Keywords:

Provocative discourse, Political polarization, IndoBERT, Support Vector Machine, Event-based Twitter analysis

Abstract

Political polarization on Indonesian social media intensified during the August 2025 House of Representatives (DPR) demonstrations, when provocative and sarcastic tweets helped amplify institutional criticism and widen public conflict. This study examines event-based automatic detection of provocative political discourse by comparing a feature-based Support Vector Machine (SVM) classifier with a transformer-based IndoBERT model on a large-scale Indonesian Twitter (X) corpus collected from 15 August to 15 September 2025. Tweets were preprocessed and labeled with a rule-based proxy lexicon to distinguish provocative from neutral content; both models were then trained and evaluated under the same experimental setting. Results show that SVM is highly effective at recognizing explicit provocation expressed through repetitive, lexically salient slogans, whereas IndoBERT provides more stable detection of implicit and context-dependent provocation, including the irony and sarcasm that are common in Indonesian online political talk. In addition, temporal exploration shows sharp spikes in tweet volume that align with key offline protest moments, suggesting a close coupling between street-level mobilization and digital discourse dynamics. Overall, the findings support the use of contextual NLP models within event-centered social media analysis to strengthen scalable monitoring of polarization and to inform early-warning approaches for escalating conflict in Indonesia’s digital public sphere.
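
The abstract mentions two steps that can be illustrated briefly: labeling tweets with a rule-based proxy lexicon, and exploring tweet volume over time. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The lexicon terms, the sample tweets, the function names (`label_tweet`, `daily_volume`, `spike_days`), and the "twice the mean" spike threshold are all hypothetical.

```python
from collections import Counter
from datetime import date
import re

# Hypothetical lexicon: illustrative terms only, not the study's actual list.
PROVOCATIVE_TERMS = {"bubarkan", "lawan", "bakar"}


def label_tweet(text: str) -> str:
    """Return 'provocative' if any lexicon term appears as a token, else 'neutral'."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return "provocative" if tokens & PROVOCATIVE_TERMS else "neutral"


def daily_volume(tweets):
    """Count tweets per calendar day from (date, text) pairs."""
    return Counter(day for day, _ in tweets)


def spike_days(volume, factor=2.0):
    """Flag days whose count exceeds `factor` times the mean daily count.

    The threshold rule is an assumption chosen for illustration.
    """
    mean = sum(volume.values()) / len(volume)
    return sorted(day for day, n in volume.items() if n > factor * mean)


# Synthetic toy data; dates and texts are invented for this example.
tweets = [
    (date(2025, 8, 24), "Rapat DPR hari ini membahas anggaran"),
    (date(2025, 8, 25), "Bubarkan DPR sekarang!"),
    (date(2025, 8, 25), "Lawan terus sampai menang"),
    (date(2025, 8, 25), "Bubarkan saja semuanya"),
    (date(2025, 8, 25), "Jalanan macet di sekitar gedung DPR"),
    (date(2025, 8, 25), "Lawan! Lawan! Lawan!"),
    (date(2025, 8, 26), "Sidang ditunda sampai minggu depan"),
    (date(2025, 8, 27), "Cuaca Jakarta cerah pagi ini"),
]

labels = [label_tweet(text) for _, text in tweets]
volume = daily_volume(tweets)
spikes = spike_days(volume)
```

A token-match rule of this kind only captures explicit slogans. It would label an ironic tweet that contains no lexicon term as neutral, which is the kind of implicit provocation the abstract attributes to the contextual model.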
