Cyberbullying Detection in Indonesian TikTok Comments Using IndoBERT with Fairness Evaluation

Authors

  • Hanik Dewi Jayanti (Indonesia)
  • Abdul Rohman (Indonesia)
DOI:

https://doi.org/10.63158/journalisi.v8i1.1448

Keywords:

Content Moderation, Cyberbullying Detection, Fairness Evaluation, IndoBERT, TikTok Comments

Abstract

This study investigates automated cyberbullying detection on TikTok in the Indonesian digital context, where high social media use among children and adolescents demands scalable and consistent content moderation. We propose an IndoBERT-based framework for detecting and classifying cyberbullying in Indonesian-language TikTok comments that incorporates algorithmic fairness considerations. A dataset of 2,122 TikTok comments was obtained from a publicly available Kaggle repository and divided into training, validation, and test sets by stratified sampling in a 70:15:15 ratio. The IndoBERT-base-p1 model was fine-tuned using PyTorch and the HuggingFace framework, with the AdamW optimizer and learning rate scheduling. Experimental results show that the model achieved an accuracy of 70.66% and a ROC-AUC of 0.7969, well above the 0.5 chance level. With a macro F1-score of 0.7066 and a cyberbullying recall of 0.7170, the model shows balanced performance in identifying harmful content. A key contribution of this study is a fairness evaluation framework, which reveals an accuracy gap of 2.08% and an equal opportunity gap of 0.0208, suggesting broadly comparable performance across the evaluated groups; demographic parity, however, remains a concern. The system is intended for content triage combined with human review: it filters out non-cyberbullying cases and flags potentially harmful content for human oversight.
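
The three fairness quantities named in the abstract (accuracy gap, equal opportunity gap, demographic parity) can be illustrated with a minimal sketch. The abstract does not state how groups were defined or how the gaps were computed, so everything here is an assumption: each gap is taken as the largest minus the smallest per-group value, the group labels `"a"` and `"b"` are hypothetical, and the toy labels and predictions are invented for illustration and are not the study's data.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group accuracy, true-positive rate, and positive-prediction rate."""
    buckets = defaultdict(list)
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g].append((t, p))

    rates = {}
    for g, pairs in buckets.items():
        n = len(pairs)
        positives = [(t, p) for t, p in pairs if t == 1]
        rates[g] = {
            # Share of comments in the group that were classified correctly.
            "accuracy": sum(t == p for t, p in pairs) / n,
            # Share of true cyberbullying comments that were flagged.
            "tpr": (sum(p == 1 for _, p in positives) / len(positives)
                    if positives else float("nan")),
            # Share of all comments in the group that were flagged.
            "positive_rate": sum(p == 1 for _, p in pairs) / n,
        }
    return rates


def fairness_gaps(y_true, y_pred, groups):
    """Assumed definition: gap = max - min of each per-group rate."""
    rates = group_rates(y_true, y_pred, groups)

    def gap(key):
        values = [r[key] for r in rates.values()]
        return max(values) - min(values)

    return {
        "accuracy_gap": gap("accuracy"),
        "equal_opportunity_gap": gap("tpr"),
        "demographic_parity_gap": gap("positive_rate"),
    }


# Toy example (invented): 1 = cyberbullying, 0 = non-cyberbullying.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gaps = fairness_gaps(y_true, y_pred, groups)
print(gaps)
```

In this toy case both groups have the same accuracy, yet group `"b"` is flagged three times as often as group `"a"`. A small accuracy gap can therefore coexist with a large demographic parity gap, which is the kind of pattern the abstract describes.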

Published

2026-03-02

Section

Articles

How to Cite

[1] H. D. Jayanti and A. Rohman, “Cyberbullying Detection in Indonesian TikTok Comments Using IndoBERT with Fairness Evaluation,” journalisi, vol. 8, no. 1, pp. 907–927, Mar. 2026, doi: 10.63158/journalisi.v8i1.1448.
