Virtual Guard: AI-Driven Defense Against Harmful Digital Content
DOI: https://doi.org/10.71143/hqmhkt90

Abstract
The exponential growth of social media platforms has produced a surge of user-generated content, necessitating robust moderation mechanisms to filter harmful and sensitive material, including hate speech, misinformation, and explicit content. Manual moderation is no longer scalable due to its inefficiency and the psychological toll it takes on human reviewers. This paper explores Artificial Intelligence (AI)-based content moderation techniques and their effectiveness, examining the types of sensitive content, their impact on victims, and loopholes in current systems. The study highlights challenges such as adversarial attacks, bias in AI models, scalability, and privacy concerns, and references real-life incidents to validate the need for stronger AI moderation while outlining areas that remain under-researched. Finally, we propose solutions including adversarial training, cross-lingual transformers, explainable AI (XAI), and federated learning for privacy-preserving moderation.
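To make the adversarial-training idea mentioned above concrete, the toy sketch below shows how a naive keyword filter is evaded by character-substitution attacks ("h@te" for "hate"), and how augmenting the filter with perturbed variants of known terms, a simplified stand-in for adversarial training, restores detection. All function names, the substitution table, and the seed terms are illustrative assumptions, not taken from the paper.

```python
# Toy illustration of adversarial evasion and adversarial augmentation.
# Character substitutions an attacker might use to obfuscate flagged words.
SUBSTITUTIONS = {"a": "@", "e": "3", "i": "1", "o": "0"}

def perturb(word):
    """Generate simple leetspeak variants of a word (illustrative only)."""
    variants = {word}
    for plain, obf in SUBSTITUTIONS.items():
        variants |= {v.replace(plain, obf) for v in variants}
    return variants

def build_blocklist(seed_terms, adversarial=False):
    """Plain blocklist, optionally augmented with perturbed variants."""
    terms = set(seed_terms)
    if adversarial:
        for term in seed_terms:
            terms |= perturb(term)
    return terms

def is_flagged(text, blocklist):
    """Flag text if it contains any blocked term (case-insensitive)."""
    return any(term in text.lower() for term in blocklist)

naive = build_blocklist(["hate"])
hardened = build_blocklist(["hate"], adversarial=True)

print(is_flagged("I h@te this group", naive))     # False: obfuscation evades the filter
print(is_flagged("I h@te this group", hardened))  # True: perturbed variant is caught
```

Real adversarial training perturbs inputs during model training (e.g. synonym swaps or embedding-space perturbations for a neural classifier) rather than expanding a keyword list, but the underlying principle is the same: expose the system to the attack distribution before deployment.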
Copyright (c) 2025 International Journal of Research and Review in Applied Science, Humanities, and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.