MindsDB: Kick Against CyberBullying

The inspiration behind this project was deeply personal and infused with emotion. During my high school years, I endured the pain of being a victim of bullying. My small stature made me an easy target, and I've heard nearly every derogatory term thrown my way. As I transitioned to university, I felt a sense of liberation, believing I had triumphed over those tormentors.
However, a new menace emerged – cyberbullying. It's disheartening that not only I but countless others still grapple with the effects of bullies exploiting social media on platforms like Telegram to inflict harm, humiliation, and suffering.

Given the internet's intrinsic nature, it appears that escaping this form of bullying isn't as straightforward as completing high school. Driven by this, I endeavored to create a bot that could take a stand against cyberbullying. The concept is elegantly simple. Employing MindsDB, I trained a model on paired text-sentiment data. This model empowers the bot to predict the presence of cyberbullying in incoming messages. Each message undergoes analysis via the Machine Learning model.

Through the implementation of KACyB, my objective is to provide a preemptive solution to the pervasive problem of cyberbullying. Capitalizing on the capabilities of artificial intelligence and sentiment analysis, the bot swiftly identifies potentially harmful content on Telegram and intervenes proactively. This approach not only empowers potential victims but also fosters a culture of respectful online communication. With KACyB, I aim to foster a safer and more inclusive digital environment, particularly within the context of Telegram's platform.

Bots (short for robots) are software applications or scripts that execute automated tasks. These tasks vary from simple, repetitive actions to complex, intelligent interactions. Bots can be designed to operate across various platforms, including websites, messaging apps, social media platforms, and more.

In this article, we will create a bot on the social platform Telegram to address cyberbullying. Telegram serves as both a messaging and social media platform, offering features such as message transmission, group interactions, voice and video calls, and more. The purpose of the bot is to monitor group messages and take action against group members engaging in cyberbullying. This bot aims to foster constructive conversations and assist group admins in managing their communities. You can view an example of the bot's functionality here.

Prerequisites

To follow this tutorial effectively, you should have the following:

Proficiency in Python programming.
Basic understanding of Machine Learning algorithms.
Python installed on your computer.
A text editor - Visual Studio Code is recommended.

Libraries Used

Throughout this tutorial, we will use several Python libraries and packages, the main ones being mindsdb-sdk, python-telegram-bot, scikit-learn, sqlalchemy, pandas, numpy, and nltk. mindsdb-sdk lets you connect to a MindsDB server from Python using the HTTP API. MindsDB as a whole abstracts LLMs, time series, regression, and classification models as virtual tables (AI-Tables). python-telegram-bot is a Python wrapper for the Telegram Bot API that simplifies interacting with the API from Python code. SQLAlchemy will be used to create a basic database that tracks the members of each group chat and the number of cyberbullying-related messages they have sent. pandas, nltk, scikit-learn, and numpy will be used to construct a machine learning model that identifies and predicts cyberbullying-related messages.

Installing Packages

To install the required packages, execute the following command in your terminal:

pip install -r requirements.txt

This step installs the project's requirements, which are simply the packages listed above, saved in a requirements.txt file.

Building the Telegram Bot

The python-telegram-bot package offers a range of high-level classes designed to streamline bot development. This package comprises two primary submodules: the pure Python telegram module (useful for fetching updates and sending messages) and the telegram.ext module, which provides a plethora of built-in objects and classes to simplify your work.

To construct the Telegram bot, follow these outlined steps:

Creating the Bot and Obtaining the API Token

Your initial task involves crafting a bot and acquiring its API Token. BotFather serves as the creator and overseer of all Telegram bots. Through BotFather, you can generate and manage your Telegram bots and their corresponding tokens.

To interact with BotFather, search for "BotFather" within the Telegram app.

Once you select "BotFather," a chat interface will open, facilitating interaction with the bot. Initiate the conversation by clicking on Start.

Input /newbot to begin the bot creation process. BotFather will prompt you for a name and a unique username for your new bot.

Subsequently, you'll receive a celebratory message confirming the successful creation of your Telegram bot. This message will include a link for accessing your bot and, crucially, the bot token. Think of the token as a password you'll use to instruct your bot. It's essential to safeguard your token, as it grants control over your bot to anyone possessing it.
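
One common way to safeguard the token is to keep it out of the source code and read it from an environment variable at startup. The snippet below is a minimal sketch of that idea. The variable name TELEGRAM_BOT_TOKEN and the helper load_token are assumptions made for illustration, not something BotFather or python-telegram-bot requires.

```python
import os


def load_token(env_var: str = "TELEGRAM_BOT_TOKEN") -> str:
    """Return the bot token stored in the given environment variable.

    The variable name is an assumption for this sketch; any name works
    as long as it matches what you export in your shell.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of starting a broken bot
        raise RuntimeError(f"Set the {env_var} environment variable to your bot token.")
    return token
```

With this in place, bot.py could assign TOKEN = load_token() before building the application, so the token never appears in version control.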

Creating the Application and Implementing the Start Handler

The application is the foundation of the bot. Build it with your token, register a CommandHandler that maps the /start command to the start function, and then run the application using the run_polling() function. The parameter filters=~filters.ChatType.GROUPS ensures that this handler ignores updates originating from group chats.

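from telegram.ext import Application, CommandHandler, filters

if __name__ == "__main__":
    # TOKEN is the token from BotFather; start is the /start callback function
    application = Application.builder().token(TOKEN).build()
    # Respond to /start everywhere except in group chats
    start_handler = CommandHandler("start", start, filters=~filters.ChatType.GROUPS)
    application.add_handler(start_handler)
    application.run_polling()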
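
The start function referenced by the handler is not shown above. A minimal sketch of what it could look like follows, assuming the asynchronous callback style used by recent python-telegram-bot versions, where a handler receives an update and a context. The welcome text is a placeholder chosen for this example.

```python
# Placeholder wording for this sketch; replace it with your own message
WELCOME_TEXT = (
    "Hi! Add me to a group and make me an admin, "
    "and I will watch for cyberbullying messages."
)


async def start(update, context):
    """Reply to the /start command with a short welcome message."""
    # update.message is the incoming /start message; reply in the same chat
    await update.message.reply_text(WELCOME_TEXT)
```

Because the handler is registered with filters=~filters.ChatType.GROUPS, this reply is only sent in private chats with the bot.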

Registering the Start Command with BotFather

Though not obligatory, it proves advantageous to notify BotFather that your bot recognizes the "/start" command. This action generates a command menu displaying the acknowledged commands. Employ the "/setcommands" command for this purpose.

Testing Your Bot

Execute the Python file and ascertain the proper functioning of the start command.

Implementing the Cyberbullying Messages Handler

The cyberbullying handler processes messages exchanged within group chats. Its role is to evaluate whether a message constitutes cyberbullying. Upon identifying such content, the handler logs both the user_id and groupchat_id in a database. Additionally, the no_bullying column maintains a tally of the cyberbullying messages a user has sent within the group chat. Once the no_bullying count reaches a predetermined threshold (in this instance, let's assume 3), the user is temporarily banned from the group (with the ban lasting for 3 days).
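
The flow just described can be sketched as a small factory that builds the handler from four helper functions. The helper names match the functions introduced later in this article (is_cyberbullying, add_to_db, has_hit_limit, reset_user_record), but passing them in as arguments, the factory name make_bullying_handler, and the BAN_DURATION constant are assumptions made to keep the example self-contained. Treat it as one possible arrangement rather than the definitive handler.

```python
from datetime import datetime, timedelta, timezone

BAN_DURATION = timedelta(days=3)  # ban length described in the text


def make_bullying_handler(is_cyberbullying, add_to_db, has_hit_limit, reset_user_record):
    """Build the group-message handler from the four helper functions."""

    async def handle_group_message(update, context):
        message = update.message
        if message is None or not message.text:
            return  # nothing to analyse (for example, a photo or a join event)
        if not is_cyberbullying(message.text):
            return  # harmless message, no action needed

        user_id = message.from_user.id
        chat_id = message.chat.id
        add_to_db(user_id, chat_id)  # record one more offending message

        if has_hit_limit(user_id, chat_id):
            # Temporarily ban the user, then start counting afresh
            until = datetime.now(timezone.utc) + BAN_DURATION
            await context.bot.ban_chat_member(chat_id, user_id, until_date=until)
            reset_user_record(user_id, chat_id)

    return handle_group_message
```

The resulting coroutine can then be registered for group text messages. Note that the bot must be a group admin with permission to ban users for ban_chat_member to succeed.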

In the bot.py file, a function is introduced to detect cyberbullying:

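import os

import mindsdb_sdk
import pandas as pd

# Read the MindsDB credentials from environment variables rather than hardcoding them.
# The variable names MINDSDB_EMAIL and MINDSDB_PASSWORD are assumptions; use any names you like.
server = mindsdb_sdk.connect(
    'https://cloud.mindsdb.com',
    login=os.environ['MINDSDB_EMAIL'],
    password=os.environ['MINDSDB_PASSWORD'],
)
project = server.get_project("mindsdb")
model = project.list_models()[0] # Selecting the model to use; index 0 for the first model in the list

# Check if text contains cyberbullying
def is_cyberbullying(text):
    # Build a one-row DataFrame. The column name must match the feature column
    # the model was trained on; "Text" is an assumption here.
    data = pd.DataFrame({"Text": [text]})
    prediction = model.predict(data)
    # A label of 1 marks the text as cyberbullying
    return bool(prediction["oh_label"].loc[0] == 1)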

This function relies on a MindsDB model trained on a dataset obtained from Kaggle.

The code in bot.py is tailored to the cyberbullying machine learning model: the function accepts textual input and returns True if the text is deemed cyberbullying, and False otherwise. This encapsulates the fundamental logic governing the app.

Adding a Database

SQL (Structured Query Language) is a specialized language used to manage and manipulate relational databases. It enables users to interact with databases for tasks like querying data, inserting, updating, and deleting records, and creating or altering database structures. SQLAlchemy is a Python library designed to facilitate the management and interaction with SQL databases using Python code.

Using SQLAlchemy

Begin by creating a file named test_sql.py in your project folder. This file will serve as a playground to create a basic SQL database and learn how to perform CRUD (Create, Read, Update, Delete) operations using SQLAlchemy. Follow these steps within the test_sql.py file.

Importing SQLAlchemy:

import sqlalchemy as db
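
With the import in place, a full create, read, update, delete round trip in test_sql.py might look like the sketch below. It assumes SQLAlchemy 1.4 or newer and uses an in-memory SQLite database so nothing is written to disk. The notes table and its columns are throwaway names invented for this playground.

```python
import sqlalchemy as db

# An in-memory SQLite database keeps the experiment self-contained
engine = db.create_engine("sqlite:///:memory:")
metadata = db.MetaData()

# "notes" is a throwaway table used only for this playground
notes = db.Table(
    "notes", metadata,
    db.Column("id", db.Integer, primary_key=True),
    db.Column("body", db.String),
)
metadata.create_all(engine)

# engine.begin() opens a transaction and commits it when the block exits
with engine.begin() as conn:
    # Create: insert one row
    conn.execute(db.insert(notes).values(body="first note"))
    # Read: fetch every row
    rows = conn.execute(db.select(notes)).fetchall()
    # Update: change the body of the row we just inserted
    conn.execute(db.update(notes).where(notes.c.id == rows[0].id).values(body="edited note"))
    updated = conn.execute(db.select(notes.c.body)).scalar()
    # Delete: remove the row again
    conn.execute(db.delete(notes).where(notes.c.id == rows[0].id))
    remaining = conn.execute(db.select(notes)).fetchall()

print(updated)
print(len(remaining))
```

The same four statement types (insert, select, update, delete) are what the bot's database functions rely on.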

Building the Database

Create a new Python file named database.py. This file will house the functions that enable the bot to communicate with the database.
Then establish the group members' table:

import sqlalchemy as db

# Create an engine
engine = db.create_engine('sqlite:///database.sqlite')

# Connect to the engine
conn = engine.connect()

# Define metadata
metadata = db.MetaData()

GroupMembers = db.Table('GroupMembers', metadata,
                        db.Column('Id', db.Integer(), primary_key=True),
                        db.Column('user_id', db.Integer),
                        db.Column('groupchat_id', db.Integer),
                        db.Column('no_bullying', db.Integer),
                        )

metadata.create_all(engine)

Refactor the previous functions to:

Add a user_id and groupchat_id to the database
Check if a user has exceeded the cyberbullying threshold
Reset the count of bullying messages sent by the user to zero

All of these operations will be conducted using SQLAlchemy.

MAX_BULLYING_MESSAGES = 3

def add_to_db(user_id, groupchat_id):
    # Condition matching this user's row for this group chat
    match = db.and_(GroupMembers.columns.user_id == user_id, GroupMembers.columns.groupchat_id == groupchat_id)
    # engine.begin() opens a transaction and commits it when the block exits,
    # so the changes are persisted on SQLAlchemy 1.4 and 2.0 alike
    with engine.begin() as conn:
        # Check if user_id and groupchat_id are present in the database
        result = conn.execute(GroupMembers.select().where(match)).fetchone()
        if result:
            # Increment the no_bullying value by 1
            update_query = GroupMembers.update().where(match).values(no_bullying=result.no_bullying + 1)
            conn.execute(update_query)
        else:
            # Add the user
            insert_query = db.insert(GroupMembers).values(user_id=user_id, groupchat_id=groupchat_id, no_bullying=1)
            conn.execute(insert_query)

def has_hit_limit(user_id, groupchat_id):
    """ Check if the user has reached the cyberbullying
    limit for that group chat """
    # Search for the user
    search_query = GroupMembers.select().where(db.and_(GroupMembers.columns.user_id == user_id, GroupMembers.columns.groupchat_id == groupchat_id))
    with engine.connect() as conn:
        result = conn.execute(search_query).fetchone()
    # A user with no record has not sent any flagged messages yet
    return result is not None and result.no_bullying >= MAX_BULLYING_MESSAGES

def reset_user_record(user_id, groupchat_id):
    # Set the user's count of bullying messages back to zero
    update_query = GroupMembers.update().where(db.and_(GroupMembers.columns.user_id == user_id, GroupMembers.columns.groupchat_id == groupchat_id)).values(no_bullying=0)
    with engine.begin() as conn:
        conn.execute(update_query)

To conclude this step, remove the previous functions from the bot.py file and import them instead from the database.py file.