Building Text Toxicity Classifier using Lyzr Automata
harshit-lyzr
Posted on March 28, 2024
The internet has revolutionized communication, fostering connection and knowledge sharing. However, this digital landscape can also breed negativity. Online toxicity - hateful comments, insults, and harassment - can create a hostile environment, discouraging meaningful discourse and pushing valuable voices away.
This blog post will guide you through building a Streamlit app that utilizes the power of Lyzr Automata and OpenAI to identify potentially toxic comments.
Why Text Classification Matters
Text classification tools can play a vital role in mitigating online toxicity. By identifying potentially harmful comments, these tools empower platforms and individuals to:
Promote respectful dialogue: By flagging toxic comments, platforms can encourage users to reconsider their language and foster more civil discussions.
Protect users from harassment: Identifying toxic comments allows for intervention and the protection of users from targeted negativity.
Create a safer online space: By filtering out toxic content, online communities can become more welcoming and inclusive for everyone.
Building Your Text Classifier with Lyzr Automata
Lyzr Automata provides a user-friendly platform to build AI-powered applications. In this case, we'll utilize Lyzr Automata alongside OpenAI's powerful language models to create a Streamlit application for text toxicity classification.
Libraries and API Key:
import streamlit as st
from lyzr_automata.ai_models.openai import OpenAIModel
from lyzr_automata import Agent, Task
from lyzr_automata.pipelines.linear_sync_pipeline import LinearSyncPipeline
from PIL import Image
from dotenv import load_dotenv
import os
load_dotenv()
api = os.getenv("OPENAI_API_KEY")
The code imports necessary libraries like streamlit, lyzr_automata, PIL (for image processing), and dotenv (for environment variables).
load_dotenv() loads the OpenAI API key stored in a separate .env file (remember to create a .env file and store your API key securely!).
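For reference, the .env file only needs a single entry (the value below is a placeholder, not a real key):
OPENAI_API_KEY=your-openai-api-key-here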
User Input:
query = st.text_input("Enter your comment: ")
st.text_input("Enter your comment: ") creates a text box where users can enter the comment they want to classify.
OpenAIModel() from lyzr_automata is used to define the OpenAI model to be used. Here, it's set to "gpt-4-turbo-preview" with specific parameters like temperature and maximum tokens.
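The model definition itself isn't shown in the snippet above. A minimal sketch consistent with the description (the exact temperature and max_tokens values here are assumptions, not the author's settings):
# Define the OpenAI model used by the pipeline; the parameter values below
# are illustrative and can be tuned as needed.
open_ai_text_completion_model = OpenAIModel(
    api_key=api,
    parameters={
        "model": "gpt-4-turbo-preview",
        "temperature": 0.2,
        "max_tokens": 1500,
    },
)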
Toxicity Classification Function:
def toxicity_classifier(query):
    toxicity_agent = Agent(
        role="toxicity expert",
        prompt_persona=f"You are an expert toxicity finder. Your task is to find whether the {query} is toxic or not.",
    )
    prompt = f"""You are a toxicity classifier and you have to find whether {query} is toxic or not.
[!IMPORTANT] ONLY ANSWER WHETHER THE SENTENCE IS TOXIC OR NOT. Nothing else apart from this."""
    toxicity_task = Task(
        name="Toxicity Classifier",
        model=open_ai_text_completion_model,
        agent=toxicity_agent,
        instructions=prompt,
    )
    output = LinearSyncPipeline(
        name="Toxicity Pipeline",
        completion_message="pipeline completed",
        tasks=[
            toxicity_task,
        ],
    ).run()
    answer = output[0]['task_output']
    return answer
toxicity_classifier(query) defines a function that takes the user's input (query) as an argument.
An Agent is created within the function, specifying its role as a "toxicity expert" and its purpose of identifying toxicity in the provided text.
The prompt_persona provides context to the model.
A clear prompt is defined as an f-string. It emphasizes that the model should only classify the text as toxic or non-toxic, avoiding any additional information.
A Task is created using Task(), specifying the task name, the OpenAI model, the agent, and the instructions (prompt) for the model.
A LinearSyncPipeline is created using LinearSyncPipeline(). This pipeline executes the defined tasks sequentially.
The pipeline is named "Toxicity Pipeline" with a completion message.
The list of tasks includes the previously defined toxicity_task.
.run() executes the pipeline, triggering the model's analysis.
The function retrieves the task's output (output[0]['task_output']) which contains the model's classification (toxic or non-toxic).
Finally, the function returns the retrieved answer.
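If you want to sanity-check the function outside the Streamlit UI (for example, from a separate script or a Python shell), you can call it directly. The example comments below are made up, and the model's exact wording may vary:
# Quick standalone check (example inputs are illustrative; run these outside
# the Streamlit app so they don't execute on every page rerun).
print(toxicity_classifier("Have a wonderful day!"))   # expected: not toxic
print(toxicity_classifier("Nobody wants you here."))  # expected: toxic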
Run Button and Output:
if st.button("Solve"):
    solution = toxicity_classifier(query)
    st.markdown(solution)
st.button("Solve") creates a button labeled "Solve." Clicking this button triggers the code below it.
If the button is pressed, the solution variable stores the result of the toxicity_classifier function, passing the user's input (query).
st.markdown(solution) displays the model's classification (toxic or non-toxic) on the Streamlit app using the markdown function.
This code demonstrates how to combine Streamlit, Lyzr Automata, and OpenAI into a user-friendly text toxicity classifier. With it, users can quickly flag potentially toxic comments, helping foster a more positive online environment.
Try it now: https://lyzr-toxicity-classifier.streamlit.app/
For more information, explore the Lyzr website.
Lyzr is the simplest agent framework to help customers build and launch Generative AI apps faster. It is a low-code agent framework that follows an ‘agentic’ way of building LLM apps, in contrast to LangChain’s ‘functions and chains’ approach and DSPy’s ‘programmatic’ approach.