Michael Wahl
Posted on July 2, 2023
Using OpenAI’s ChatGPT, we can train a language model on our own local/custom data, so that it is scoped toward our own needs or use cases.
I am using macOS, but you can also use Windows or Linux.
Install Python
You need to make sure you have Python 3 installed. If you do not, download the Python installer for your operating system. To verify that you have the correct version of Python installed, open a terminal and run:
python3 --version
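If you would rather check from inside Python itself, here is a minimal sketch. The helper name python_is_recent_enough is hypothetical, and the (3, 0) minimum only mirrors the requirement stated above; the libraries installed later may well need a newer release.

```python
import sys


def python_is_recent_enough(version_info=sys.version_info, minimum=(3, 0)):
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.11.4 (main, ...)"
    print("Running Python", sys.version.split()[0])
    print("OK" if python_is_recent_enough() else "Please upgrade Python")
```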
Upgrade PIP
python3 -m pip install -U pip
Installing Libraries
pip3 install openai
pip3 install gpt_index==0.4.24
pip3 install PyPDF2
pip3 install gradio
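To confirm that the four installs above landed in the Python you are actually running, something like the following sketch may help. It only uses the standard library; the names REQUIRED and installed_versions are made up for this example, and I am assuming each package can be looked up under the same name that was passed to pip.

```python
from importlib import metadata

# Package names as passed to pip above (assumed to match the installed names)
REQUIRED = ["openai", "gpt_index", "PyPDF2", "gradio"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any line that prints NOT INSTALLED points at the pip3 command to run again.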
Prep Data
Create a new directory named ‘docs’ and put PDF, TXT or CSV files inside it. The script below loads it by the relative path "docs", so keep it in the directory you will run the script from. You can add multiple files if you like, but remember that the more data you add, the more tokens will be used. Free accounts are given $18 worth of tokens to use.
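Since token usage grows with the amount of data, it can be handy to get a rough idea of the size of your ‘docs’ folder before running anything. The sketch below is only an approximation: it assumes about 4 characters per token (a common rule of thumb for English text, not an exact count), it only reads TXT and CSV files, and the function name estimate_tokens is hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumed rule of thumb for English text, not exact


def estimate_tokens(directory, suffixes=(".txt", ".csv")):
    """Roughly estimate the tokens in the plain-text files of a directory."""
    total_chars = 0
    for path in Path(directory).iterdir():
        # PDFs and other binary formats are skipped in this sketch
        if path.is_file() and path.suffix.lower() in suffixes:
            total_chars += len(path.read_text(encoding="utf-8", errors="ignore"))
    return total_chars // CHARS_PER_TOKEN


if __name__ == "__main__":
    docs = Path("docs")
    if docs.is_dir():
        print("Estimated tokens in docs:", estimate_tokens(docs))
```

Treat the number as a lower bound, since every question you ask later also consumes tokens.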
Script
from gpt_index import SimpleDirectoryReader, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain import OpenAI
import gradio as gr
import os

# Replace the placeholder with your own OpenAI API key.
os.environ["OPENAI_API_KEY"] = 'ApiGoesHere'


def construct_index(directory_path):
    max_input_size = 4096    # maximum input size for the model, in tokens
    num_outputs = 512        # maximum tokens in each response
    max_chunk_overlap = 20   # tokens shared between neighboring chunks
    chunk_size_limit = 600   # maximum tokens per document chunk

    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.7, model_name="text-davinci-003", max_tokens=num_outputs))

    # Load the files in the directory, index them, and save the index to disk.
    documents = SimpleDirectoryReader(directory_path).load_data()
    index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index.save_to_disk('index.json')
    return index


def chatbot(input_text):
    # Reload the saved index and answer the question from it.
    index = GPTSimpleVectorIndex.load_from_disk('index.json')
    response = index.query(input_text, response_mode="compact")
    return response.response


iface = gr.Interface(fn=chatbot,
                     inputs=gr.Textbox(lines=7, label="Enter your text"),
                     outputs="text",
                     title="My AI Chatbot")

index = construct_index("docs")
iface.launch(share=True)
Save as app.py
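The chunk_size_limit and max_chunk_overlap values in the script control how your documents are cut into pieces before they are indexed. To give a feel for what that means, here is a minimal sketch of splitting with overlap. It is not gpt_index's actual splitter: it works on a plain list of words instead of real model tokens, and the function name split_with_overlap is hypothetical.

```python
def split_with_overlap(words, chunk_size, overlap):
    """Split a list into chunks of at most chunk_size items.

    Each chunk after the first starts `overlap` items before the end of
    the previous chunk, so neighboring chunks share some context.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the input
    return chunks


if __name__ == "__main__":
    words = ("lorem ipsum " * 750).split()  # 1500 stand-in "tokens"
    chunks = split_with_overlap(words, chunk_size=600, overlap=20)
    print(len(chunks), "chunks with sizes", [len(c) for c in chunks])
```

A larger overlap keeps more shared context between neighboring chunks, at the cost of indexing more text overall.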
Open a terminal and run the following command:
python3 app.py
This will start the "training" step, which builds an index of your documents rather than modifying the model itself. It might take some time, depending on how much data you have fed to it. Once done, it will output a link where you can test the responses using a simple UI. The local URL looks like this: http://127.0.0.1:7860
You can open this in any browser and start testing your custom-trained chatbot. The port number above might be different for you.
To train on more or different data, stop the script with CTRL + C, change the files, and then run the Python file again.
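As written, the script rebuilds the index on every run, even when the files have not changed. One possible way to avoid that is to fingerprint the ‘docs’ folder and rebuild only when the fingerprint changes. This is just a sketch under my own assumptions: docs_fingerprint, needs_rebuild and the stamp file name docs.sha256 are hypothetical and not part of gpt_index.

```python
import hashlib
from pathlib import Path


def docs_fingerprint(directory):
    """Return a SHA-256 hex digest over the names and contents of all files."""
    digest = hashlib.sha256()
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(directory).as_posix().encode("utf-8"))
            digest.update(b"\0")  # separator between file name and content
            digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_rebuild(directory, stamp_file):
    """Return True if the folder changed since the last call.

    Side effect: writes the new fingerprint to stamp_file when it differs.
    Keep stamp_file outside the docs folder so it does not affect the hash.
    """
    stamp = Path(stamp_file)
    current = docs_fingerprint(directory)
    previous = stamp.read_text(encoding="utf-8").strip() if stamp.exists() else None
    if current == previous:
        return False
    stamp.write_text(current, encoding="utf-8")
    return True


if __name__ == "__main__":
    if Path("docs").is_dir():
        print("Rebuild needed:", needs_rebuild("docs", "docs.sha256"))
```

In app.py you could then call construct_index("docs") only when needs_rebuild returns True or index.json does not exist yet.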
If this article was helpful, maybe consider a clap or follow me back.