Building a Document QA with Streamlit & OpenAI

CyprianTinasheAarons

Posted on October 15, 2024

What is Streamlit? 🚀

Streamlit is an open-source Python framework for data scientists and AI/ML engineers to deliver dynamic data apps with only a few lines of code.

Streamlit is a good fit for AI engineers who want to quickly demo an idea or build a proof-of-concept project.

Its documentation is clear and easy for any developer to pick up. 📈


Some Fundamentals Before We Dive into Our Project 🧩

Installation 🛠️

To install Streamlit, we can run the following command:

pip install streamlit

To test if we have installed it successfully, we run the following:

streamlit hello

Once we have written our application script, e.g. streamlit_script.py, we can run it with the following command:

streamlit run <streamlit_script.py>

Displaying Text or Diagrams 📝

Using st.write we can display information in our app:

st.write("hello world")

Text Elements ✍️

We can display strings in different formats, e.g., markdown, title, header, and subheader:

st.markdown("*Streamlit* is **really** ***cool***.")

Widgets 🎛️

Streamlit has many widgets, including buttons, select boxes, and checkboxes:

st.button("Click me")

Layout 🖼️

We can work with sidebars, columns, and expanders. For example, st.sidebar will show a sidebar on our app interface:

st.sidebar.write("I am a sidebar")

👉 Going through the Streamlit docs and cheat sheet will quickly bring you up to speed on the full syntax.

Hosting a Streamlit App

Hosting a Streamlit app is straightforward with Streamlit Community Cloud, which deploys an app directly from a GitHub repository.


Prerequisites 📋

  1. You are a Python developer.
  2. You have a basic understanding of Gen AI and LLMs such as OpenAI's GPT models.
  3. You love learning and upskilling.
  4. You have your preferred IDE set up, e.g. VS Code.

A Breakdown of our Document Question & Answer Streamlit application

We start by importing Streamlit and the OpenAI client into our streamlit_app.py file:

import streamlit as st
from openai import OpenAI

Next, we make use of st.title and st.write to display the title and description:

st.title("📄 Document Question Answering")
st.write(
    "Upload a document below and ask a question about it – GPT will answer! "
    "To use this app, you need to provide an OpenAI API key, which you can get [here](https://platform.openai.com/account/api-keys). "
)

Next, we use Streamlit's st.text_input to collect the user's OpenAI API key, which gives our application its AI capabilities. Passing type="password" masks the key as it is typed:

openai_api_key = st.text_input("OpenAI API Key", type="password")
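
During local development it can get tedious to paste the key on every reload. One possible variation, sketched below, falls back to the OPENAI_API_KEY environment variable when the text box is empty. The resolve_api_key helper is hypothetical and not part of the original app:

```python
import os
from typing import Mapping, Optional


def resolve_api_key(typed_key: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer the key typed into the app; otherwise fall back to the environment.

    Hypothetical helper for illustration only.
    """
    key = typed_key.strip() or env.get("OPENAI_API_KEY", "").strip()
    return key or None  # None means no usable key was found anywhere


# A typed key wins over the environment; an empty box falls back to it.
print(resolve_api_key("sk-typed", {"OPENAI_API_KEY": "sk-env"}))
print(resolve_api_key("", {"OPENAI_API_KEY": "sk-env"}))
```

In the app, this could wrap the existing input, for example openai_api_key = resolve_api_key(st.text_input("OpenAI API Key", type="password")).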

Lastly, we implement the core logic of the app. We start with an if not condition that checks whether the key is missing; if it is, we use st.info to ask the user to add it:

if not openai_api_key:
    st.info("Please add your OpenAI API key to continue.", icon="🗝️")

The else branch contains our fully functional document QA:

else:

    # Create an OpenAI client.
    client = OpenAI(api_key=openai_api_key)

    # Let the user upload a file via `st.file_uploader`.
    uploaded_file = st.file_uploader(
        "Upload a document (.txt or .md)", type=("txt", "md")
    )

    # Ask the user for a question via `st.text_area`.
    question = st.text_area(
        "Now ask a question about the document!",
        placeholder="Can you give me a short summary?",
        disabled=not uploaded_file,
    )

    if uploaded_file and question:

        # Process the uploaded file and question.
        document = uploaded_file.read().decode()
        messages = [
            {
                "role": "user",
                "content": f"Here's a document: {document} \n\n---\n\n {question}",
            }
        ]

        # Generate an answer using the OpenAI API.
        stream = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=messages,
            stream=True,
        )

        # Stream the response to the app using `st.write_stream`.
        st.write_stream(stream)
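
The decoding and prompt-building steps can also be pulled into a small function, which makes them easy to try outside the app. This is only a sketch: build_messages is a hypothetical helper, and errors="replace" is our assumption. The original calls .decode() with its defaults, which raises UnicodeDecodeError if the file is not valid UTF-8.

```python
def build_messages(raw: bytes, question: str) -> list[dict]:
    """Decode the uploaded bytes and wrap document plus question in one user message.

    Hypothetical helper for illustration only.
    """
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    document = raw.decode("utf-8", errors="replace")
    return [
        {
            "role": "user",
            "content": f"Here's a document: {document} \n\n---\n\n {question}",
        }
    ]


messages = build_messages(b"Streamlit turns scripts into apps.", "What does Streamlit do?")
print(messages[0]["content"])
```

Inside the app, the call would be build_messages(uploaded_file.read(), question).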

A Breakdown of the Code

  1. We initialize our OpenAI client with the key the user entered:

    client = OpenAI(api_key=openai_api_key)
    
  2. Using Streamlit's file_uploader, we let the user upload a .txt or .md file:

    uploaded_file = st.file_uploader(
        "Upload a document (.txt or .md)", type=("txt", "md")
    )
    
  3. Using text_area, we take the user's question; the field stays disabled until a file has been uploaded:

    question = st.text_area(
        "Now ask a question about the document!",
        placeholder="Can you give me a short summary?",
        disabled=not uploaded_file,
    )
    
  4. We check that the user has both uploaded a file and entered a question:

    if uploaded_file and question:
    
  5. We read the uploaded file's bytes and decode them into a string (UTF-8 by default):

    document = uploaded_file.read().decode()
    
  6. We build the messages list and pass it to the OpenAI chat completions endpoint, with stream=True so the answer arrives in chunks:

    messages = [
        {
            "role": "user",
            "content": f"Here's a document: {document} \n\n---\n\n {question}",
        }
    ]
    
    # Generate an answer using the OpenAI API.
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        stream=True,
    )
    
  7. Finally, using write_stream, we stream the output:

    st.write_stream(stream)
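
To get a feel for what a streamed response looks like, the sketch below pulls the text out of each chunk by hand. It uses hypothetical stand-in objects that only mimic the shape of the chunks (choices[0].delta.content), so it runs without an API key. iter_text and fake_chunk are our own illustrative names, not part of Streamlit or the OpenAI library.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield only the text pieces from a stream of chat completion chunks."""
    for chunk in stream:
        if not chunk.choices:  # defensively skip chunks that carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # skip None or empty deltas
            yield piece


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of one streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk(None), fake_chunk("Hello"), fake_chunk(" world"), fake_chunk(None)]
answer = "".join(iter_text(fake_stream))
print(answer)  # prints "Hello world"
```

Because st.write_stream also accepts plain generators, a generator like iter_text(stream) could be passed to it if we ever needed to filter or post-process the text before it is shown.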
    

Setting Up the Project Locally on Your Machine

Clone the repository:

git clone git@github.com:CyprianTinasheAarons/document-qa.git
cd document-qa/

Create a virtual environment:

python3 -m venv venv

Activate the environment (on Windows, run venv\Scripts\activate instead):

source venv/bin/activate

Install the requirements found in requirements.txt:

pip install -r requirements.txt

Yay!! Now we can run our code:

streamlit run streamlit_app.py

We navigate to the local URL printed in the terminal (http://localhost:8501 by default) and add our OpenAI key:

Get your API key here: OpenAI API Keys (https://platform.openai.com/account/api-keys)


🎉 Conclusion

Congratulations on getting this far! Now you can go and launch great AI solutions that will make the world better! 🎊

Feel free to follow me on Twitter for more updates and projects. Also, check out my website. ✨

