Create a streaming AI assistant with ChatGPT, FastAPI, WebSockets and React ✨🤖🚀
Deon Pillsbury
Posted on November 12, 2023
A Generative Pre-Trained Transformer (GPT) is a type of Large Language Model (LLM), and LLMs are the hot topic in the technology world this year, with many companies scrambling to add them to their products. Creating and training these large models can be very complex, time consuming and expensive. You may think that you cannot use this technology because it is so complex and expensive, but companies like OpenAI have done a ton of work to create useful models and set up platforms exposing APIs to use them. If you have ever used an API where you send some data in, it does some magic behind the scenes and you get some data back in a response, then you can integrate this cutting-edge technology into your application. Let's take a look at how to set up a full-stack web app that sends our questions to OpenAI and streams back the response.
⭐️ The complete source code referenced in this guide is available on GitHub
https://github.com/dpills/ai-assistant
In order to use the OpenAI API you will need to sign up for an account, generate an API key and add it to a .env file in your new project folder.
📝 .env
OPENAI_API_KEY=sk-YWUedpcl1xiGvGiD4xTwT3TlbkFJx9Sgnt8s0QYNxxxxxxxx
Install the API dependencies along with the openai Python library.
📝 pyproject.toml
[tool.poetry]
name = "ai-assistant"
version = "0.1.0"
description = ""
authors = ["dpills"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.11"
openai = "^1.2.3"
python-dotenv = "^1.0.0"
fastapi = "^0.104.1"
uvicorn = { extras = ["standard"], version = "^0.24.0.post1" }
websockets = "^12.0"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
$ poetry install
...
• Installing fastapi (0.104.1)
• Installing openai (1.2.3)
...
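Before writing the app it is worth seeing how the key actually reaches the OpenAI client: python-dotenv loads the .env file into the process environment and the client reads OPENAI_API_KEY from there, so it never has to be passed in explicitly. A quick sanity check, just a sketch and not part of the final app (the file name is arbitrary):
📝 check_key.py
import os

from dotenv import load_dotenv
from openai import AsyncOpenAI

# Copy the values from .env into os.environ
load_dotenv()

# AsyncOpenAI() looks for OPENAI_API_KEY in the environment automatically
assert "OPENAI_API_KEY" in os.environ
client = AsyncOpenAI()
print("OpenAI client configured")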
Create a Python file and import the OpenAI library, which will use the OPENAI_API_KEY environment variable to authenticate. Within the request options, set stream to True and use an asynchronous generator to stream the response chunks as they are returned. We are using the GPT-3.5 Turbo model, which is available on the free trial, but you can swap it out for a newer model such as GPT-4 if you have access to it.
📝 main.py
from typing import AsyncGenerator

from dotenv import load_dotenv
from openai import AsyncOpenAI

load_dotenv()

client = AsyncOpenAI()


async def get_ai_response(message: str) -> AsyncGenerator[str, None]:
    """
    OpenAI Response
    """
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a helpful assistant, skilled in explaining "
                    "complex concepts in simple terms."
                ),
            },
            {
                "role": "user",
                "content": message,
            },
        ],
        stream=True,
    )

    all_content = ""
    async for chunk in response:
        content = chunk.choices[0].delta.content
        if content:
            all_content += content
            yield all_content
That is the basic setup for adding ChatGPT to your application, which is pretty simple! 😃
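If you want to sanity check the generator on its own before wiring it into an API, a small script along these lines should do it, assuming the code above is saved as main.py (the script name and question text are just examples):
📝 ask.py
import asyncio

from main import get_ai_response


async def ask() -> None:
    """Print each streamed snapshot of the answer as it grows."""
    async for text in get_ai_response("Explain WebSockets in one sentence"):
        print(text)


if __name__ == "__main__":
    asyncio.run(ask())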
Now let's add a WebSocket endpoint using FastAPI's WebSocket support so we have a persistent, bi-directional connection that lets us stream the response from ChatGPT to our web app in real time.
📝 main.py
from typing import AsyncGenerator, NoReturn

import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, WebSocket
from openai import AsyncOpenAI

load_dotenv()

app = FastAPI()
client = AsyncOpenAI()

...


@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket) -> NoReturn:
    """
    Websocket for AI responses
    """
    await websocket.accept()
    while True:
        message = await websocket.receive_text()
        async for text in get_ai_response(message):
            await websocket.send_text(text)


if __name__ == "__main__":
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=8000,
        log_level="debug",
        reload=True,
    )
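One detail worth knowing: the while True loop runs until the browser goes away, and at that point receive_text raises a WebSocketDisconnect exception. If you want the endpoint to exit quietly instead of surfacing an error, a small variation like this works (not part of the original example, it only wraps the same loop):
from fastapi import WebSocket, WebSocketDisconnect


@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket) -> None:
    """
    Websocket for AI responses, exiting cleanly when the client disconnects
    """
    await websocket.accept()

    try:
        while True:
            message = await websocket.receive_text()
            async for text in get_ai_response(message):
                await websocket.send_text(text)
    except WebSocketDisconnect:
        # The browser tab was closed or the connection dropped
        pass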
Finally, for this simple example we can just have the API serve our web app, returning an index.html file on the root route.
📝 main.py
from typing import AsyncGenerator, NoReturn

import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, WebSocket
from fastapi.responses import HTMLResponse
from openai import AsyncOpenAI

load_dotenv()

app = FastAPI()
client = AsyncOpenAI()

with open("index.html") as f:
    html = f.read()


async def get_ai_response(message: str) -> AsyncGenerator[str, None]:
    """
    OpenAI Response
    """
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a helpful assistant, skilled in explaining "
                    "complex concepts in simple terms."
                ),
            },
            {
                "role": "user",
                "content": message,
            },
        ],
        stream=True,
    )

    all_content = ""
    async for chunk in response:
        content = chunk.choices[0].delta.content
        if content:
            all_content += content
            yield all_content


@app.get("/")
async def web_app() -> HTMLResponse:
    """
    Web App
    """
    return HTMLResponse(html)


@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket) -> NoReturn:
    """
    Websocket for AI responses
    """
    await websocket.accept()
    while True:
        message = await websocket.receive_text()
        async for text in get_ai_response(message):
            await websocket.send_text(text)


if __name__ == "__main__":
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=8000,
        log_level="debug",
        reload=True,
    )
Now create the index.html file using the CDN development builds of React, the standalone Babel build for in-browser transpilation and the Material UI component library. The app connects to the WebSocket and submits new questions to the API, which streams back the response as it is generated; the app then renders the Markdown.
📝 index.html
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="utf-8" />
  <title>AI Assistant 🤓</title>
  <meta name="viewport" content="initial-scale=1, width=device-width" />
  <script src="https://unpkg.com/react@latest/umd/react.development.js" crossorigin="anonymous"></script>
  <script src="https://unpkg.com/react-dom@latest/umd/react-dom.development.js"></script>
  <script src="https://unpkg.com/@mui/material@latest/umd/material-ui.development.js"
    crossorigin="anonymous"></script>
  <script src="https://unpkg.com/@babel/standalone@latest/babel.min.js" crossorigin="anonymous"></script>
  <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
  <link rel="preconnect" href="https://fonts.googleapis.com" />
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
  <link rel="stylesheet"
    href="https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;600;700&display=swap" />
  <link rel="stylesheet" href="https://fonts.googleapis.com/icon?family=Material+Icons" />
</head>

<body>
  <div id="root"></div>
  <script type="text/babel">
    const {
      colors,
      CssBaseline,
      ThemeProvider,
      Typography,
      TextField,
      Container,
      createTheme,
      Box,
      Skeleton,
    } = MaterialUI;

    const theme = createTheme({
      palette: {
        mode: 'dark'
      },
    });

    const WS = new WebSocket("ws://localhost:8000/ws");

    function App() {
      const [response, setResponse] = React.useState("");
      const [question, setQuestion] = React.useState("");
      const [loading, setLoading] = React.useState(false);

      React.useEffect(() => {
        WS.onmessage = (event) => {
          setLoading(false);
          setResponse(marked.parse(event.data));
        };
      }, []);

      return (
        <Container maxWidth="lg">
          <Box sx={{ my: 4 }}>
            <Typography variant="h4" component="h1" gutterBottom>
              AI Assistant 🤓
            </Typography>
            <TextField
              id="outlined-basic"
              label="Ask me Anything"
              variant="outlined"
              style={{ width: '100%' }}
              value={question}
              disabled={loading}
              onChange={e => {
                setQuestion(e.target.value)
              }}
              onKeyUp={e => {
                setLoading(false)
                if (e.key === "Enter") {
                  setResponse('')
                  setLoading(true)
                  WS.send(question);
                }
              }}
            />
          </Box>
          {!response && loading && (<>
            <Skeleton />
            <Skeleton animation="wave" />
            <Skeleton animation={false} /></>)}
          {response && <Typography dangerouslySetInnerHTML={{ __html: response }} />}
        </Container>
      );
    }

    ReactDOM.createRoot(document.getElementById('root')).render(
      <ThemeProvider theme={theme}>
        <CssBaseline />
        <App />
      </ThemeProvider>,
    );
  </script>
</body>

</html>
Now we are ready to test it out! Run the Python script, which will start the API and serve the web app.
$ python3 main.py
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
DEBUG: = connection is CONNECTING
DEBUG: < GET /ws HTTP/1.1
DEBUG: < user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
DEBUG: < upgrade: websocket
DEBUG: < origin: http://localhost:8000
DEBUG: < sec-websocket-version: 13
DEBUG: < cookie: ajs_anonymous_id=57154c91-3ec5-4308-beaa-25a353f3ce66
DEBUG: < sec-websocket-key: BQOnRY2biub2alMCN7ax+w==
DEBUG: < sec-websocket-extensions: permessage-deflate; client_max_window_bits
INFO: ('127.0.0.1', 54308) - "WebSocket /ws" [accepted]
DEBUG: > date: Sun, 12 Nov 2023 19:00:20 GMT
DEBUG: > server: uvicorn
INFO: connection open
DEBUG: = connection is OPEN
Navigate to http://localhost:8000, type in a question and hit enter.
Our question is sent through the WebSocket to the API and on to OpenAI, which generates a response that is streamed back through the WebSocket and rendered in the web app! 🎉
We can see that it takes around 30 seconds for ChatGPT to generate the full response, which is a long time to wait in any application. But since we are streaming the response over the WebSocket, the app feels very responsive: the text is rendered on screen faster than the average person can read it, which makes for a much better user experience. 🙂
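If you would rather exercise the WebSocket endpoint without a browser, the websockets package we already installed also ships an asyncio client. Here is a rough sketch of a test script; the file name, question string and idle timeout are arbitrary choices, and it simply treats a few seconds of silence as the end of the stream:
📝 ws_client.py
import asyncio

import websockets


async def ask(question: str) -> str:
    """Send a question to the running API and collect the streamed answer."""
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        await ws.send(question)

        answer = ""
        while True:
            try:
                # Each message is the full answer so far; assume the stream
                # is finished once nothing arrives for a few seconds
                answer = await asyncio.wait_for(ws.recv(), timeout=5)
            except asyncio.TimeoutError:
                break

        return answer


if __name__ == "__main__":
    print(asyncio.run(ask("What is a WebSocket?")))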
I hope this shows you how easy it is to add AI enhancements to any application and gives you some basic building blocks to create a great experience for your users! 😊