ChatGPT clone with React Suspense and Streaming


Lorenzo Rivosecchi

Posted on January 29, 2024


This is a short post showcasing a solution I developed for building ChatGPT-style interfaces.

Server Side

Let's start by creating a simple server that our client will use to communicate with OpenAI.
First, we initialize the OpenAI client with our API key:

import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

Then we create an in-memory map to store chat sessions (fine for a demo, but note it resets whenever the server restarts):

const sessions = new Map<string, OpenAI.ChatCompletionMessageParam[]>();

Now we define a request handler that will forward messages to OpenAI and stream the response down the wire:

import express from "express";
import bodyParser from "body-parser";

const app = express();

// Parse body as JSON when Content-Type: application/json
app.use(bodyParser.json());

app.post("/chat", async (req, res) => {
  // 1. Validate input
  // 2. Create session if it doesn't exist
  // 3. Add user message to session
  // 4. Fetch response from OpenAI
  // 5. Stream response to client
  // 6. Add OpenAI response to session
});

app.listen(3000, () => {
  console.log("Listening on port 3000");
});
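If you're following along, the server sketch assumes a few packages (plus Node 18+ and TypeScript; adjust to your own tooling):

npm install express body-parser openai
export OPENAI_API_KEY="..." # the key used by the OpenAI client above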

Let's start by validating the input.
We expect to receive a session ID and a prompt from the client.

// 1. Validate input
const prompt = req.body.prompt;
const sessionId = req.body.sessionId;

if (typeof prompt !== "string" || prompt.length === 0) {
  res.status(400).send("prompt is required");
  return;
}
if (typeof sessionId !== "string" || sessionId.length === 0) {
  res.status(400).send("sessionId is required");
  return;
}

Then, if the session doesn't exist, we create it and add the user message to it.

// 2. Create session if it doesn't exist
if (sessions.has(sessionId) === false) {
  sessions.set(sessionId, []);
}
// The non-null assertion is safe: we just ensured the session exists
const messages = sessions.get(sessionId)!;

// 3. Add user message to session
messages.push({
  role: "user",
  content: prompt,
});

Now we can fetch the response from OpenAI using the stream option.

// 4. Fetch response from OpenAI
const stream = await openai.chat.completions.create({
  messages,
  stream: true,
  model: "gpt-4",
});

The stream object is an async iterable, so we can use a for await loop to iterate over the incoming chunks. To stream the chunks to the client we simply write to the response object with res.write. Once the stream is finished, we call res.end to close the connection.

// 5. Stream response to client
let response = "";
for await (const chunk of stream) {
  const token = chunk.choices?.[0]?.delta?.content ?? "";
  res.write(token);
  response += token;
}
res.end();

Finally, we add the OpenAI response to the session.

// 6. Add OpenAI response to session
messages.push({
  role: "assistant",
  content: response,
});
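At this point the server is functional. A quick way to sanity-check it before wiring up the client is a small script like this (a sketch, assuming Node 18+ for the built-in fetch, the server running on localhost:3000, and an ES-module-aware runner such as tsx):

// test-chat.ts
const res = await fetch("http://localhost:3000/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ prompt: "Say hello", sessionId: "test-session" }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Tokens print as they arrive, confirming the response is streamed
  process.stdout.write(decoder.decode(value, { stream: true }));
}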

Client Side

Let's now focus on the client side. We will use some React APIs, most importantly use, that are currently only available in React's canary channel. Let's start by preparing our environment.

Update your React version to the latest canary release:

npm install react@canary react-dom@canary
# yarn add react@canary react-dom@canary
# pnpm add react@canary react-dom@canary

Then, reference the canary react types in your tsconfig.json:

{
  "compilerOptions": {
    "types": ["react/canary"]
  }
}

If you prefer to use a declaration file, you can use a triple-slash directive instead:

/// <reference types="react/canary" />

Now we can start building our app. Let's start by creating a simple form to send messages to the server.
The component will accept a callback to send messages to the server and a boolean to indicate if the server is currently processing a message.

import { useState, useCallback } from "react";
import type { FormEventHandler, ChangeEventHandler } from "react";

export type ChatFormProps = {
  onSendMessage: (message: string) => void; 
  isSending: boolean;
};

export function ChatForm({ onSendMessage, isSending }: ChatFormProps) {
  const [input, setInput] = useState("");

  const handleSubmit = useCallback<FormEventHandler<HTMLFormElement>>(
    (e) => {
      e.preventDefault();
      if (input === "") return;
      onSendMessage(input);
      setInput("");
    },
    [input, onSendMessage],
  );

  const handleInputChange = useCallback<ChangeEventHandler<HTMLInputElement>>(
    (e) => {
      setInput(e.target.value);
    },
    [],
  );

  return (
    <form onSubmit={handleSubmit}>
      <input
        value={input}
        onChange={handleInputChange}
        placeholder="Ask a question"
        required
      />
      <button disabled={isSending}>{isSending ? "Sending..." : "Send"}</button>
    </form>
  );
}

Now, we create a parent component that will handle the communication with the server.

import { useState, useCallback } from "react";
import { ChatForm, type ChatFormProps } from "./ChatForm";

export type Message = {
  role: "user" | "assistant";
  content: string;
}

export function ChatApp() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [isSending, setIsSending] = useState(false);

  const handleSendMessage = useCallback<ChatFormProps["onSendMessage"]>(
    async (message) => {
      // We will implement this later
    },
    [],
  );

  return (
    <div>
      <ChatForm onSendMessage={handleSendMessage} isSending={isSending} />
    </div>
  )
}

Before implementing the send message logic, let's define how we want to display the messages.
Let's create a presentational component that renders a single message.

import type { Message } from "./ChatApp";

export type ChatMessageProps = {
  message: Message;
};

export function ChatMessage({ message }: ChatMessageProps) {
  return (
    <p>
      <span>From {message.role}:</span>
      <span>{message.content}</span>
    </p>
  );
}

Now let's define a component that will render a list of messages.

import type { Message } from "./ChatApp";
import { ChatMessage } from "./ChatMessage";

export type ChatLogProps = {
  messages: Message[];
};

export function ChatLog({ messages }: ChatLogProps) {
  return (
    <div role="log">
      {messages.map((message, i) => (
        <ChatMessage key={i} message={message} />
      ))}
    </div>
  );
}

Finally, we can use the ChatLog component in our ChatApp component.

// ...
import { ChatLog } from "./ChatLog";

export function ChatApp() {
  // ...
  return (
    <div>
      <ChatLog messages={messages} />
      <ChatForm onSendMessage={handleSendMessage} isSending={isSending} />
    </div>
  )
}

Now it's time for the fun part. With Suspense we can easily render messages regardless of whether they are coming from the server or from the user. Let's define a MessageRenderer component that receives a message or a promise that resolves to a message.

import { use } from "react";
import type { Message } from "./ChatApp";
import { ChatMessage } from "./ChatMessage";

export type MessageRendererProps = {
  message: Message | Promise<Message>;
};

export function MessageRenderer(props: MessageRendererProps) {
  // use() suspends the component while the promise is pending,
  // activating the nearest Suspense boundary
  const message =
    props.message instanceof Promise
      ? use(props.message)
      : props.message;

  return <ChatMessage message={message} />;
}

In the ChatLog component we can now use the MessageRenderer component to render messages.

import { Suspense } from "react";
import { MessageRenderer, type MessageRendererProps } from "./MessageRenderer";

export type ChatLogProps = {
  // Now both messages and promises are accepted
  messages: MessageRendererProps["message"][];
};

export function ChatLog({ messages }: ChatLogProps) {
  return (
    <div role="log">
      {messages.map((message, i) => (
        <Suspense key={i} fallback="Loading...">
          <MessageRenderer message={message} />
        </Suspense>
      ))}
    </div>
  );
}

While the message is loading, Suspense renders the fallback component; once the promise resolves, the message is rendered instead. To handle errors, we need to wrap the Suspense element in an ErrorBoundary component.
I recommend using the react-error-boundary package for this.

npm install react-error-boundary
# yarn add react-error-boundary
# pnpm add react-error-boundary

We can render a fallback UI when an error occurs:

import { ErrorBoundary } from "react-error-boundary";

<ErrorBoundary fallback={<p>Error</p>}>
  <Suspense fallback="Loading...">
    <MessageRenderer message={message} />
  </Suspense>
</ErrorBoundary>

Let's create a component for the Suspense fallback.
The fallback receives a stream and subscribes to it.
The component listens for incoming tokens and aggregates them into a state variable called content. Every time content changes, a new message prop is created and passed to ChatMessage.

import { useEffect, useState } from "react";
import type { Message } from "./ChatApp";
import { ChatMessage } from "./ChatMessage";
// Adjust the path to wherever you place the utility defined below
import { readMessageStream } from "./readMessageStream";

export type MessageStream = ReadableStream<Uint8Array>;

export type StreamingMessageProps = {
  stream: MessageStream;
};

export function StreamingMessage({ stream }: StreamingMessageProps) {
  const [content, setContent] = useState("");

  useEffect(() => {
    // Guard against double-subscription (e.g. StrictMode re-running effects)
    if (stream.locked) return;
    readMessageStream(stream, (token) => {
      setContent((prev) => prev + token);
    });
  }, [stream]);

  const message: Message = {
    role: "assistant",
    content,
  };

  return <ChatMessage message={message} />;
}

The StreamingMessage component only runs while the server request promise is pending. Once it settles, a regular ChatMessage replaces it, so the streaming subscription and its state can be garbage collected.

To read the stream, I created this utility function.
It reads the stream asynchronously in a loop until the request completes, then returns the full response as a concatenation of text chunks (tokens).

export async function readMessageStream(
  stream: ReadableStream<Uint8Array>,
  onNewToken: (token: string) => void = () => {},
) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  const chunks: string[] = [];

  // eslint-disable-next-line no-constant-condition
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) {
      // stream: true keeps multi-byte characters intact across chunk boundaries
      const chunk = decoder.decode(value, { stream: true });
      chunks.push(chunk);
      onNewToken(chunk);
    }
  }

  const text = chunks.join("");
  return text;
}

Now we can use the StreamingMessage component in our ChatLog component:

import { StreamingMessage, type MessageStream } from "./StreamingMessage";

export type ChatLogProps = {
  // Now both messages and promises are accepted
  messages: MessageRendererProps["message"][];
  stream?: MessageStream;
};

// ...
<ErrorBoundary fallback={<p>Error</p>}>
  {/* Fall back to a plain loading indicator when no stream is available yet */}
  <Suspense fallback={stream ? <StreamingMessage stream={stream} /> : "Loading..."}>
    <MessageRenderer message={message} />
  </Suspense>
</ErrorBoundary>

Now we can extend our ChatApp component to track the message stream and pass it to the ChatLog component.

// ...
import { ChatLog, type ChatLogProps } from "./ChatLog";
import { ChatForm, type ChatFormProps } from "./ChatForm";

export function ChatApp() {
  const [messages, setMessages] = useState<ChatLogProps["messages"]>([]);
  const [isSending, setIsSending] = useState<ChatFormProps["isSending"]>(false);
  const [stream, setStream] = useState<ChatLogProps["stream"]>();
  // A stable session ID per mounted app, generated client-side
  const [sessionId] = useState(() => crypto.randomUUID());

  const handleSendMessage = useCallback<ChatFormProps["onSendMessage"]>(
    async (message) => {
      // We will implement this later
    },
    [],
  );

  return (
    <div>
      <ChatLog messages={messages} stream={stream} />
      <ChatForm onSendMessage={handleSendMessage} isSending={isSending} />
    </div>
  );
}

Finally, here is the complete implementation of the handleSendMessage function:

const handleSendMessage = useCallback(
  (input: string) => {
    const userMessage: Message = {
      role: "user",
      content: input,
    };

    const assistantMessage = fetchMessageStream(input, sessionId)
      .then((stream) => {
        // tee() lets two consumers read the same response: one copy drives
        // StreamingMessage, the other builds the final Message
        const [stream1, stream2] = stream.tee();
        setStream(stream1); // read by ChatLog
        return readMessageStream(stream2);
      })
      .then((text): Message => {
        return {
          role: "assistant",
          content: text,
        };
      })
      .finally(() => setIsSending(false));

    setIsSending(true);
    // The user message renders immediately; the assistant message is a
    // promise, so it suspends until it resolves
    setMessages((prevMessages) => [
      ...prevMessages,
      userMessage,
      assistantMessage,
    ]);
  },
  [sessionId],
);

async function fetchMessageStream(prompt: string, sessionId: string) {
  const response = await fetch("/chat", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      prompt,
      sessionId,
    }),
  });
  if (!response.ok || response.body === null) {
    throw new Error("Failed to fetch message stream");
  }
  return response.body;
}
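That's it for the implementation. To actually render the app, mount ChatApp at your entry point (a minimal sketch; the file layout and root element ID are assumptions from a typical Vite setup):

import { createRoot } from "react-dom/client";
import { ChatApp } from "./ChatApp";

// Assumes index.html contains a <div id="root"></div>
createRoot(document.getElementById("root")!).render(<ChatApp />);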
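One last detail: the client fetches the relative URL /chat, so during development those requests need to reach the Express server on port 3000. If your client runs on Vite (an assumption; any dev server with a proxy option works), a proxy entry takes care of it:

// vite.config.ts
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    // Forward /chat requests to the Express server from the first section
    proxy: { "/chat": "http://localhost:3000" },
  },
});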