Building a Secure PDF Chat AI Application with LangChain, Next.js, Arcjet, and Pinecone DB


Nikos Benakis

Posted on July 15, 2024


In this tutorial, we will guide you through creating a secure PDF chat AI application where users can upload PDF files and ask questions related to the content of those files. The chat AI will use a combination of the question, chat history, and context from the uploaded PDFs to generate accurate responses. We'll leverage a Retrieval-Augmented Generation (RAG) system with Pinecone DB for efficient information retrieval and Arcjet.com for securing our endpoints.

Table Of Contents

  1. Introduction
  2. System Overview
  3. Setup Instructions
  4. Code Walkthrough
  5. Security Considerations
  6. Conclusion

Introduction

Interactive chat applications are becoming increasingly popular, especially those capable of understanding and processing document content. In this tutorial, we'll build a secure PDF chat AI application using LangChain, Next.js, Pinecone DB, and Arcjet.com. This application will allow users to upload PDFs and interact with an AI that can answer questions based on the content of the uploaded documents.

System Overview

Here's a high-level overview of our system:

📸 Architecture overview diagram

  • Frontend: Allows users to upload PDF files and ask questions.
  • Backend: Processes the PDF files, stores the content in Pinecone DB, and handles the question-answering mechanism.
  • RAG System: Combines retrieval of relevant PDF content with generation of answers.
  • Security: Utilizes Arcjet.com for endpoint security, rate limiting, and bot protection.
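
To make the flow concrete, here's a bird's-eye sketch in TypeScript. getChunkedDocsFromUploadedPDFs and embedAndStoreDocs are real functions covered in the walkthrough below, while handleUpload, handleQuestion, and callChain are illustrative names rather than the repository's actual API.

// Upload path: split the PDFs, then store their embeddings in Pinecone
async function handleUpload(files: File[]) {
  const chunks = await getChunkedDocsFromUploadedPDFs(files);
  await embedAndStoreDocs(chunks);
}

// Question path: retrieve relevant chunks from Pinecone, then generate an answer
async function handleQuestion(question: string, chatHistory: string) {
  return callChain({ question, chatHistory });
}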

Setup Instructions

Follow these steps to set up the application:

Clone the Repository:

git clone https://github.com/NickolasBenakis/secure-pdf-chat.git
cd secure-pdf-chat

Install Dependencies:

npm install

Then follow the steps in the README.md.

Set up Pinecone DB:

Sign up at Pinecone and get your API key.
Create a Pinecone index with the required configuration and save its name in your environment variables.
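
If you prefer to create the index from code rather than the dashboard, here's a sketch using the official Pinecone SDK. The index name and region are illustrative, and the dimension of 1536 assumes OpenAI's default embedding model:

import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// 1536 dimensions matches OpenAI's text-embedding-ada-002 output
await pc.createIndex({
  name: "secure-pdf-chat", // illustrative index name
  dimension: 1536,
  metric: "cosine",
  spec: { serverless: { cloud: "aws", region: "us-east-1" } },
});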

Set up Arcjet:

Create a free account at Arcjet and get your API key.

Set up OpenAI:

Create a free account at OpenAI and get an API key.

Configure Environment Variables:

Create a .env.local file and add your Pinecone, Arcjet, and OpenAI credentials:

PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX_NAME=your_pinecone_index_name
ARCJET_KEY=your_arcjet_api_key
OPENAI_API_KEY=your_openai_key
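
The snippets later in this post import an env object from a config module. Here's a minimal sketch of such a helper, assuming zod for validation (the repository's actual implementation may differ):

import { z } from "zod";

// Fail fast at startup if any required variable is missing
const schema = z.object({
  PINECONE_API_KEY: z.string().min(1),
  PINECONE_INDEX_NAME: z.string().min(1),
  ARCJET_KEY: z.string().min(1),
  OPENAI_API_KEY: z.string().min(1),
});

export const env = schema.parse(process.env);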

🚀 You can run the app:

npm run dev

Code Walkthrough

This project is built with Next.js (App Router), TypeScript, and the Vercel AI SDK.

😎 Great, now let's dive into the domain-critical parts. 🚀

  • PDF loader: This is the function responsible for chunking our PDFs into smaller documents so we can store them in Pinecone afterward.

We loop through the files in sequence, loading each one with the WebPDFLoader.

Afterwards, we create our textSplitter, to which you can pass different options for different use cases.

// Import paths may vary slightly with your LangChain version;
// `logger` below is the project's own logging helper.
import { WebPDFLoader } from "@langchain/community/document_loaders/web/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import type { Document } from "@langchain/core/documents";
import { flattenDeep } from "lodash";

export async function getChunkedDocsFromUploadedPDFs(
  fileList: File[],
): Promise<Document<Record<string, unknown>>[]> {
  try {
    const docList = [];
    for (const file of fileList) {
      const pdfLoader = new WebPDFLoader(file);
      const docs = await pdfLoader.load();
      docList.push(docs);
    }

    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200,
    });

    const chunkedDocs = await textSplitter.splitDocuments(flattenDeep(docList));

    return chunkedDocs;
  } catch (error) {
    logger.error(`Error loading PDF: ${fileList} ${error}`);
    throw new Error("Error loading PDF");
  }
}
  • Saving docs in Pinecone DB

Here we take all the split docs from the previous step and store them, together with their OpenAIEmbeddings, in the Pinecone DB index.

// Again, import paths may vary with your LangChain version
import { OpenAIEmbeddings } from "@langchain/openai";
import { PineconeStore } from "@langchain/pinecone";

export async function embedAndStoreDocs(
  docs: Document<Record<string, unknown>>[],
) {
  /*create and store the embeddings in the vectorStore*/
  try {
    const pineconeClient = await getPineconeClient();
    const embeddings = new OpenAIEmbeddings();
    const index = pineconeClient.index(env.PINECONE_INDEX_NAME);

    //embed the PDF documents
    await PineconeStore.fromDocuments(docs, embeddings, {
      pineconeIndex: index,
    });
  } catch (error) {
    logger.error(error);
    throw new Error("Failed to load your docs !");
  }
}
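
The embedAndStoreDocs function above relies on a getPineconeClient helper that isn't shown here. Below is a minimal sketch, assuming a cached client built with the official @pinecone-database/pinecone SDK; the repository's implementation may differ.

import { Pinecone } from "@pinecone-database/pinecone";
import { env } from "./config"; // illustrative path to the env helper

let pineconeClient: Pinecone | null = null;

// Lazily create and cache a single Pinecone client per server instance
export async function getPineconeClient(): Promise<Pinecone> {
  if (!pineconeClient) {
    pineconeClient = new Pinecone({ apiKey: env.PINECONE_API_KEY });
  }
  return pineconeClient;
}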
  • LangChain

Here we create our chat chain. It receives the streamingModel (in our case ChatGPT), our Pinecone DB store as the retriever, and our two prompts: one for generating the standalone question sent to the model and one for answering it.

    const chain = ConversationalRetrievalQAChain.fromLLM(
      streamingModel,
      vectorStore.asRetriever(),
      {
        qaTemplate: QA_TEMPLATE,
        questionGeneratorTemplate: STANDALONE_QUESTION_TEMPLATE,
        returnSourceDocuments: true,
        questionGeneratorChainOptions: {
          llm: nonStreamingModel,
        },
      },
    );

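The vectorStore passed to asRetriever() wraps the same Pinecone index we populated earlier. Here's a sketch of how it can be obtained, reusing the imports and helpers from the earlier snippets and assuming PineconeStore.fromExistingIndex (the repository may wrap this in its own helper):

const pineconeClient = await getPineconeClient();
const index = pineconeClient.index(env.PINECONE_INDEX_NAME);

// Connect to the already-populated index instead of re-embedding anything
const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex: index,
});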

In more detail:

  • PROMPTS
// Creates a standalone question from the chat-history and the current question
export const STANDALONE_QUESTION_TEMPLATE = `Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;

// Actual question you ask the chat and send the response to client
export const QA_TEMPLATE = `You are an enthusiastic AI assistant. Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context.

{context}

Question: {question}
Helpful answer in markdown:`;

  • Models
import { ChatOpenAI } from "@langchain/openai";

export const streamingModel = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  streaming: true,
  verbose: true,
  temperature: 0,
});

export const nonStreamingModel = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  verbose: true,
  temperature: 0,
});

In the end we connect all the pieces together and invoke our chain. During invocation we pass the chatHistory as input along with the user question. When the call finishes, we take the source documents from the chain and return the sources to the client for reference. We use the Vercel AI SDK to stream the data to the client.


    // `stream`, `handlers`, and `data` come from the Vercel AI SDK's
    // streaming helpers (see the sketch after this block)
    chain
      .call(
        {
          question: question,
          chat_history: chatHistory,
        },
        [handlers],
      )
      .then(async (res) => {
        // Guard against a missing sourceDocuments field
        const sourceDocuments = res?.sourceDocuments ?? [];
        const firstTwoDocuments = sourceDocuments.slice(0, 2);
        const pageContents = firstTwoDocuments.map(
          ({ pageContent }: { pageContent: string }) => pageContent,
        );

        // Attach the two most relevant source chunks alongside the streamed answer
        data.append({
          sources: pageContents,
        });
        data.close();
      });

    return new StreamingTextResponse(stream, {}, data);

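For completeness, here's a sketch of where stream, handlers, and data come from, using the Vercel AI SDK API as of mid-2024 (later SDK versions renamed these helpers, so treat this as an assumption rather than the repository's exact code):

import {
  LangChainStream,
  StreamingTextResponse,
  experimental_StreamData,
} from "ai";

// `handlers` plugs into LangChain's callback system, `stream` is the
// ReadableStream we return, and `data` carries extra JSON (our sources)
const data = new experimental_StreamData();
const { stream, handlers } = LangChainStream({
  experimental_streamData: true, // keep the stream open until data.close()
});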

Security Considerations

To ensure the security of our application, we use Arcjet.com for securing our endpoints. Here are some additional security measures:

  • Bot Detection: Ensures that our endpoints are not a target for automated bots.

  • Protection from common attacks: Arcjet Shield protects your application against common attacks, including the OWASP Top 10.

  • Rate limiting: Ensures that your LLM is not flooded with requests that could run up your bill. Arcjet offers a declarative rate limiter, with no need to set up and manage a Redis instance yourself.

  • Environment Configuration: Use environment variables to manage sensitive configurations.

In more detail:
  • Arcjet Shield with a rate limiter, set up in the chat route:
import arcjet, { shield, tokenBucket } from "@arcjet/next";

const aj = arcjet({
  key: env.ARCJET_KEY,
  rules: [
    tokenBucket({
      mode: "LIVE", // Live for Production, DRY_RUN logs only for dev
      characteristics: ["sessionId"], // property being tracked
      refillRate: 1, // 1 token per interval
      interval: 7200, // 2 hours
      capacity: 5, // bucket maximum capacity of 5 tokens
    }),
    shield({
      mode: "LIVE", // Live for Production, DRY_RUN logs only for dev
    }),
  ],
});

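With these rules defined, the chat route asks Arcjet for a decision before doing any LLM work. A minimal sketch, where the sessionId lookup is illustrative:

export async function POST(req: Request) {
  const sessionId = req.headers.get("x-session-id") ?? "anonymous"; // illustrative
  // Spend one token from this session's bucket; Shield runs on the same call
  const decision = await aj.protect(req, { sessionId, requested: 1 });

  if (decision.isDenied()) {
    return new Response("Too many requests", { status: 429 });
  }

  // ...continue with the chat chain
}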
  • Middleware setup:
import arcjet, { createMiddleware, detectBot } from "@arcjet/next";
import { env } from "./services/config";
export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
const aj = arcjet({
  key: env.ARCJET_KEY, // Get your site key from https://app.arcjet.com
  rules: [
    detectBot({
      mode: process.env.NODE_ENV === "production" ? "LIVE" : "DRY_RUN", // will block requests. Use "DRY_RUN" to log only
      block: ["AUTOMATED"], // blocks all automated clients
    }),
  ],
});
export default createMiddleware(aj);


Conclusion

~Disclaimer starts
Arcjet contacted me to test their product and share my experience with the developer community. While they sponsored this article, they did not influence the content or opinions expressed in this write-up.

This article aims to provide an honest and unbiased guide on integrating Arcjet's SDK with a Next.js application. This ensures you get an authentic look at the process and can make an informed decision about using these tools in your projects.

Transparency is key in the developer community, and I believe in sharing my experiences honestly. Arcjet offers innovative security solutions that I found valuable, and I hope this guide helps you understand how to leverage their services effectively.
~Disclaimer ends

By following this tutorial, you’ve created a secure PDF chat AI application that leverages a RAG system with Pinecone DB, built with TypeScript and Next.js. This setup allows users to interact with their PDF documents in a meaningful way, extracting and utilizing the content effectively. The use of Arcjet.com ensures that your application remains secure and resilient against common threats.

Feel free to customize and extend this application according to your needs. If you have any questions or suggestions, please leave a comment below or contribute to the GitHub repository.

Happy coding!
