Creating a Smart Second Brain: Leveraging Cloudflare Workers, Vectorize, and OpenAI
Andy Jessop
Posted on December 9, 2023
Introduction
Welcome to my latest adventure: creating an AI "second brain" that's more than just a digital notebook — it's like a smart personal assistant that knows what I need before I do. This journey involves using Cloudflare Vectorize for storing snippets of my life and then retrieving relevant entries for each query I have. The goal is a tool that makes sense of all my notes and reminders.
In this post, I'll explain how I pieced this together, the tech behind it, and why I think it will be useful in my daily life as a programmer (and a person). It won't be a tutorial as such, but I'll provide snippets throughout to explain the concepts, and the full code at the end.
Use Cases
I haven't explored all of these fully, but here are some use cases I've thought of so far:
- daily briefings: a cron job that sends an email summarising everything I need to know that day, including any urgent action items.
- automated meeting summaries: I keep meeting notes in Obsidian, and this could automatically detect them and summarise them in email (or other) form.
- project reports: depending on how well I'm keeping notes on a project (I keep daily notes on all sorts of things), it could summarise the work done over the last few days or weeks.
Vector Databases, Embeddings, and Similarity Search: The Tech Behind the AI Second Brain
Let's delve into the technical aspects of vector databases and embeddings, and how they relate to my second brain. These are the core components that make this system not just a storage unit, but an intelligent assistant.
Vector Databases: Efficient Storage and Retrieval
A vector database, like Cloudflare Vectorize, is specialised for storing and querying vectors. In this context, vectors are essentially high-dimensional data points. Each piece of information - a note, a calendar entry, or an email - is converted into a vector, which represents the essence of the text in mathematical form. The beauty of a vector database lies in its ability to compare these vectors "spatially": vectors that are "close" together are also similar semantically.
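To make "close" concrete: similarity is typically scored with a metric like cosine similarity (one of the metrics Vectorize supports, alongside euclidean and dot-product). Here's a minimal, purely illustrative sketch - real embeddings have 1536 dimensions, not three:
// Cosine similarity: ~1 means pointing the same way (semantically similar), ~0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Toy three-dimensional "embeddings":
cosineSimilarity([0.9, 0.1, 0.2], [0.8, 0.2, 0.1]); // high score: semantically close
cosineSimilarity([0.9, 0.1, 0.2], [0.1, 0.9, 0.8]); // low score: unrelated
So, how do we create the vectors?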
Embeddings: Translating Text into Vectors
Embeddings are where the transformation of text into vectors happens. This process involves using algorithms (like those in OpenAI's models) to analyse the text and encode it into a numerical form that captures its semantic meaning. Cloudflare has its own tool for creating embeddings, but I chose OpenAI for its slightly higher quality and higher-dimensional vectors (1536 dimensions for text-embedding-ada-002).
Here's how to get a vector from text using OpenAI:
const embedding = await openai.embeddings.create({
encoding_format: 'float',
input: text,
model: 'text-embedding-ada-002',
});
const vector = embedding?.data?.[0].embedding;
The vector here is of type number[], and can be inserted directly into the vector DB like this:
await env.VECTORIZE_INDEX.insert([{
id: someId, // Very useful because you need this if you want to be able to delete it later, which we do.
values: vector,
metadata: {
// Any metadata you want to store. Usually you'll store the content so that when you query the DB, you can grab the original text too:
text,
},
}]);
Splitting the Text up into Manageable Chunks
Something I only thought about after starting on this journey: imagine you have a huge document and you create an embedding for it, and then you have five smaller texts too. Your query might retrieve all six, but in terms of word count, the results could be absolutely dominated by the larger text.
We don't want that. We need something more balanced, so that our LLM can draw context from a wide variety of sources for its responses.
Therefore we'll use a text splitter. Here I'm using the one from LangChain, which splits the text into chunks of 1536 characters, with a 200-character overlap to keep semantic continuity between the sections. So when we post a note, it will be split into smaller documents, and each of those is embedded individually. Luckily, embeddings are cheap, so this isn't a costly process, even for large files.
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

const splitter = new RecursiveCharacterTextSplitter({
  chunkOverlap: 200,
  chunkSize: 1536,
});

const documents = await splitter.createDocuments(
  [content],
  [
    {
      fileName,
      timestamp: Date.now(),
    },
  ],
  {
    appendChunkOverlapHeader: true,
    chunkHeader: `FILE NAME: ${fileName || 'None.'}\n\n---\n\n`,
  },
);
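For reference, each entry in documents is a LangChain Document, roughly this shape (the metadata is what we passed in, plus position info the splitter adds):
// Illustrative values only:
// {
//   pageContent: 'FILE NAME: notes.md\n\n---\n\n...a chunk of up to 1536 characters...',
//   metadata: { fileName: 'notes.md', timestamp: 1702080000000, loc: { lines: { from: 1, to: 42 } } },
// }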
Then we loop over the documents and embed them individually:
for (const [i, document] of documents.entries()) {
  try {
    const embedding = await openai.embeddings.create({
      encoding_format: 'float',
      input: document.pageContent,
      model: 'text-embedding-ada-002',
    });

    const vector = embedding?.data?.[0].embedding;

    // ...collect the id and vector for insertion (see the full code below)
  } catch (e) {
    // ...flag the failure and bail out (see the full code below)
  }
}
Considerations when Inserting Documents into the Vector DB
One issue I thought of: what if, when I post an update to a note, some of the split chunks are different? If I just "upserted" the docs by their ids, I could quickly lose the context of the original text.
I therefore decided to keep track of any files I've added and, when posting an update, delete the existing entries before recalculating and inserting the new vectors.
I found this difficult and messy to do with just Vectorize. Some other vector databases have the required functionality, e.g. Pinecone - but it's ludicrously expensive for hobbyists. Luckily Cloudflare gives you easy access to a KV store that is both rock-solid and extremely cheap (never been charged for it in any of my toy projects to date).
So, after creating the embeddings, we store the ids of the embeddings under the key of the file name. In reverse, then, when we post a note we can look up the ids by file name and delete by ids (deleteByIds is a Vectorize function).
Adding the ids to KV:
await env.NOTES_AI_KV.put(filename, JSON.stringify(embeddingsArray.map(embedding => embedding.id)));
Deleting the entries by file name:
async function deleteByFilename(filename: string, env: Env) {
// If there are existing embeddings for this file, delete them
const existingIds: string[] = JSON.parse((await env.NOTES_AI_KV.get(filename)) ?? '[]') ?? [];
if (existingIds.length) {
await env.VECTORIZE_INDEX.deleteByIds(existingIds);
}
return existingIds;
}
Similarity Search: Finding Relevant Connections
So what happens when we query? First, we convert the query itself into a vector using the same embedding tool and model (i.e. OpenAI's text-embedding-ada-002). Then we use that vector to do a similarity search on our vector database.
This similarity is determined based on how close or far apart vectors are in the high-dimensional space. The result is a set of data points that are contextually similar to the query, not just textually.
const embedding = await openai.embeddings.create({
encoding_format: 'float',
input: query,
model: 'text-embedding-ada-002',
});
const vector = embedding?.data?.[0].embedding;
const similar = await env.VECTORIZE_INDEX.query(vector, {
  topK: 10, // the 10 nearest neighbours
  returnMetadata: true, // include stored metadata, so we get the original text back
});
Second Brain Endpoints
I want to call these functions from all sorts of places (which I'll talk about in a subsequent post), which is why I chose a Cloudflare Worker to host it. Endpoints are exposed to allow me to post notes, query notes, and delete notes by their file name.
/vectors (POST)
Functionality: Handles the addition of new data.
Process: Receives content and a filename, splits the content into manageable chunks, converts these chunks into embeddings, and stores them in the vector database.
/vectors/delete_by_filename (POST)
Functionality: Allows for deletion of data based on filename.
Process: When provided with a filename, it removes all associated embeddings from the vector database, ensuring that outdated or unwanted data is not retained.
/vectors/query (POST)
Functionality: Handles querying for information.
Process: Accepts a query, converts it into an embedding, and performs a similarity search in the vector database. It retrieves the most contextually relevant information based on the query.
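To illustrate how these fit together, here's a rough client-side sketch (the worker URL is a placeholder, and the NOTES_AI_API_KEY header is the simple shared-secret check described in the Code section below):
const WORKER_URL = 'https://second-brain.example.workers.dev'; // placeholder

const headers = {
  'Content-Type': 'application/json',
  'NOTES_AI_API_KEY': '<my secret key>',
};

// Add (or update) a note:
await fetch(`${WORKER_URL}/vectors`, {
  method: 'POST',
  headers,
  body: JSON.stringify({ filename: 'daily/2023-12-09.md', content: '...' }),
});

// Ask the second brain a question:
const res = await fetch(`${WORKER_URL}/vectors/query`, {
  method: 'POST',
  headers,
  body: JSON.stringify({ query: 'What are my urgent action items today?' }),
});

const { response } = await res.json() as { prompt: string; response: unknown };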
Application in the Second Brain
In the context of the second brain, this technology astounds me with its capability. When you ask a question or make a query, the system doesn’t just retrieve direct matches - it understands the context and essence of your query. It then uses similarity search to find and provide information that's contextually relevant. This means your interactions with the AI are more intuitive and insightful, as it brings forward information based on semantic understanding, not just keyword matching.
It's like ChatGPT, but tailored to you.
I'm finding it useful for all sorts of things, like upcoming deadlines, planning weekly tasks, etc. And I'm sure I'll find a lot more as it grows.
To get set up with Cloudflare Vectorize, follow their docs here: https://developers.cloudflare.com/vectorize/
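For context, the moving parts on the Cloudflare side are an index created with dimensions matching the embedding model, plus the bindings in wrangler.toml. A sketch, where the index and namespace names are placeholders of my own:
npx wrangler vectorize create notes-ai-index --dimensions=1536 --metric=cosine

# wrangler.toml
[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "notes-ai-index"

[[kv_namespaces]]
binding = "NOTES_AI_KV"
id = "<your KV namespace id>"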
Code
Here's the full worker code. I've deliberately kept it quite raw - not done any heavy refactoring - because I wanted to show exactly what is required without hiding any details. I think you'll be surprised at how simple it is.
Note that I added a little security by simply checking for a locally-defined key in the request headers. This is only a personal project, and it will keep the hordes out for now while I work on it.
import OpenAI from "openai";
import { splitFileIntoDocuments } from "./text-splitter";
export interface Env {
NOTES_AI_KV: KVNamespace;
NOTES_AI_API_KEY: string;
OPENAI_API_KEY: string;
VECTORIZE_INDEX: VectorizeIndex;
}
const DEFAULT_MODEL = 'gpt-3.5-turbo-1106';
export default {
async fetch(request: Request, env: Env): Promise<Response> {
if (request.headers.get('NOTES_AI_API_KEY') !== env.NOTES_AI_API_KEY) {
return new Response('Unauthorized', { status: 401 });
}
const openai = new OpenAI({
apiKey: env.OPENAI_API_KEY,
});
if (request.url.endsWith('/vectors') && request.method === 'POST') {
const body = (await request.json() as { content: string; filename: string; });
if (!body?.content || !body?.filename) {
return new Response('Missing content or filename', { status: 400 });
}
const { content, filename } = body;
const documents = await splitFileIntoDocuments(content, filename);
if (!documents.length) {
return new Response('No content found', { status: 400 });
}
const timestamp = Date.now();
let successful = true;
const embeddings = new Set<{ content: string, id: string, vector: number[] }>();
for (const [i, document] of documents.entries()) {
try {
const embedding = await openai.embeddings.create({
encoding_format: 'float',
input: document.pageContent,
model: 'text-embedding-ada-002',
});
const vector = embedding?.data?.[0].embedding;
if (!vector?.length) {
successful = false;
break;
}
embeddings.add({
content: document.pageContent,
id: `${filename}-${i}`,
vector,
});
} catch (e) {
successful = false;
break;
}
}
if (successful === false) {
return new Response('Could not create embeddings', { status: 500 });
}
// If there are existing embeddings for this file, delete them
await deleteByFilename(filename, env);
for (const embedding of embeddings) {
await env.VECTORIZE_INDEX.insert([{
id: embedding.id,
values: embedding.vector,
metadata: {
filename,
timestamp,
content: embedding.content,
},
}]);
}
const embeddingsArray = [...embeddings];
await env.NOTES_AI_KV.put(filename, JSON.stringify(embeddingsArray.map(embedding => embedding.id)));
return new Response(JSON.stringify({
embeddings: embeddingsArray.map(embedding => ({
filename,
timestamp,
id: embedding.id,
})),
}), { status: 200 });
}
if (request.url.endsWith('/vectors/delete_by_filename') && request.method === 'POST') {
const body = (await request.json() as { filename: string });
if (!body?.filename) {
return new Response('Missing filename', { status: 400 });
}
const { filename } = body;
const deleted = await deleteByFilename(filename, env);
return new Response(JSON.stringify({
deleted,
}), { status: 200 });
}
if (request.url.endsWith('/vectors/query') && request.method === 'POST') {
const body = (await request.json() as { model: string; query: string });
if (!body?.query) {
return new Response('Missing query', { status: 400 });
}
const { model = DEFAULT_MODEL, query } = body;
const embedding = await openai.embeddings.create({
encoding_format: 'float',
input: query,
model: 'text-embedding-ada-002',
});
const vector = embedding?.data?.[0].embedding;
if (!vector?.length) {
return new Response('Could not create embedding', { status: 500 });
}
const similar = await env.VECTORIZE_INDEX.query(vector, {
topK: 10,
returnMetadata: true,
});
const context = similar.matches.map((match) => `
Similarity: ${match.score}
Content:\n${(match as any).vector.metadata.content as string}
`).join('\n\n');
const prompt = `You are my second brain. You have access to things like my notes, meeting notes, some appointments.
In fact you're like a CEO's personal assistant (to me), who also happens to know everything that goes on inside their head.
Your job is to help me be more productive, and to help me make better decisions.
Use the following pieces of context to answer the question at the end.
If you really don't know the answer, just say that you don't know, don't try to make up an answer. But do try to give any
information that you think might be relevant.
----------------
${context}
----------------
Question:
${query}`;
try {
const chatCompletion = await openai.chat.completions.create({
model,
messages: [{ role: 'user', content: prompt }],
});
const response = chatCompletion.choices[0].message;
return new Response(JSON.stringify({
prompt,
response,
}), { status: 200 });
} catch (e) {
return new Response('Could not create completion', { status: 500 });
}
}
return new Response('Not found', { status: 404 });
},
};
async function deleteByFilename(filename: string, env: Env) {
// If there are existing embeddings for this file, delete them
const existingIds: string[] = JSON.parse((await env.NOTES_AI_KV.get(filename)) ?? '[]') ?? [];
if (existingIds.length) {
await env.VECTORIZE_INDEX.deleteByIds(existingIds);
}
return existingIds;
}
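One last note: the worker imports splitFileIntoDocuments from ./text-splitter, which is just the splitter snippet from earlier wrapped in a function. A minimal version of that module might look like this:
// text-splitter.ts
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

export async function splitFileIntoDocuments(content: string, fileName: string) {
  const splitter = new RecursiveCharacterTextSplitter({
    chunkOverlap: 200,
    chunkSize: 1536,
  });

  return splitter.createDocuments(
    [content],
    [{ fileName, timestamp: Date.now() }],
    {
      appendChunkOverlapHeader: true,
      chunkHeader: `FILE NAME: ${fileName || 'None.'}\n\n---\n\n`,
    },
  );
}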