RAG Tutorial: Exploring AnythingLLM and Vector Admin

raphiki

Posted on December 9, 2023

This tutorial has two goals. First, it introduces two open-source projects from Mintplex Labs: AnythingLLM, an enterprise-grade solution for building custom chatbots, including the RAG pattern, and Vector Admin, an admin GUI for managing multiple vectorstores.

The second goal is to walk you through deploying local models for text embedding and generation, along with a vectorstore, all of which integrate with these two tools. For this, we'll use LocalAI together with Chroma.

So, strap in and let's embark on this informative journey!

Installing the Chroma Vectorstore

We start by cloning the official repository and starting the Docker container.



git clone https://github.com/chroma-core/chroma.git
cd chroma
docker compose up -d --build



To verify that the vectorstore is available, we open its API documentation at http://localhost:8000/docs

Chroma OpenAPI

Using this API, we create a new collection named 'playground'.



curl -X 'POST' 'http://localhost:8000/api/v1/collections?tenant=default_tenant&database=default_database' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "name": "playground", "get_or_create": false}'



We then list the collections to check that the new one was created.



curl http://localhost:8000/api/v1/collections

[
  {
    "name": "playground",
    "id": "0072058d-9a5b-4b96-8693-c314657365c6",
    "metadata": {
      "hnsw:space": "cosine"
    },
    "tenant": "default_tenant",
    "database": "default_database"
  }
]
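
For readers who prefer to script these checks, the two calls above can be sketched in Python with only the standard library. This is a minimal sketch, assuming the Chroma v1 REST API used above is reachable at http://localhost:8000. The helper names (`create_collection_request`, `list_collections_request`, `send`) are hypothetical and not part of any Chroma client.

```python
import json
import urllib.request

CHROMA_URL = "http://localhost:8000"  # same address as in the curl calls above


def create_collection_request(name, get_or_create=False, base_url=CHROMA_URL):
    """Build (but do not send) the POST request that creates a collection."""
    url = (f"{base_url}/api/v1/collections"
           "?tenant=default_tenant&database=default_database")
    body = json.dumps({"name": name, "get_or_create": get_or_create})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"accept": "application/json",
                 "Content-Type": "application/json"},
    )


def list_collections_request(base_url=CHROMA_URL):
    """Build the GET request that lists the existing collections."""
    return urllib.request.Request(f"{base_url}/api/v1/collections", method="GET")


def send(request):
    """Send a request and decode the JSON reply (needs a running Chroma)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage against a running server (hypothetical helper names):
#   send(create_collection_request("playground"))
#   for collection in send(list_collections_request()):
#       print(collection["name"], collection["id"])
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before anything touches the server.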



Implementation of LocalAI

Next, we set up the LocalAI Docker container.



git clone https://github.com/go-skynet/LocalAI
cd LocalAI
docker compose up -d --pull always



Once the container is running, we download, install, and test two models.

Our first model is MiniLM L6, a sentence-transformers (s-BERT) embedding model.



curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "model-gallery@bert-embeddings" }'

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{ "input": "The food was delicious and the waiter...",
        "model": "bert-embeddings" }'

{
  "created": 1702050873,
  "object": "list",
  "id": "b11eba4b-d65f-46e1-8b50-38d3251e3b52",
  "model": "bert-embeddings",
  "data": [
    {
      "embedding": [
        -0.043848168,
        0.067443006,
    ...
        0.03223838,
        0.013112408,
        0.06982294,
        -0.017132297,
        -0.05828256
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
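
Embeddings like the one returned above are usually compared with cosine similarity, which is also the "hnsw:space" reported for our Chroma collection. The following is a minimal sketch, assuming the LocalAI endpoint and `bert-embeddings` model name from the calls above. The helper names are hypothetical.

```python
import json
import math
import urllib.request

LOCALAI_URL = "http://localhost:8080"  # same address as in the curl calls above


def embedding_request(text, model="bert-embeddings", base_url=LOCALAI_URL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"input": text, "model": model})
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def extract_embedding(response):
    """Return the first vector from a decoded embeddings response."""
    return response["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Two sentences with similar meaning should yield vectors whose cosine similarity is close to 1, while unrelated sentences should score noticeably lower.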



The second model is the LLM: Zephyr-7B-β from Hugging Face, a fine-tuned version of the Mistral 7B base model.



curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "huggingface@thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf",
        "name": "zephyr-7b-beta" }'

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "zephyr-7b-beta",
        "messages": [{
          "role": "user",
          "content": "Why is the Earth round?"}],
        "temperature": 0.9 }'

{
  "created": 1702050808,
  "object": "chat.completion",
  "id": "67620f7e-0bc0-4402-9a21-878e4c4035ce",
  "model": "thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "\nThe Earth appears round because it is
actually a spherical body. This shape is a result of the 
gravitational forces acting upon it from all directions. The force 
of gravity pulls matter towards the center of the Earth, causing 
it to become more compact and round in shape. Additionally, the 
Earth's rotation causes it to bulge slightly at the equator, 
further contributing to its roundness. While the Earth may appear 
flat from a distance, up close it is clear that our planet is 
indeed round."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
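
Because LocalAI exposes an OpenAI-style chat endpoint, the same exchange can be scripted. Below is a minimal sketch, assuming the endpoint and the `zephyr-7b-beta` model name used above. `chat_request` and `first_answer` are hypothetical helper names.

```python
import json
import urllib.request

LOCALAI_URL = "http://localhost:8080"  # same address as in the curl calls above


def chat_request(question, model="zephyr-7b-beta", temperature=0.9,
                 base_url=LOCALAI_URL):
    """Build (but do not send) a single-turn chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def first_answer(response):
    """Return the assistant text of the first choice, without outer whitespace."""
    return response["choices"][0]["message"]["content"].strip()


# Example usage against a running LocalAI instance:
#   with urllib.request.urlopen(chat_request("Why is the Earth round?")) as r:
#       print(first_answer(json.load(r)))
```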



Deploying and Setting Up AnythingLLM

With the models and vectorstore installed, our next step is to deploy the AnythingLLM application using the official Docker image provided by Mintplex Labs.



docker pull mintplexlabs/anythingllm:master
docker run -p 3001:3001 mintplexlabs/anythingllm:master



We open the application at http://localhost:3001 and start the configuration process in the GUI.

In the configuration, we choose the LocalAI backend at http://172.17.0.1:8080/v1 and select the Zephyr model. We use 172.17.0.1, the default Docker gateway address on Linux, rather than localhost because AnythingLLM runs in its own container and must reach LocalAI through the host. AnythingLLM also supports other backends such as OpenAI, Azure OpenAI, Anthropic Claude 2, and the locally available LM Studio.

AnythingLLM LLM Preference

We then configure the embedding model on the same LocalAI backend.

AnythingLLM Embedding Preference

Next, we select the Chroma vector database, using the URL http://172.17.0.1:8000. AnythingLLM is also compatible with other vectorstores such as Pinecone, Qdrant, Weaviate, and LanceDB.

Vector Database

Customization options for AnythingLLM include the possibility of adding a logo to personalize the instance. However, for the sake of simplicity in this tutorial, we will skip this step. Similarly, while there are options for configuring user and rights management, we will proceed with a streamlined, single-user setup.

We then create a workspace named "Playground," matching the name of our earlier Chroma collection.

Create Workspace

The AnythingLLM start page is designed to offer initial instructions to the user in a chat-like interface, with the flexibility to tailor this content to specific needs.

Start page

From our "Playground" workspace, we can upload the documents we want to chat with.

Upload Documents

We monitor the logs to confirm that AnythingLLM is effectively inserting the corresponding vectors into Chroma.



Adding new vectorized document into namespace playground
Chunks created from document: 4
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-3163747c-a2e1-459c-92e4-b9ec8a6de366.json to prevent duplicated embedding.
Adding new vectorized document into namespace playground
Chunks created from document: 8
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-f8dfa1c0-82d3-48c3-bac4-ceb2693a0fa8.json to prevent duplicated embedding.
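
To cross-check the logs against the vectorstore, one option is to sum the chunk counts from the log output and compare the total with the number of vectors Chroma reports. This is a minimal sketch: the `count` route is an assumption based on the Chroma v1 API (confirm it on the /docs page of your version), and the helper names are hypothetical.

```python
import json
import re
import urllib.request

CHROMA_URL = "http://localhost:8000"  # same address as in the curl calls earlier


def total_chunks(log_text):
    """Sum the chunk counts reported in AnythingLLM's log output."""
    counts = re.findall(r"Chunks created from document: (\d+)", log_text)
    return sum(int(count) for count in counts)


def count_url(collection_id, base_url=CHROMA_URL):
    """URL of the count route (assumed; check the Chroma /docs page)."""
    return f"{base_url}/api/v1/collections/{collection_id}/count"


def vector_count(collection_id, base_url=CHROMA_URL):
    """Ask Chroma how many vectors a collection holds (needs a running server)."""
    with urllib.request.urlopen(count_url(collection_id, base_url)) as response:
        return json.load(response)
```

For the two uploads logged above, `total_chunks` returns 12 (4 + 8), so a collection that was empty beforehand should now hold 12 vectors. The collection id is the "id" value returned when listing collections.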



Once the documents are embedded, we can chat with them interactively.

AnythingLLM Chat

An interesting feature of AnythingLLM is its ability to display the content that forms the basis of its responses.

AnythingLLM Citation

Finally, AnythingLLM and each workspace within it offer a range of configurable parameters, including the system prompt, response temperature, chat history, and the document similarity threshold.

AnythingLLM Settings

Workspace Settings
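
To illustrate how parameters like these typically interact in a RAG pipeline, here is a generic sketch. It is not AnythingLLM's actual implementation: the function names, the default threshold of 0.25, and the `top_k` limit are all hypothetical values chosen for illustration.

```python
def select_context(scored_chunks, threshold=0.25, top_k=4):
    """Keep the best-scoring chunks whose similarity meets the threshold.

    scored_chunks is a list of (similarity, text) pairs.
    """
    kept = [(score, text) for score, text in scored_chunks if score >= threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in kept[:top_k]]


def build_messages(system_prompt, history, context_chunks, question):
    """Assemble chat messages: system prompt plus context, history, question."""
    if context_chunks:
        context = "\n\n".join(context_chunks)
        system = f"{system_prompt}\n\nContext:\n{context}"
    else:
        system = system_prompt
    return [
        {"role": "system", "content": system},
        *history,
        {"role": "user", "content": question},
    ]
```

A higher threshold discards loosely related chunks at the risk of leaving the model with no context, while a lower one passes more text to the model and may dilute the answer.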

Installing and Configuring Vector Admin

To complete our architecture, we now focus on installing the Vector Admin GUI, which serves as a powerful tool for visualizing and managing the vectors stored by AnythingLLM in Chroma.

The installation uses Docker containers provided by Mintplex Labs: one for the Vector Admin application and another for a PostgreSQL database, which stores the application's configuration and chat history.



git clone https://github.com/Mintplex-Labs/vector-admin.git
cd vector-admin/docker/
cp .env.example .env



We modify the .env file to adjust the server port from 3001 to 3002, avoiding a conflict with the port already in use by AnythingLLM. On Linux systems, it is also necessary to set the default Docker gateway IP address for the PostgreSQL connection string.



SERVER_PORT=3002
DATABASE_CONNECTION_STRING="postgresql://vectoradmin:password@172.17.0.1:5433/vdbms"
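
A typo in the connection string is easy to miss, so it can help to split it into its parts and check that the port answers before starting the containers. This is a minimal sketch using only the standard library. `describe_connection` and `port_is_open` are hypothetical helper names.

```python
import socket
from urllib.parse import urlsplit


def describe_connection(connection_string):
    """Split a PostgreSQL URL into its components (password omitted)."""
    parts = urlsplit(connection_string)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage:
#   info = describe_connection(
#       "postgresql://vectoradmin:password@172.17.0.1:5433/vdbms")
#   print(info, port_is_open(info["host"], info["port"]))
```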



Additionally, we configure the SYS_EMAIL and SYS_PASSWORD variables to define credentials for the first GUI connection.

Given the change in the default port, we also reflect this modification in both the docker-compose.yaml and Dockerfile.

After configuring the backend, we turn our attention to the frontend installation.



cd ../frontend/
cp .env.example .env.production



In the .env.production file, we update the API base URL to use the Docker gateway address and the new port.



GENERATE_SOURCEMAP=false
VITE_API_BASE="http://172.17.0.1:3002/api"



With these settings in place, we return to the docker directory, which holds the docker-compose.yaml file, then build and launch the Docker containers.



cd ../docker/
docker compose up -d --build vector-admin



The GUI is available at http://localhost:3002. The first login uses the SYS_EMAIL and SYS_PASSWORD values specified in the .env file. These credentials are only needed once, to create a primary admin user from the GUI and start configuring the tool.

The first step in the GUI is to create an organization, followed by a Vector Database Connection. For the database type, we select Chroma, although Pinecone, Qdrant, and Weaviate are also supported.

Vector Database Connection

After synchronizing workspace data, the documents and vectors stored in the "playground" collection within Chroma become visible.

Workspace Data

Details of these vectors are also accessible for in-depth analysis.

Vector Details

A note on functionality: editing vector content directly in Vector Admin is currently limited because it relies on OpenAI's embedding model. Since we opted for s-BERT MiniLM, this capability is not available. Had we chosen OpenAI's model, we could have uploaded new documents and embedded vectors directly into Chroma.

Vector Admin also offers additional features such as user management and advanced tools, including automatic drift detection in similarity searches, upcoming snapshots, and migration capabilities between organizations (and, by extension, vectorstores).

This tool is particularly admirable for its capacity to grant full control over vectors, simplifying their management considerably.


That concludes our exploration for today. As demonstrated, Mintplex Labs' tools, AnythingLLM and Vector Admin, facilitate the straightforward setup of the RAG pattern, empowering users to interact with documents conversationally. These projects are actively evolving, with new features on the horizon. Therefore, it is worthwhile to regularly check their roadmap and begin leveraging these tools to engage with your files interactively.
