RAG Tutorial: Exploring AnythingLLM and Vector Admin
raphiki
Posted on December 9, 2023
This tutorial is designed with a dual purpose in mind. Firstly, it introduces two highly innovative open-source projects from Mintplex Labs. These are AnythingLLM, an enterprise-grade solution engineered for the creation of custom ChatBots, inclusive of the RAG pattern, and Vector Admin, a sophisticated admin GUI for the effective management of multiple vectorstores.
The second aim of this tutorial is to guide you through the deployment of local models, specifically for text embedding and generation, as well as a vectorstore, all designed to integrate seamlessly with the aforementioned solutions. For this, we'll be utilizing LocalAI in conjunction with Chroma.
So, strap in and let's embark on this informative journey!
Installing the Chroma Vectorstore
The process begins with cloning the official repository and initiating the Docker container.
git clone https://github.com/chroma-core/chroma.git
cd chroma
docker compose up -d --build
To verify the availability of the vectorstore, we connect to its API documentation located at: http://localhost:8000/docs
Using this API, we proceed to create a new collection, aptly named 'playground'.
curl -X 'POST' 'http://localhost:8000/api/v1/collections?tenant=default_tenant&database=default_database' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "name": "playground", "get_or_create": false }'
Following this, we check the result to ensure proper setup.
curl http://localhost:8000/api/v1/collections
[
  {
    "name": "playground",
    "id": "0072058d-9a5b-4b96-8693-c314657365c6",
    "metadata": {
      "hnsw:space": "cosine"
    },
    "tenant": "default_tenant",
    "database": "default_database"
  }
]
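The same check can also be scripted rather than run through curl. Here is a minimal sketch using the chromadb Python client, assuming it is installed (pip install chromadb) and roughly matches the server version:

import chromadb

# Minimal sketch: connect to the Chroma server started above and list its collections.
# Assumes the chromadb client version is compatible with the server's API.
client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # returns a nanosecond timestamp if the server is up
for collection in client.list_collections():
    print(collection.name, collection.id)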
Implementation of LocalAI
Next, our focus shifts to establishing the LocalAI Docker container.
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
docker compose up -d --pull always
Once the container is operational, we embark on downloading, installing, and testing two specific models.
Our first model is a BERT-based sentence-transformers embedding model, MiniLM-L6, installed from the LocalAI model gallery as bert-embeddings.
curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "model-gallery@bert-embeddings" }'

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{ "input": "The food was delicious and the waiter...",
        "model": "bert-embeddings" }'
{
  "created": 1702050873,
  "object": "list",
  "id": "b11eba4b-d65f-46e1-8b50-38d3251e3b52",
  "model": "bert-embeddings",
  "data": [
    {
      "embedding": [
        -0.043848168,
        0.067443006,
        ...
        0.03223838,
        0.013112408,
        0.06982294,
        -0.017132297,
        -0.05828256
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
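Notice that our 'playground' collection is configured for cosine distance (the hnsw:space setting seen earlier). To build an intuition for what the vectorstore will do with these embeddings, here is a minimal sketch that compares two of them by cosine similarity; the example sentences and the requests dependency are assumptions of mine, not part of the original setup:

import math
import requests

# Minimal sketch: fetch two embeddings from LocalAI and compare them with
# cosine similarity, the same metric our Chroma collection is configured with.
def embed(text):
    response = requests.post(
        "http://localhost:8080/v1/embeddings",
        json={"input": text, "model": "bert-embeddings"},
    )
    return response.json()["data"][0]["embedding"]

a = embed("The food was delicious and the waiter was friendly.")
b = embed("Great meal, attentive service.")  # example sentences are illustrative
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
print(f"cosine similarity: {dot / norm:.3f}")  # closer to 1.0 means more similar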
Subsequently, we explore the LLM: Zephyr-7B-β from Hugging Face, a fine-tuned version of the Mistral 7B foundation model.
curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "huggingface@thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf",
        "name": "zephyr-7b-beta" }'

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "zephyr-7b-beta",
        "messages": [{
          "role": "user",
          "content": "Why is the Earth round?"}],
        "temperature": 0.9 }'
{
  "created": 1702050808,
  "object": "chat.completion",
  "id": "67620f7e-0bc0-4402-9a21-878e4c4035ce",
  "model": "thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "\nThe Earth appears round because it is actually a spherical body. This shape is a result of the gravitational forces acting upon it from all directions. The force of gravity pulls matter towards the center of the Earth, causing it to become more compact and round in shape. Additionally, the Earth's rotation causes it to bulge slightly at the equator, further contributing to its roundness. While the Earth may appear flat from a distance, up close it is clear that our planet is indeed round."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
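Since LocalAI exposes an OpenAI-compatible API, the official openai Python SDK works against it as well. A minimal sketch, assuming openai>=1.0 is installed; the api_key value is a dummy, as LocalAI does not check it by default:

from openai import OpenAI

# Minimal sketch: point the OpenAI SDK at LocalAI's compatible endpoint.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")  # dummy key
completion = client.chat.completions.create(
    model="zephyr-7b-beta",
    messages=[{"role": "user", "content": "Why is the Earth round?"}],
    temperature=0.9,
)
print(completion.choices[0].message.content)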
Deploying and Setting Up AnythingLLM
Having successfully installed the models and vectorstore, our next step is to deploy the AnythingLLM application. For this, we will utilize the official Docker image provided by Mintplex Labs.
docker pull mintplexlabs/anythingllm:master
docker run -p 3001:3001 mintplexlabs/anythingllm:master
The application is then accessible at http://localhost:3001, where we can begin the configuration process using the intuitive GUI.
In the configuration, we opt for the LocalAI backend, accessible via the http://172.17.0.1:8080/v1 URL, and integrate the Zephyr model. It's noteworthy that AnythingLLM also supports other backends such as OpenAI, Azure OpenAI, Anthropic Claude 2, and the locally available LM Studio.
Following this, we align our embedding model with the same LocalAI backend, ensuring a cohesive system.
Next, we select the Chroma vector database, using the URL http://172.17.0.1:8000. It’s important to mention that AnythingLLM is also compatible with other vectorstores such as Pinecone, QDrant, Weaviate, and LanceDB.
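Note that 172.17.0.1 is the default Docker bridge gateway on Linux, which is what lets the AnythingLLM container reach services published on the host. Before saving the configuration, a quick reachability check can spare some debugging; a minimal sketch, run from the host, where the same ports are published on localhost:

import requests

# Minimal sketch: confirm LocalAI and Chroma answer before wiring up AnythingLLM.
print(requests.get("http://localhost:8080/v1/models").json())         # LocalAI model list
print(requests.get("http://localhost:8000/api/v1/heartbeat").json())  # Chroma heartbeat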
Customization options for AnythingLLM include the possibility of adding a logo to personalize the instance. However, for the sake of simplicity in this tutorial, we will skip this step. Similarly, while there are options for configuring user and rights management, we will proceed with a streamlined, single-user setup.
We then proceed to create a workspace, aptly named "Playground," reflecting the name of our earlier Chroma collection.
The AnythingLLM start page is designed to offer initial instructions to the user in a chat-like interface, with the flexibility to tailor this content to specific needs.
From our "Playground" workspace, we can upload documents, further expanding the capabilities of our setup.
We monitor the logs to confirm that AnythingLLM is effectively inserting the corresponding vectors into Chroma.
Adding new vectorized document into namespace playground
Chunks created from document: 4
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-3163747c-a2e1-459c-92e4-b9ec8a6de366.json to prevent duplicated embedding.
Adding new vectorized document into namespace playground
Chunks created from document: 8
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-f8dfa1c0-82d3-48c3-bac4-ceb2693a0fa8.json to prevent duplicated embedding.
This functionality enables us to engage in interactive dialogues with the documents.
An interesting feature of AnythingLLM is its ability to display the content that forms the basis of its responses.
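Under the hood, this grounding is the retrieval half of the RAG pattern: the question is embedded and matched against the stored chunks. The lookup can be reproduced by hand against Chroma; here is a minimal sketch, in which the example question and the chromadb/requests dependencies are assumptions of mine:

import chromadb
import requests

# Minimal sketch: reproduce AnythingLLM's retrieval step manually.
# Embed a question with LocalAI, then ask Chroma for the nearest chunks.
def embed(text):
    response = requests.post(
        "http://localhost:8080/v1/embeddings",
        json={"input": text, "model": "bert-embeddings"},
    )
    return response.json()["data"][0]["embedding"]

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_collection("playground")
results = collection.query(query_embeddings=[embed("What is TechSquad?")], n_results=3)
for document, distance in zip(results["documents"][0], results["distances"][0]):
    print(f"{distance:.3f}  {document[:80]}")  # lower cosine distance = closer match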
In conclusion, AnythingLLM and each workspace within it offer a range of configurable parameters. These include the system prompt, response temperature, chat history, and the threshold for document similarity, among others, allowing for a customized and efficient user experience.
Installing and Configuring Vector Admin
To complete our architecture, we now focus on installing the Vector Admin GUI, which serves as a powerful tool for visualizing and managing the vectors stored by AnythingLLM in Chroma.
The installation process involves utilizing Docker containers provided by Mintplex Labs: one for the Vector Admin application and another for a PostgreSQL database, which stores the application's configuration and chat history.
git clone https://github.com/Mintplex-Labs/vector-admin.git
cd vector-admin/docker/
cp .env.example .env
We modify the .env file to adjust the server port from 3001 to 3002, avoiding a conflict with the port already in use by AnythingLLM. On Linux systems, it is also necessary to set the default Docker gateway IP address in the PostgreSQL connection string.
SERVER_PORT=3002
DATABASE_CONNECTION_STRING="postgresql://vectoradmin:password@172.17.0.1:5433/vdbms"
Additionally, we configure the SYS_EMAIL and SYS_PASSWORD variables to define credentials for the first GUI connection.
Given the change in the default port, we also reflect this modification in both the docker-compose.yaml and Dockerfile files.
After configuring the backend, we turn our attention to the frontend installation.
cd ../frontend/
cp .env.example .env.production
In the .env.production file, we point the API base URL at the Docker gateway address and the new port.
GENERATE_SOURCEMAP=false
VITE_API_BASE="http://172.17.0.1:3002/api"
With these settings in place, we build and launch the Docker containers.
docker compose up -d --build vector-admin
Accessing the GUI is straightforward via http://localhost:3002. The initial connection utilizes the SYS_EMAIL and SYS_PASSWORD values specified in the .env file. These credentials are only required for the first login, to create a primary admin user from the GUI and start configuring the tool.
The first step in the GUI is to create an organization, followed by establishing a Vector Database Connection. For the database type, we select Chroma, although Pinecone, QDrant, and Weaviate are also compatible options.
After synchronizing workspace data, the documents and vectors stored in the "playground" collection within Chroma become visible.
Details of these vectors are also accessible for in-depth analysis.
A note on functionality: editing vector content directly in Vector Admin currently relies on OpenAI's embedding model. Since we opted for the s-BERT MiniLM model, this capability is not available to us. Had we chosen OpenAI's model, we could also upload new documents and embed their vectors into Chroma directly from Vector Admin.
Vector Admin also boasts additional features such as user management and advanced tools, including automatic drift detection in similarity searches, upcoming snapshot support, and migration capabilities between organizations (and, by extension, vectorstores).
This tool is particularly valuable for the full control it grants over vectors, simplifying their management considerably.
That concludes our exploration for today. As demonstrated, Mintplex Labs' tools, AnythingLLM and Vector Admin, facilitate the straightforward setup of the RAG pattern, empowering users to interact with documents conversationally. These projects are actively evolving, with new features on the horizon. Therefore, it is worthwhile to regularly check their roadmap and begin leveraging these tools to engage with your files interactively.