From Local AI to Enterprise-grade deployment with BionicGPT

Welcome to the second article in my Chat with Your Content series. Today we look at BionicGPT, a project that could just as easily feature in the Bringing AI Home series, since it can be deployed locally on a desktop system or hosted in the Cloud.

Introducing BionicGPT

BionicGPT is a recent open-source project hosted on GitHub and licensed under Apache 2 and MIT. A company backs it and offers support and consulting services.

Developed in Rust, BionicGPT promises safety and performance. Despite being relatively new, its robust architecture and industrial-grade features stand out. It builds upon the Rust on Nails framework, initially created at Airbus, which offers scalability via Kubernetes and embraces an infrastructure-as-code approach.

[Figure: BionicGPT's architecture, adapted from Rust on Nails, as shown on the project's official website]

The system integrates multiple open-source components running in Docker containers to form a Large Language Model pipeline, complete with a custom user interface. It uses LocalAI for local LLM deployments and PgVector for vector storage.

Setting Up on My Laptop

BionicGPT's documentation indicates compatibility with 16GB RAM laptops, matching my setup. The installation process, streamlined through Docker, involves downloading a compose file and running it.



curl -O https://raw.githubusercontent.com/purton-tech/bionicgpt/main/docker-compose.yml
docker compose up

This step might take some time as it downloads necessary Docker images, including a quantized LLaMA 2 7B default model.



➜  bionic-gpt docker compose up
[+] Running 9/0
 ✔ Container bionic-gpt-db-1              Created
 ✔ Container bionic-gpt-embeddings-api-1  Created
 ✔ Container bionic-gpt-unstructured-1    Created
 ✔ Container bionic-gpt-envoy-1           Created
 ✔ Container bionic-gpt-llm-api-1         Created
 ✔ Container bionic-gpt-migrations-1      Created
 ✔ Container bionic-gpt-barricade-1       Created
 ✔ Container bionic-gpt-embeddings-job-1  Created
 ✔ Container bionic-gpt-app-1             Created
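As a rough check on why a 16GB laptop can host the default model, the arithmetic below estimates the size of the weights alone. It is only a sketch: the article does not state the quantization level, so the 4-bit figure is an assumption, and runtime overhead (context cache, the other containers) is ignored.

```python
def model_weight_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# 7 billion parameters, as in LLaMA 2 7B.
N_PARAMS = 7e9

# 4-bit is an assumed quantization level, used here only for illustration.
print(f"4-bit weights:  {model_weight_gb(N_PARAMS, 4):.1f} GB")

# Unquantized 16-bit weights, for comparison.
print(f"16-bit weights: {model_weight_gb(N_PARAMS, 16):.1f} GB")
```

Under that assumption the quantized weights take a few gigabytes, while the unquantized model alone would nearly fill the laptop's memory.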

After the setup, accessing the Web console at http://localhost:7800 allows for the creation of an admin user.
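Since the containers can take a while to start, a small script can wait for the console port before you open the browser. This is a minimal sketch using only the Python standard library; the helper names `port_is_open` and `wait_for_port` are mine, and an open TCP port only shows that something is listening, not that the application has finished initializing.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the port up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False


# Example usage (not executed here, since it blocks while polling):
# if wait_for_port("localhost", 7800):
#     print("Console reachable at http://localhost:7800")
```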

Navigating the UI

The chat is usable right away, though response times depend on your computer's specifications.

With that first test passed, let's step back and understand some key elements of BionicGPT.

The UI lets you organize users into Teams with different permission levels: System Administrator, Team Administrator, and Team Collaborator.

Within Teams, users can craft prompts, link them to models with custom settings, and associate them with datasets. This enables Retrieval-Augmented Generation (RAG): uploaded documents are indexed, converted into vectors with the default embedding model (BGE Small EN v1.5), and stored in PostgreSQL/PgVector.
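To make the retrieval step concrete, here is a toy sketch of the general RAG idea, not BionicGPT's actual implementation. Everything in it is an assumption for illustration: the 3-dimensional vectors stand in for real embeddings, an in-memory list stands in for PgVector, cosine similarity is one common ranking choice, and the prompt template is made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored chunks most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Prepend the retrieved chunks to the question (hypothetical template)."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy vectors; a real embedding model produces far larger ones.
store = [
    ("chunk about squads", [0.9, 0.1, 0.0]),
    ("chunk about billing", [0.0, 0.2, 0.9]),
    ("chunk about working groups", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the user's question

chunks = top_k(query_vec, store, k=2)
print(build_prompt("What is the TechSquad?", chunks))
```

The two chunks closest to the query end up in the prompt, which is what lets the model answer from the uploaded documents rather than from its training data alone.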

I tested this by uploading documents to the TechSquad dataset and creating a prompt named Chat with docs. The resulting contextualized answers in the chat console were impressive.

Leveraging API Endpoints

BionicGPT also enables the creation of API endpoints, which can be used in applications like chatbots. These endpoints are linked to specific prompts and require API keys for access.

Using curl, I tested the Chat with docs prompt:



curl http://localhost:7800/v1/chat/completions  \
  -H "Content-Type: application/json"  \
  -H "Authorization: Bearer <API-key-here>" \
  -d '{ "model": "llama-2", "messages": [{"role": "user", "content": "What is the TechSquad?"}] }'



{
    "id": "cmpl-25c0d2f4-ab74-47f8-a76c-7d9319658e1a",
    "object": "chat.completion",
    "model": "Llama-2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": " The TechSquad is an initiative at 
Worldline that aims to empower tech experts within the company to 
voice their expertise and collaborate with other teams. It is 
composed of a core team and seven working groups called squads, 
each focused on a specific area of technology. The TechSquad 
initiative provides various channels and clubs for employees to 
share their expertise, contribute to the company's knowledge base, 
and build relationships within the organization. Its goal is to 
promote coherence, communication, and alignment within Worldline, 
while fostering innovation and supporting business functions."
            },
            "finish_reason": "Length"
        }
    ],
    "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
    }
}
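An application consuming this response mostly needs the assistant's text and the finish reason. The sketch below pulls both out of a response shaped like the one above; the sample is abbreviated from that output, and `extract_answer` is a hypothetical helper name. Note that the usage counters are all zero in the response I received, so I would not rely on them for token accounting.

```python
import json

# Abbreviated copy of the response shown above.
SAMPLE_RESPONSE = """
{
    "object": "chat.completion",
    "model": "Llama-2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": " The TechSquad is an initiative at Worldline."
            },
            "finish_reason": "Length"
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}
"""


def extract_answer(raw: str):
    """Return (answer text, finish reason) from a chat-completion JSON string."""
    data = json.loads(raw)
    choice = data["choices"][0]
    return choice["message"]["content"].strip(), choice["finish_reason"]


answer, reason = extract_answer(SAMPLE_RESPONSE)
print(answer)
print("finish_reason:", reason)
```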

The design choice to require a model parameter, even though the API key is already tied to a prompt, follows the OpenAI specification. This allows seamless integration with tools built for or compatible with OpenAI LLMs, such as Flowise or LibreChat.
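Because the endpoint follows the OpenAI chat-completions shape, the curl call translates directly to any HTTP client. Here is a minimal sketch with Python's standard library; `build_chat_request` is a name I made up, the values mirror the curl example, and the actual send is commented out so the snippet runs without a live instance.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, question: str):
    """Build (without sending) a POST request to the chat-completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request(
    "http://localhost:7800", "<API-key-here>", "llama-2", "What is the TechSquad?"
)
print(req.get_method(), req.full_url)

# To send it against a running instance with a real API key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```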

I further experimented with the lightweight ChatGPT Lite frontend, configuring it to interact with BionicGPT.



git clone https://github.com/blrchen/chatgpt-lite.git
cd chatgpt-lite
npm install
cp .env.example .env.local

Content of the .env.local file:



OPENAI_API_KEY="<API-key-here>"
OPENAI_API_BASE_URL="http://localhost:7800"
OPENAI_MODEL="llama-2"
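To see how these three settings relate to the curl test, the sketch below parses the same lines and derives the URL an OpenAI-style client would call. How ChatGPT Lite actually composes the URL is an assumption on my part, inferred from the fact that the base URL here is the curl URL without its `/v1/chat/completions` path; `parse_env` is a simplified, hypothetical parser, not a full dotenv implementation.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


ENV_TEXT = '''
OPENAI_API_KEY="<API-key-here>"
OPENAI_API_BASE_URL="http://localhost:7800"
OPENAI_MODEL="llama-2"
'''

settings = parse_env(ENV_TEXT)

# Assumed composition: base URL plus the standard chat-completions path.
endpoint = settings["OPENAI_API_BASE_URL"].rstrip("/") + "/v1/chat/completions"
print(endpoint)
print("model:", settings["OPENAI_MODEL"])
```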

The chat interface is then served on http://localhost:3000.

Other models can be installed as long as they are supported by LocalAI.

Conclusion

BionicGPT, still evolving, shows promise with upcoming features like model fine-tuning and S3 storage for documents. My initial tests on a laptop were successful, and the next steps before production should involve deploying on a Kubernetes infrastructure with added observability tools.

raphiki
