Zerops Devs
Posted on August 22, 2024
Vector databases are revolutionizing how data is managed and stored for AI applications. At Zerops, we recognized the growing importance of vector databases, leading us to integrate Qdrant, one of the most popular options available. While it might seem straightforward to spin up a Qdrant instance using a Docker container, the reality of managing a production-ready vector database is far more complex. In this article, we'll explore the intricacies of Qdrant implementation and how we've addressed the challenges at Zerops.
Understanding Data in Qdrant
In Qdrant, data is organized into collections, which are essentially named sets of vectors. Each collection contains vectors that share the same dimensionality and are compared using a specific metric. Collections in Qdrant are further divided into shards—independent stores of vectors that can handle all the operations provided by collections.
This architecture allows for distributing the workload across multiple Qdrant nodes in a cluster, enhancing performance and reliability. To ensure high availability (HA), shards can be replicated across nodes. Understanding this structure is crucial for grasping the complexities of deployment and management.
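To make this concrete, the sketch below creates a collection through Qdrant's REST API, choosing the vector dimensionality and distance metric and, for a clustered setup, the shard and replica counts. The host, collection name, and numbers are illustrative, not Zerops-specific values:

```bash
# Create a collection of 384-dimensional vectors compared with cosine
# similarity, split into 2 shards with 2 copies of each shard.
curl -X PUT http://localhost:6333/collections/articles \
  -H 'Content-Type: application/json' \
  -d '{
        "vectors": { "size": 384, "distance": "Cosine" },
        "shard_number": 2,
        "replication_factor": 2
      }'
```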
Deployment Options: Balancing Simplicity and Reliability
At Zerops, we offer support for both HA and non-HA modes of Qdrant, giving you the flexibility to choose the configuration that best suits your needs.
Non-HA Mode: Simplicity for Non-Production Projects
The Non-HA mode is perfect for non-production projects where data persistence isn't critical. This setup involves a single Qdrant node, making it easy to install and manage. In Zerops, deploying Qdrant in Non-HA mode is straightforward:
- We run the Qdrant binary in an Ubuntu container.
- Port `6333` is opened for communication.
Even if an outage occurs and data is lost, our backup solution keeps disruption to a minimum.
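A quick way to confirm such an instance is reachable is to query the REST API on that port, for example by listing collections (the hostname is a placeholder):

```bash
# An empty collections list on a fresh single-node instance confirms
# that the REST API on port 6333 is up and reachable.
curl http://<service_hostname>:6333/collections
```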
HA Cluster: Reliability for Production Environments
For production environments where reliability and data availability are crucial, the HA cluster mode is essential. Setting up an HA cluster can be complex, but we've streamlined the process in Zerops:
- Cluster Configuration: We enable cluster mode by adding `enabled: true` to the configuration file and configure peer-to-peer communication on port `6335`:

  ```yaml
  cluster:
    enabled: true
    p2p:
      port: 6335
  ```
- Node Setup: Building a cluster requires careful configuration of each node:
  - The first node runs with the `--uri <uri>` flag so other peers know how to reach it.
  - We make the address of the first peer, `node1.db.<service_name>.zerops`, available in local DNS.
  - All other peers start with the `--bootstrap <cluster_uri>` flag to locate the rest of the cluster.
- High Availability: Qdrant uses the Raft consensus protocol, which requires more than 50% of the nodes to be functional. We automatically set up 3 nodes per cluster to meet this requirement.
- Automatic Replication: By default, we automatically create replicas of any collection across all nodes. This safeguards against data loss if a node fails and acts as a safety net for an incorrectly configured `replication_factor`. If desired, this feature can be disabled by setting the `automaticClusterReplication` parameter to `false`.
- Failure Recovery: If a node fails, we automatically start a new node, connect it to the cluster, and create replicas of each collection and shard on the new node.
- Node Cleanup: We streamline the removal of failed nodes by using the Qdrant API to identify the current cluster leader, which then handles dropping the failed peer from the cluster, as sketched below.
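To illustrate that cleanup flow, here is a minimal sketch against Qdrant's public cluster API, assuming the in-cluster hostname from above and a placeholder peer ID; it is not Zerops' internal tooling:

```bash
# Inspect the cluster: the response includes this node's peer_id, the list
# of known peers, and Raft information such as the current leader.
curl http://node1.db.<service_name>.zerops:6333/cluster

# Remove a failed peer by its ID (placeholder value). The force=true query
# parameter drops the peer even if its shards have not been moved,
# relying on the replicas that exist on the remaining nodes.
curl -X DELETE "http://node1.db.<service_name>.zerops:6333/cluster/peer/1234567890?force=true"
```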
Overcoming Technical Challenges
During the integration of Qdrant with Zerops, we encountered and solved several complex issues. Fortunately, Qdrant offers robust tools to monitor and manage clusters through its well-documented REST and gRPC APIs.
One issue we faced was that when a new node was added to the cluster, it appeared fully operational but couldn't receive new replicas. This is often due to ongoing data transfers between nodes, which can take considerable time. To address this, Qdrant provides the `POST /cluster/recover` endpoint, which can be triggered on any non-leader node. It asks the current leader to create a snapshot of the cluster's agreed-upon state at a specific point in time and send it back to the requesting node, which applies it in order to recover and synchronize with the rest of the cluster.
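In practice this is a single call to the lagging node; the hostname below is illustrative:

```bash
# Trigger consensus recovery on a non-leader node: the leader snapshots the
# agreed-upon cluster state and sends it to this node to apply.
curl -X POST http://node3.db.<service_name>.zerops:6333/cluster/recover
```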
Data Backup: Safeguarding Your Qdrant Data
At Zerops, we prioritize the safety and security of your data by providing:
- Daily, automatic backups at no extra cost
- Backups in the form of encrypted disk snapshots for each collection
- Secure upload to our S3-compatible backup storage
- Retention of up to 100 backups per stack for a maximum of one month
- Current maximum storage size per project of 25 GiB
You also have the flexibility to choose between different backup options, including one-time backups, regular backups with a customizable frequency, or even disabling backups entirely.
Our backup retention policy is designed to keep recent history close at hand. For example, if you back up every hour, the 100-backup limit gives you roughly four days of backups, so recent versions of your data are available in case of any issues.
The Value of Managed Vector Databases
While it's possible to set up and manage Qdrant yourself, doing so properly requires significant expertise and resources. Our managed Qdrant solution allows you to leverage the power of vector databases without the operational complexities. You can focus on developing AI features, while we ensure your vector database is running optimally, is highly available, and is protected against data loss.
Ready to harness the power of vector databases without the operational headaches? Give Zerops a try.