Automatic synchronization of Django services
Max Kolyubyakin
Posted on June 18, 2020
Using microservices is like going to the bar: it’s great when the party starts, but you can only hope that you won’t have a headache in the morning.
Please don’t get me wrong. Microservices are definitely a great pattern, but as always, they come with a cost. Let me briefly describe one of the main technical problems you can face when dealing with microservices.
Problem 🧐
Each business service operates in a certain subdomain/bounded context, which may be stored in a dedicated database. As services are distributed over the network, data is also distributed. And as business domains are pretty complex, one domain object may be shared between several contexts. In other words, we need the same pieces of data in different loosely related isolated environments. That sounds like a problem! Let’s find a solution.
Theoretical Solution 📚
CQRS is rushing to help us! Command Query Responsibility Segregation is a pattern that separates read/query and update/command operations for a data store. This pattern can be applied within one service (or even one DB instance), as well as across isolated services with separate stores.
Examples:
- A: simple views to expand ORM bounds or materialized views to avoid complex joins and optimize queries
- B: DB shards with one write instance and several read instances; different DB types, where the write store is SQL and the read store is a NoSQL search engine, like Elasticsearch
- C: Shared domain data between several services with one producing service and several consumers for solving performance issues like API JOINs (our case!)
If you are already thinking about integrating it into your system, please think twice. CQRS is a great tool for the right job, but there are several issues and considerations that you need to be aware of:
- CQRS is simple in theory, but can be pretty complex in practice (especially when Event Sourcing is added to the mix)
- A good messaging system is needed for data synchronization between services (plain HTTP is rarely enough, for many reasons, like message delivery failures)
- Data will only be eventually (!) consistent (that may not fit for business critical transactions: there are other patterns for such cases)
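In its simplest form, the read/write split can be sketched in a few lines of plain Python. This is a toy illustration only, not the library discussed below; all names and stores are made up:

```python
# Toy CQRS sketch: commands mutate the write store, a projector propagates
# changes into a separate read store, and queries only ever hit the read store.

write_store = {}   # command/write side (the "source of truth")
read_store = {}    # query/read side (an eventually consistent copy)

def handle_create_order(order_id, total):
    """Command: mutates the write model only."""
    write_store[order_id] = {'id': order_id, 'total': total}
    project(write_store[order_id])   # in a real system: publish an event instead

def project(order):
    """Projector: updates the denormalized read model."""
    read_store[order['id']] = {'id': order['id'], 'total': order['total']}

def query_order_total(order_id):
    """Query: reads from the read model only."""
    return read_store[order_id]['total']

handle_create_order(1, 99.5)
print(query_order_total(1))   # 99.5
```

In a real distributed setup, the call to `project` would be replaced by a message on a queue, which is exactly why the data is only eventually consistent.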
Let’s now look at a real-world example where the CQRS pattern can be exceptionally useful:
Any supply-chain automation platform must be able to register incoming orders from customers. There may be millions of orders monthly, so the load is pretty high. The platform may have several certified integrations, which need to receive webhooks on different order actions (e.g. on order creation). And our UI portal must be able to show some aggregated daily/monthly stats for processed orders.
Technically that means that we have:
- Orders microservice with a CRUD /orders REST API and its own SQL DB (the source/master service for Order domain objects)
- Webhooks microservice that sends outbound HTTP messages to our integrations (this service may have different SLAs than Orders). It has a DB to store replicated order data for schedules, manual triggers and execution history
- Stats service with a read-only /stats/orders public API that provides aggregated orders data within some time range. Incoming orders data is stored as a replica to avoid API JOINs (and maybe some NoSQL DB like Mongo is used). There are several CQRS consumers, which pull messages from the queue simultaneously
- A reliable, fast message transport like RabbitMQ or Apache Kafka, with support for persistence and the ability to deliver messages to several consumers like the Webhooks or Stats services
I will try to dispel some of your doubts immediately. We could technically merge all three services into one monolithic Super-Orders service to avoid duplicating orders data. But the cost of developing and operating such a service would be very high. Webhooks may run on an asynchronous engine, while Orders may be a simple synchronous service. Scaling strategies and SLAs differ for these services, so it’s not reasonable to merge them. Stats could be part of Orders until there is a common algorithm or functionality for collecting and aggregating stats from different domain objects. Stats could be aggregated by some background task or cron job, but that would add extra load to the mission-critical Orders DB. In the long run, separating Stats is the most reasonable solution.
I hope that the CQRS theory is pretty clear at this point. If you want to dig deeper, there are plenty of helpful resources on CQRS and Event Sourcing.
Let’s jump to practice already!
Practical Solution 💡
Meet Django-CQRS: the open-source Django application for automatic data synchronization between several Django services. This library adds the power of CQRS to your applications with about 10 lines of configuration and code! Made with 💙 by CloudBlue Connect.
pip install django-cqrs # You can try yourself!
Django-CQRS is a new but stable library. We have been using it in production for about a year already, without a single major issue, and we use it in more and more of our services. The library has 95%+ coverage from a big pack of unit tests and also ships with a set of integration tests. It is designed to be transport-agnostic (RabbitMQ is supported out of the box).
Let me share the motivation behind creating this library, and afterwards I will show a usage example.
Motivation
- Django is the main development framework at CloudBlue Connect, and it is used in the majority of our services. It may not be the trendiest framework, but “stability and quality are our #1 priority”. Plus, Django has a great community and a long, proven track record of being used in the best Python projects. In other words, we needed something that would fit this framework perfectly: something that benefits from the ORM, is easy to configure and works behind the scenes most of the time.
- There are some open-source Python libraries, like EventSourcing, but all of them implement the CQRS pattern together with Event Sourcing. This is overkill for 90% of real-life cases, and we needed something simple.
Example
We will start by creating a simple model in the Orders service.
Each domain model needs a system-wide unique CQRS_ID. This ID is used everywhere: from the model registry and utility functions to transport queues. There can be only one source model with a given ID and several replicas (the framework puts no limit on their number).
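A minimal master model could look like this. MasterMixin and CQRS_ID come from the library; the Order fields themselves are illustrative assumptions:

```python
# models.py in the Orders (master) service
from django.db import models
from dj_cqrs.mixins import MasterMixin


class Order(MasterMixin, models.Model):
    CQRS_ID = 'order'  # system-wide unique identifier of this domain model

    customer_id = models.IntegerField()
    status = models.CharField(max_length=20, default='new')
    total = models.DecimalField(max_digits=10, decimal_places=2)
```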
And the default replica model in any other service:
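A sketch of such a replica model, mirroring the master above (again, the fields are illustrative assumptions):

```python
# models.py in a consumer service (e.g. Webhooks or Stats)
from django.db import models
from dj_cqrs.mixins import ReplicaMixin


class Order(ReplicaMixin, models.Model):
    CQRS_ID = 'order'  # must match the master model's CQRS_ID

    customer_id = models.IntegerField()
    status = models.CharField(max_length=20)
    total = models.DecimalField(max_digits=10, decimal_places=2)
```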
We need to set the CQRS Django settings in both services and run migrations:
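A minimal configuration sketch for both services (the broker address and queue name are assumptions for illustration):

```python
# settings.py of the master service
INSTALLED_APPS = [
    # ... your other apps ...
    'dj_cqrs',
]

CQRS = {
    'transport': 'dj_cqrs.transport.RabbitMQTransport',
    'url': 'amqp://guest:guest@rabbitmq:5672/',
}

# settings.py of the replica service additionally names its own queue
CQRS = {
    'transport': 'dj_cqrs.transport.RabbitMQTransport',
    'url': 'amqp://guest:guest@rabbitmq:5672/',
    'queue': 'replica',
}
```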
And run the replica with the shipped management command:
python manage.py cqrs_consume -w 2
And that’s it! 🥳 If the transport (RabbitMQ) is ready and the services are up and running, CQRS is already working. When a new order is created, updated or deleted, it is automatically replicated to the replica service!
It seems like real magic, but the code and library architecture are actually pretty simple. Let’s look at the details of the architecture by following the data flow of a synchronization message.
Architecture
Each Master/Source model is extended with the MasterMixin. This mixin validates the CQRS configuration and registers the model in the Registry and the Signals Framework. It also extends the model class with a CQRS Manager (to support bulk operations) and two very important fields: cqrs_revision and cqrs_updated.
Each synchronized domain object must have a timestamp and a special revision number to handle synchronization problems like data duplication or message loss. CQRS does this out of the box on the replica side.
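To see why the revision number matters, here is an illustrative sketch of the kind of check a replica can make (assumed logic, not the library’s actual code):

```python
# A replica should apply an incoming message only if it carries a newer
# revision than what is already stored; duplicates and stale messages are skipped.

def should_apply(current_revision, incoming_revision):
    if current_revision is None:   # object not yet replicated
        return True
    return incoming_revision > current_revision

print(should_apply(None, 0))   # True: first message for this object
print(should_apply(3, 3))      # False: duplicate delivery
print(should_apply(3, 2))      # False: out-of-order, stale message
print(should_apply(3, 4))      # True: newer revision wins
```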
After a source signal is emitted, CQRS collects the object’s payload data together with the synchronization primitives and calls the producer. The producer calls the transport layer, where the message is sent to the actual transport, like RabbitMQ.
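A sketch of what such a synchronization message could look like on the wire (the field names here are illustrative assumptions; the actual envelope is defined by the library):

```python
import json
from datetime import datetime, timezone

# An example message carrying one Order instance plus the synchronization
# primitives described above.
message = {
    'cqrs_id': 'order',                       # routes the message to the replica model
    'instance_pk': 42,
    'instance_data': {'id': 42, 'status': 'new', 'total': '99.50'},
    'cqrs_revision': 0,                       # first version of this object
    'cqrs_updated': datetime.now(timezone.utc).isoformat(),
}

payload = json.dumps(message)   # what the transport layer actually ships
```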
The framework supports complex object payload serialization, where even related objects may be included automatically, with proper SQL optimizations. There is a special option, CQRS_SERIALIZER = 'api.OrderSerializer', which allows you to specify the serializer class used for payload building. Obviously, for such cases some deserialization code needs to be written manually on the replica side.
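For example, a DRF serializer referenced by that option might look like this (the class path matches the option above; the model and fields are illustrative assumptions):

```python
# api.py in the master service, referenced as CQRS_SERIALIZER = 'api.OrderSerializer'
from rest_framework import serializers

from .models import Order


class OrderSerializer(serializers.ModelSerializer):
    class Meta:
        model = Order
        fields = ('id', 'customer_id', 'status', 'total')
```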
On the replica side, each Replica model is extended with the ReplicaMixin. This mixin extends the model with the synchronization fields, validates the CQRS configuration and registers the model in the Registry. When a message comes from the transport queue, the Consumer routes it to the target replica model and its manager. The manager performs the synchronization checks and saves the data to the DB.
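The routing step can be sketched in plain Python (a toy illustration of the idea, not the library’s internals):

```python
# The registry maps each CQRS_ID to the handler of its replica model;
# the consumer looks up the handler and passes the instance data to it.

REPLICA_REGISTRY = {}

def register_replica(cqrs_id, handler):
    REPLICA_REGISTRY[cqrs_id] = handler

def consume(message):
    handler = REPLICA_REGISTRY[message['cqrs_id']]   # route to the target model
    return handler(message['instance_data'])

saved = []
register_replica('order', lambda data: saved.append(data) or data)

consume({'cqrs_id': 'order', 'instance_data': {'id': 1}})
print(saved)   # [{'id': 1}]
```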
Utilities
As distributed systems tend to fail, Django-CQRS ships with a great set of utilities for migration and operations purposes, implemented as Django management commands.
- Bulk synchronizer without transport (use case: initial configuration). May be used during planned downtime.
# ON MASTER
python manage.py cqrs_bulk_dump --cqrs-id=author -> author.dump
# ON REPLICA
python manage.py cqrs_bulk_load -i=author.dump
- Filter synchronizer over transport (use case: solve production synchronization problems in real-time).
# TO SYNC ALL REPLICAS
python manage.py cqrs_sync --cqrs-id=author -f={"id__in": [1, 2]}
# TO SYNC CHOSEN REPLICA
python manage.py cqrs_sync --cqrs-id=author -f={} -q=replica
- Diff sync tools over pipes and transport (use case: find synchronization problems and solve them automatically, for example by cron job).
# k8s pipe example
kubectl exec -i MASTER_CONTAINER -- python manage.py cqrs_diff_master --cqrs-id=author |
kubectl exec -i REPLICA_CONTAINER -- python manage.py cqrs_diff_replica |
kubectl exec -i MASTER_CONTAINER -- python manage.py cqrs_diff_sync
This is a pretty long article already and I am very happy if you have read up to this point. You are cool, thank you very much! 🙏
If you have any questions on any of these topics, you can reach me on LinkedIn 🤙.
Stay healthy and inspired! See you.