Database per Service as a Design Pattern
Eugene Zimin
Posted on June 13, 2024
In the world of modern software development, the microservices architecture has emerged as a popular approach to building scalable and maintainable applications. This architecture breaks down a monolithic application into smaller, independent services that can be developed, deployed, and scaled independently.
However, while there are multiple approaches to decomposing service functionality, the question of how to split the data, and whether splitting it is necessary at all, remains open. This brings us to the heart of the Database per Service pattern and its role in modern architectures.
A Bit of History
The journey of software architecture and data management is one of evolving complexity and increasing distribution. To understand where we are today with patterns like Database per Service, it's illuminating to trace the path that led us here. This journey begins with traditional monolithic architectures, where applications were built as single, tightly-integrated units sharing a common database.
Traditional Monolithic Architecture
Before diving deeper into this design pattern, it's important to understand the traditional approach and why the problem arose.
In the era of monolithic applications, the question of how to access data rarely arose, because there was no such thing as concurrent data access from multiple applications. The monolithic architecture, with its single, shared database, provided a straightforward approach to data management. All components of the application had direct access to the entire database, and data consistency was primarily managed through database transactions and locking mechanisms.
Figure 1. Application - Database communication model
This simplicity, however, came at a cost. As applications grew in scale and complexity, the limitations of the monolithic approach became increasingly apparent. The tight coupling between application components and the database made it difficult to evolve the data model without impacting the entire system. Schema changes often required coordinated updates across multiple parts of the application, leading to complex deployment processes and potential downtime.
Moreover, the lack of data encapsulation meant that any part of the application could potentially access and modify any data, increasing the risk of unintended side effects and data corruption. This became particularly problematic as teams grew and different groups of developers worked on different features within the same monolith.
The advent of service-oriented architectures and, later, microservices, brought these data access and management challenges into sharp focus. Suddenly, architects and developers were faced with questions that didn't exist in the monolithic world: How do we ensure that each service has access to the data it needs without creating tight coupling? How do we maintain data consistency when transactions span multiple services? How do we scale data access patterns independently for services with different performance requirements?
N-Layer Architecture
The first attempt to answer these questions was the N-layer architecture style. This approach sought to introduce a degree of separation between different concerns within an application by organizing code into distinct layers, typically presentation, business logic, and data access, or even further, by extracting that functionality into a separate application. Each layer was supposed to have a specific responsibility and to communicate with adjacent layers through well-defined interfaces.
Figure 2. Example of layered system with external DAL
In the context of data management, the N-layer architecture often involved creating a dedicated data access layer (DAL). This layer encapsulated all interactions with the database, providing a set of methods or services that the business logic layer could use to retrieve or manipulate data. The idea was to abstract away the complexities of data storage and retrieval, making it easier to change the underlying database technology or structure without impacting the rest of the application.
While the N-layer architecture brought some benefits, such as improved code organization and the potential for reusability of data access components, it did not fully address the challenges of distributed systems. The data access layer, though separate, was still typically tightly coupled to a single, shared database. This meant that while individual services or modules might have their own business logic, they were still dependent on a central data store, which could become a bottleneck and a single point of failure.
Moreover, the N-layer approach did not inherently solve issues of data ownership and autonomy. Different services or modules might need to evolve their data models at different rates or have distinct performance and scaling requirements. A shared data access layer and database made it difficult to cater to these diverse needs without complex coordination and potential disruptions.
All these problems led system and software architects to the next step: a design pattern called "Database per Service".
Database per Service
As systems continued to grow in complexity and the need for genuine decoupling became more pressing, architects began to look beyond layered architectures to more distributed patterns. This led to the consideration of approaches like the Database per Service pattern, which takes the concept of separation a step further by giving each service not just its own data access code, but its own dedicated database.
The transition from N-layer architectures to patterns like Database per Service reflects a fundamental shift in thinking about data management in distributed systems. Rather than seeing data access as a shared concern to be abstracted away, modern approaches often treat data as an integral part of each service's domain, to be owned and managed as autonomously as possible.
Figure 3. Database per Service Design Pattern in Action
This evolution, however, is not without its challenges. While the N-layer architecture grappled with issues of code organization and reusability, Database per Service patterns must contend with questions of data consistency, duplication, and inter-service communication. The decoupling of data stores provides greater flexibility and resilience, but it also requires careful consideration of how to maintain a coherent view of data across the system.
As we continue to explore the Database per Service pattern, we'll see how it builds upon the lessons learned from earlier architectural styles like N-layer, while introducing new paradigms that are better suited to the realities of modern, distributed applications. The goal remains the same: to create systems that are scalable, maintainable, and responsive to change. But the means of achieving this goal have evolved, reflecting the ever-increasing complexity and scale of the software we build.
Benefits of the Database per Service Pattern
We need to understand that the Database per Service pattern is not a silver bullet. Even though it introduces its own set of challenges, it offers several compelling benefits that have made it an increasingly popular choice in microservices architectures. These advantages stem from the fundamental principle of decoupling, not just at the application level, but at the data level as well.
Enhanced Scalability
One of the primary drivers behind the adoption of microservices, and by extension the Database per Service pattern, is scalability. Consider two different examples, the first one being a traditional monolithic application with a shared database.
Figure 4. Loading Map Inside Monolith Application
With this approach, scaling often means scaling the entire system, even if only one component is under heavy load. This can lead to inefficient resource utilization and increased costs. Let's put some numbers on it.
Imagine that our application runs on a VM with 4 CPUs and 16 GB of RAM. It's a simple web application that lets users store and read messages based on their subscriptions, much like Twitter. Being built as a monolith, the application is allowed to consume no more than 25% of the VM's resources, to leave some room for scaling and operational overhead.
Now, for the sake of simplicity, let's imagine that each component consumes about 300 MB of RAM and that the monolith adds about 100 MB of integration overhead. At some point we notice that our application is running out of resources, and we would like to create another instance. Would it be a good idea? No. Do we have a choice? No.
Figure 5. Scaling Monolith Application
When we create another instance of the whole application, we have to scale every component inside it, even the ones that are not under high load, because we can't differentiate between them. Eventually this leaves us with two applications full of redundant functionality, consuming memory and CPU that cost us money.
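To put rough numbers on this, here is a back-of-the-envelope sketch. The component count is an assumption chosen for illustration; the other figures come from the example above.

```python
# A rough back-of-the-envelope check of the numbers above.
# The component count (12) is an assumption for illustration only.
vm_ram_mb = 16 * 1024             # 16 GB VM
budget_mb = vm_ram_mb * 0.25      # the monolith may use at most 25% of the VM
components = 12                   # hypothetical number of components
used_mb = components * 300 + 100  # 300 MB per component + 100 MB overhead

print(f"budget per instance: {budget_mb:.0f} MB, used: {used_mb} MB")
# Cloning the whole monolith duplicates every component, loaded or not.
print(f"two full instances: {2 * used_mb} MB")
```

With these assumed numbers, a single instance is already close to its 4 GB budget, and cloning the monolith doubles the footprint of every component, including those that were never under load.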
Figure 6. Scaling Microservice Application
With a database per service, each service and its corresponding database can be scaled independently based on its specific demands. A service that creates and reads messages, for instance, might need to scale to handle a high volume of requests, while a user registration service might not require scaling at all. By allowing each service to scale its own database, teams can fine-tune their infrastructure, allocating resources where they're most needed and optimizing performance without unnecessary overhead.
Cleaner Data Ownership
In large systems, it can sometimes be unclear which part of the application "owns" particular data, leading to confusion, inconsistencies, or unintended side effects when data is modified. The Database per Service pattern establishes clear boundaries of data ownership.
Imagine the following scenario with our messaging application.
Figure 7. Data Ownership in Messaging Application
Adopting a Data-Driven Design approach, we can identify two primary services with distinct data responsibilities: the User Management Service and the Messaging Service. When examining the data model for the Messaging Service, it's evident that each message must include information about its author—a user registered within the User Management Service.
A simplistic solution might involve replicating user and role data within the Messaging Database, which is directly linked to the Messaging Service. The rationale seems straightforward: every time a message is created and persisted, we need to authenticate the user and verify their permissions for message production and storage.
However, duplicating user and role information in the Messaging Database would result in the User and Roles tables existing concurrently in two separate databases. If we consider the User Management Database as the authoritative source for user-related data, we introduce a data synchronization challenge: ensuring that information remains consistent across both locations. This scenario effectively violates the Single Responsibility Principle from S.O.L.I.D. The Messaging Database would now require updates not only when message data changes but also when user information is modified, conflating its responsibilities.
Such an arrangement complicates data ownership and management. Ideally, each service should have clear authority over its domain-specific data, with well-defined boundaries that promote maintainability and reduce interdependencies. Duplicating user data within the Messaging Service blurs these lines, creating potential for data inconsistencies and increasing the complexity of system-wide updates.
A more robust approach would respect the natural boundaries between services, allowing the User Management Service to retain sole ownership of user-related data. Rather than replicating this information, the Messaging Service would reference it as needed, storing only links or pointers, while relying on the User Management Service for authoritative data. This strategy upholds the principles of service autonomy and data integrity, cornerstones of effective microservices architecture.
Figure 8. Correct Data Ownership
Now each service becomes the sole writer to its database and the authoritative source for its domain of data. This clarity reduces complexity in data management, helps maintain data integrity, and makes it easier to reason about the system's behavior. While services may still need to share data, this is typically done through well-defined APIs rather than direct database access or redundant data, reinforcing the principles of encapsulation and loose coupling.
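To make this concrete, here is a minimal sketch of the "reference, don't replicate" approach. The service URL, endpoints, and table names are hypothetical, and a SQLite connection stands in for the Messaging Database.

```python
# The Messaging Service stores only the author's id and asks the
# User Management Service's API for the authoritative user record.
import sqlite3
import requests

USER_SERVICE_URL = "http://user-management-service/api/v1"  # hypothetical

def create_message(db: sqlite3.Connection, author_id: int, text: str) -> int:
    # Only the reference (author_id) lives in the Messaging Database.
    cur = db.execute(
        "INSERT INTO messages (author_id, body) VALUES (?, ?)", (author_id, text)
    )
    db.commit()
    return cur.lastrowid

def get_message_with_author(db: sqlite3.Connection, message_id: int) -> dict:
    row = db.execute(
        "SELECT id, author_id, body FROM messages WHERE id = ?", (message_id,)
    ).fetchone()
    # User data is owned by the User Management Service and fetched via its API.
    author = requests.get(f"{USER_SERVICE_URL}/users/{row[1]}", timeout=2).json()
    return {"id": row[0], "body": row[2], "author": author}
```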
Improved Fault Isolation
In a system with a shared database, issues like outages, performance degradation, or data corruption can have wide-ranging impacts, potentially bringing down the entire application. The Database per Service pattern provides a higher degree of fault isolation.
If one service's database experiences problems, the impact is largely contained to that service. Other parts of the system can continue to function, improving overall resilience. This isolation also simplifies troubleshooting and recovery, as teams can focus on addressing issues within a specific service and its database without needing to coordinate across the entire system.
Let's take a look at an example. We will use the same messaging application and now consider the worst-case situation: the application relies on a shared database, and this database suddenly becomes unavailable.
Figure 9. Database/DAL failure
The consequences of this failure are immediate and far-reaching. With the database inaccessible, the entire service effectively grinds to a halt in terms of its core functionality. No component within the system can retrieve or manipulate data, rendering the application useless from an end-user perspective.
This outage isn't merely an inconvenience; it represents a single point of failure that can paralyze operations across the board. User authentication may fail without access to account data. The messaging feature, the heart of the application, cannot display existing conversations or allow the sending of new messages. Even secondary functions like user settings or search capabilities are compromised.
The ripple effects extend beyond just data retrieval. Write operations—such as storing new messages, updating user profiles, or logging system activities—also come to a standstill. This not only disrupts current user interactions but can lead to data loss if the system isn't designed with robust queueing and retry mechanisms.
Furthermore, the shared nature of the database means that even services or components with minimal data needs are impacted. A status monitoring tool that might normally operate with cached data could fail if it periodically needs to check in with the database. Administrative interfaces, often crucial for diagnosing and addressing issues, may themselves be rendered inoperable.
This scenario underscores one of the fundamental vulnerabilities of a monolithic architecture with a shared database: lack of isolation. When all services depend on the same data store, they share the same fate when that store fails. There's no graceful degradation, no ability for some parts of the system to continue functioning while others are impaired.
Figure 10. Database failure in distributed system
Now let's reimagine the same scenario, but within the context of a system employing the Database per Service pattern. When a database failure occurs, its impact is largely contained to the service directly connected to it, leaving other services unaffected and operational.
In this illustration, we see that the database linked to the Commenting service has gone offline. The immediate consequence is that the Commenting service loses its ability to retrieve or store data. Users might encounter error messages when trying to post new comments or view existing ones. However—and this is the critical difference—the ripple effect stops there.
The User Management service, with its own dedicated database, continues to authenticate users and manage account information unimpeded. The Messaging service still facilitates the exchange of direct messages between users. Even a hypothetical Analytics service could keep tracking user engagement metrics, perhaps with a temporary blind spot for comment-related events.
This resilience stems from the decoupling inherent in the Database per Service pattern. Each service not only has its own codebase and deployment pipeline but also its own data persistence layer. This autonomy means that a failure in one part of the system doesn't automatically cascade into a system-wide outage. Instead, the application gracefully degrades, maintaining core functionalities even as some features become temporarily unavailable.
Of course, in a well-designed microservices architecture, services rarely exist in total isolation. They often depend on each other for certain functionalities. In our example, the Messaging and User Management services might need to interact with each other to perform user authentication and authorization, so the failure of a core service might still be critical to overall functionality. The remaining services, however, should analyze the error response and notify both customers and system administrators about the failure. They can also help narrow down the point of failure, which makes debugging easier.
Services should be built to handle errors or timeouts from their dependencies gracefully. If the Messaging service tries to fetch recent comments for a conversation and receives an error response from the Commenting service, it shouldn't crash. Instead, it might display a message to the user that comments are temporarily unavailable, while still showing the rest of the conversation history. Similarly, the User Management service could queue moderation actions to be processed once the Commenting service is back online, rather than blocking user management tasks.
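As a minimal sketch of this kind of graceful degradation, the snippet below shows the Messaging Service fetching comments for a conversation and falling back cleanly when the Commenting Service is unavailable. The endpoint and field names are hypothetical.

```python
import requests

COMMENTS_URL = "http://commenting-service/api/v1/conversations/{id}/comments"

def load_conversation(conversation_id: int, messages: list) -> dict:
    view = {"messages": messages, "comments": [], "comments_available": True}
    try:
        resp = requests.get(COMMENTS_URL.format(id=conversation_id), timeout=1)
        resp.raise_for_status()
        view["comments"] = resp.json()
    except requests.RequestException:
        # The Commenting Service (or its database) is down: don't crash,
        # show the conversation and flag comments as temporarily unavailable.
        view["comments_available"] = False
    return view
```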
Greater Development Autonomy
One of the challenges in large software projects is coordinating changes, especially when it comes to the database schema. With a shared database, a schema change for one part of the application might require updates and testing across many components, slowing down development cycles because it directly impacts all of them.
By giving each service its own database, development teams gain more autonomy. They can evolve their service's data model, optimize queries, or even change database technologies without impacting other services. This autonomy can significantly speed up development and deployment cycles, allowing teams to work more independently and deliver value more quickly.
Figure 11. Access data through API only
This diagram illustrates a fundamental tenet of the Database per Service pattern: data encapsulation through well-defined APIs. Here, we see that any modifications to the Messaging Database's model have a direct impact solely on the Messaging Service. The changes remain invisible to other components within the broader system ecosystem. This encapsulation is achieved by adhering to a strict rule: all data access must flow through the service's API.
The power of this approach becomes evident when we consider the implications for schema evolution. Let's say the team responsible for the Messaging Service decides to denormalize their data model to improve query performance, perhaps by embedding frequently accessed user information directly within message records. In a shared database scenario, such a change could wreak havoc, breaking queries and data access patterns across multiple services. However, with the Database per Service pattern, the change is contained. As long as the Messaging Service continues to honor its API contract—returning the expected data structures and fields—other services remain blissfully unaware of the underlying data reshuffling.
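As a small illustration, here is a sketch of how a mapping layer can keep the API contract identical before and after such a denormalization; the row layouts and field names are hypothetical.

```python
# Keeping the API contract stable while the internal model is denormalized.

def to_api_message(row: dict) -> dict:
    # The contract other services rely on: id, body, author_id.
    return {"id": row["id"], "body": row["body"], "author_id": row["author_id"]}

# Before: the message row only references its author.
old_row = {"id": 1, "body": "hello", "author_id": 42}
# After: frequently accessed author data is embedded for faster reads.
new_row = {"id": 1, "body": "hello", "author_id": 42, "author_name": "alice"}

# Consumers see exactly the same response either way.
assert to_api_message(old_row) == to_api_message(new_row)
```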
This principle dovetails neatly with the Open-Closed Principle from S.O.L.I.D., which states that software entities should be open for extension but closed for modification. In our context, the Messaging Service's API represents the "closed" part—a stable interface that other services can rely upon. When new requirements emerge or optimizations are needed, the service can "extend" its capabilities by evolving its internal data model or adding new API endpoints, without disrupting existing consumers.
For instance, if there's a need to support richer metadata for messages, the Messaging Service team might add new fields to their database schema and expose this additional information through new API methods or optional parameters on existing methods. Services that don't need this metadata continue to function unperturbed, while those that do can opt-in to the enhanced functionality.
The benefits of this approach extend beyond just ease of change. By forcing all data access through a versioned API, we introduce a layer of abstraction that can smooth over differences between the service's internal data representation and the view of that data presented to the outside world. This can be particularly valuable during incremental migrations or when supporting legacy clients.
Consider a scenario where the Messaging Service is transitioning from storing timestamps as Unix epochs to ISO 8601 strings. Rather than forcing all consumers to adapt simultaneously—a herculean task in large systems—the service's API can continue to accept and return epochs for v1 of the API, while offering the new format in v2. Internally, the service handles the conversion, allowing for a phased migration that doesn't leave any part of the system behind.
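A minimal sketch of what this phased migration could look like, assuming hypothetical v1/v2 serializers and that the service has already switched its internal storage to ISO 8601 strings:

```python
from datetime import datetime

def serialize_v1(msg: dict) -> dict:
    # v1 clients still receive Unix epochs; the service converts on the way out.
    created = datetime.fromisoformat(msg["created_at"])
    return {"id": msg["id"], "body": msg["body"],
            "created_at": int(created.timestamp())}

def serialize_v2(msg: dict) -> dict:
    # v2 clients receive the new ISO 8601 format directly.
    return {"id": msg["id"], "body": msg["body"], "created_at": msg["created_at"]}

msg = {"id": 1, "body": "hello", "created_at": "2024-06-13T10:30:00+00:00"}
print(serialize_v1(msg))  # epoch seconds for legacy consumers
print(serialize_v2(msg))  # ISO 8601 for new consumers
```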
Flexible Technology Choices
This benefit is a direct outgrowth of two fundamental patterns: separate data ownership and encapsulation. When data is tucked behind a well-defined API contract, the specifics of the underlying storage mechanism become an implementation detail—opaque and, to a large extent, irrelevant to consumers of that API. Does it matter to the rest of the system whether the Messaging Service persists its data in MySQL, PostgreSQL, or ventures into NoSQL territory with MongoDB? As long as the service honors its API commitments, the answer is a resounding no.
This abstraction opens up a world of possibilities, freeing teams from the "one size fits all" straitjacket often imposed by a shared, monolithic database. In reality, not all data is created equal, and the notion that a single database technology can optimally serve every type of data or access pattern across a complex system is increasingly recognized as a constraining myth.
The Database per Service pattern shatters this constraint, ushering in an era of polyglot persistence—where each service can select a database technology that aligns precisely with its unique data characteristics and operational demands. This isn't just about having choices for the sake of variety; it's about empowering teams to make nuanced, strategic decisions that can significantly impact performance, scalability, and even the fundamental way they model their domain.
The implications of this technological flexibility ripple beyond just query performance or data model fit. Different database technologies bring different operational characteristics—scaling models, backup and recovery procedures, monitoring and management tools. By aligning these with service-specific needs, teams can optimize not just for runtime performance but for the entire lifecycle of managing data in production.
Easier Maintenance and Updates
This capacity for streamlined maintenance and evolution is yet another dividend paid by the twin principles of separate data ownership and encapsulation within the Database per Service pattern. In a landscape where data stores are siloed and accessible only through well-defined service interfaces, routine database upkeep and even transformative changes can be orchestrated with surgical precision, minimizing system-wide disruption.
Consider the cyclic necessities of database maintenance—software updates to patch security vulnerabilities, backup procedures to safeguard against data loss, or performance optimizations to keep pace with growing loads. In a monolithic architecture with a shared database, each of these tasks casts a long shadow. A version upgrade might require extensive regression testing across all application components. A backup process could impose a system-wide performance penalty. An ill-conceived index might accelerate queries for one module while grinding another's write operations to a halt.
The Database per Service pattern redraws these boundaries of impact. When each service lays claim to its own database, maintenance windows can be staggered, tailored to the specific needs and tolerances of individual services. The Customer Support team might schedule their database patching for the wee hours of Sunday morning when ticket volume is lowest, while the E-commerce platform waits until Tuesday afternoon post-sales rush. Backups for a write-heavy Logging Service could run hourly, while a relatively static Product Catalog Service might only need daily snapshots.
This granularity extends to performance tuning. DBAs can optimize with abandon, knowing that an experimental indexing strategy or a shift in isolation levels will reverberate only within the confines of a single service. The blast radius of any potential misstep is contained, fostering an environment where teams can be bolder in pursuing optimizations that might be deemed too risky in a shared-database context.
But the true power of this pattern shines through when we zoom out from day-to-day maintenance to consider more seismic shifts in data architecture. In the life of any sufficiently long-lived system, there comes a time when incremental tweaks no longer suffice—when data models need comprehensive restructuring, or when the limitations of the current database technology become insurmountable barriers to growth or feature development.
Traditionally, such changes have been the stuff of CTO nightmares: "big bang" migrations where months of preparation culminate in a high-stakes cutover weekend, with the entire engineering team on high alert and business stakeholders watching the status dashboard with bated breath. The risks are manifold—data corruption, performance regressions, or the dreaded Monday morning realization that a critical legacy feature was overlooked in testing. Rollback plans for such migrations often boil down to "pray we don't need it," because reversing course once the switch is flipped can be nearly impossible.
Conclusion
As we draw our exploration of the Database per Service pattern to a close, it's clear that this architectural approach offers a compelling vision for data management in the age of microservices. By championing separate data ownership, encapsulation, and the autonomy of individual services, the pattern unlocks a cascade of benefits that ripple through every facet of system development and operation.
We've seen how this decoupling can accelerate development cycles, freeing teams to evolve their data models and optimize performance without fear of collateral damage. We've explored the flexibility it affords in choosing the right database for the job, eschewing one-size-fits-all solutions in favor of technologies tailored to each service's unique needs. And we've witnessed how it transforms maintenance and migration from high-wire acts to managed, incremental processes, reducing risk and increasing the system's capacity for change.
These advantages paint a picture of systems that are not just more scalable and resilient, but more adaptable—better equipped to keep pace with the relentless evolution of business requirements, user expectations, and technological capabilities. In a digital landscape where agility is currency, the ability to pivot quickly, to experiment boldly, and to embrace change without fear can spell the difference between market leaders and those left behind.
Yet, as with any architectural choice, the Database per Service pattern is not without its complexities and trade-offs. The very decoupling that grants such flexibility also introduces new challenges around data consistency, distributed transactions, and holistic data views. The autonomy that empowers individual teams can, if not carefully managed, lead to a fragmented technology landscape that strains operational resources. And while service interfaces provide a powerful abstraction, designing them to be both flexible and stable requires foresight and discipline.
These challenges are not insurmountable, but they do demand thoughtful strategies and sometimes novel solutions. In our next article, we'll pull back the curtain on the potential drawbacks of the Database per Service pattern. We'll confront head-on the questions of how to maintain data integrity across service boundaries, how to handle workflows that span multiple data stores, and how to balance service independence with the need for system-wide governance.
We'll also examine the organizational and operational implications of this pattern. What does it take to successfully manage a polyglot persistence environment? How do we ensure that our decentralized data doesn't become siloed data, invisible to the analytics and insights that drive business decisions? And how might this pattern influence—or be influenced by—emerging trends like serverless architectures, edge computing, or AI-driven operations?
By exploring these drawbacks and challenges, our aim is not to diminish the power of the Database per Service pattern, but to equip us with a complete picture—to ensure that our journey toward decentralized data ownership is undertaken with eyes wide open. After all, true architectural wisdom lies not in dogmatic adherence to any pattern, but in understanding deeply both its strengths and its limitations.
So consider this not an ending, but an intermission. We've celebrated the elegance and potential of a world where each service is master of its own data domain. When next we meet, we'll roll up our sleeves and dig into the gritty realities of bringing that vision to life—the obstacles to be navigated, the pitfalls to be sidestepped, and the trade-offs to be weighed.
The path to robust, evolvable systems is seldom straightforward. But armed with patterns like Database per Service—and a clear-eyed view of both their power and their demands—we can chart a course through the complexity. So stay tuned, as we prepare to balance today's optimism with pragmatism, and turn our gaze from the heights of possibility to the solid ground of practice.