In my work I talk with a lot of people, and today it is very common to meet someone who has been working with a microservices architecture. I have realized, though, that most of them don’t know enough about it, especially about databases, and make wrong decisions because of a few “urban legends”.
It seems that the topic of databases in microservices is a kind of boogeyman for most architects and programmers: it’s scary and gives you nightmares.
In my career I have heard every kind of claim about how to use databases in this architecture and, at the risk of sounding presumptuous, I would like to start by busting a few myths:
Each microservice must use its own database, otherwise it isn’t a microservices architecture.
No and yes. If your client can pay for a DB instance for each microservice, then this is the best solution. But remember: you should use a separate DB instance per service, and each instance should have a replica and backups. It’s very expensive, but it is the best solution.
However, if your client can’t sustain this infrastructure, you shouldn’t use it; a shared instance (still with a replica and backups) or a shared database is preferable.
Shared instances are widely used. Whether you use this solution or another, it’s still a microservices architecture; the important thing is how you use it (we will see this later).
A shared instance means sharing all data across microservices
No! Absolutely not! A microservice must never, and I repeat, never, break its constraints. Each microservice has a single responsibility and must access only its own part of the data; if it needs data from other services, it should ask for it via an API or other techniques.
A shared instance doesn’t mean a shared database; they are very different patterns.
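To make the "ask via API" rule concrete, here is a minimal sketch. The service names, tables, and the `get_customer` method are hypothetical, and a plain method call stands in for what would be an HTTP or messaging call in a real system. Each service keeps its own private SQLite connection, and the order service never queries the customer tables itself.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Owns the customer data; no other service touches this connection."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
        self._db.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")

    def get_customer(self, customer_id: int) -> Optional[dict]:
        """Stand-in for a public endpoint (hypothetical, e.g. GET /customers/{id})."""
        row = self._db.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class OrderService:
    """Owns the order data and asks CustomerService for anything else."""

    def __init__(self, customers: CustomerService) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
        )
        self._customers = customers

    def place_order(self, customer_id: int, item: str) -> dict:
        # Cross the boundary through the other service's API, not its tables.
        customer = self._customers.get_customer(customer_id)
        if customer is None:
            raise ValueError(f"unknown customer {customer_id}")
        cursor = self._db.execute(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)", (customer_id, item)
        )
        return {"order_id": cursor.lastrowid, "customer": customer["name"], "item": item}


customers = CustomerService()
orders = OrderService(customers)
result = orders.place_order(1, "book")
```

The same rule holds whether the two connections point to separate instances or to the same one: what matters is that only the owning service issues queries against its data.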
You can use only one kind of database
Nope. The microservices architecture is technology-agnostic: you can use whatever kind of database you need to deliver the requested feature, just as you can use whatever programming language or framework you need.
You can’t mix the database-per-service and shared-instance solutions
Partially wrong. Technically you can mix these solutions, especially if you are in the middle of changing your structure, but why should you? If you are going to change your structure, you should prepare the new environment and release it, avoiding intermediate states.
Now that some myths are busted, we can talk about the different patterns for persisting data in microservices, with their pros and cons.
Database-per-service is a widely used pattern for managing databases in a microservices architecture, but what does it mean?
Database-per-service means that each microservice uses its own database to persist its data, and each database should have a replica to guarantee high fault tolerance.
A practical example: we have five microservices, each with its own constraints and boundaries. With this pattern you should have five database instances, one for each microservice.
It’s a very expensive solution, but it’s the solution.
With this pattern, microservices can’t cross their boundaries; they can access other microservices’ data only through internal APIs or CQRS.
With a domain decomposition it’s easy to handle data. With a functional decomposition it’s a little more complicated: a microservice that needs data from multiple domains has to maintain its own local copy, which results in different versions of the same data or, in the worst case, multiple variants of the same data.
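The local-copy problem can be sketched as follows. This is only an illustration under assumptions: the in-process `EventBus`, the `ProductChanged` event, and both services are hypothetical names, and a real system would use a message broker. The owning service publishes a versioned event on every change, and the consuming service keeps its local copy, ignoring any event older than what it already holds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ProductChanged:
    product_id: int
    name: str
    version: int  # incremented by the owning service on every change


class EventBus:
    """Tiny in-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[ProductChanged], None]] = []

    def subscribe(self, handler: Callable[[ProductChanged], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: ProductChanged) -> None:
        for handler in self._subscribers:
            handler(event)


class CatalogService:
    """Owner of the product data (the source of truth)."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._products: Dict[int, ProductChanged] = {}

    def rename(self, product_id: int, name: str) -> None:
        current = self._products.get(product_id)
        version = 1 if current is None else current.version + 1
        event = ProductChanged(product_id, name, version)
        self._products[product_id] = event
        self._bus.publish(event)


class ReportingService:
    """Keeps its own local copy of product data from another domain."""

    def __init__(self, bus: EventBus) -> None:
        self.local_copy: Dict[int, ProductChanged] = {}
        bus.subscribe(self.on_product_changed)

    def on_product_changed(self, event: ProductChanged) -> None:
        known = self.local_copy.get(event.product_id)
        # Apply only newer versions so a late or duplicated event cannot
        # overwrite fresher data.
        if known is None or event.version > known.version:
            self.local_copy[event.product_id] = event


bus = EventBus()
catalog = CatalogService(bus)
reporting = ReportingService(bus)
catalog.rename(7, "Lamp")
catalog.rename(7, "Desk lamp")
# A stale event arriving late is ignored.
reporting.on_product_changed(ProductChanged(7, "Lamp", 1))
```

Between the moment the owner changes the data and the moment the event is handled, the two services hold different versions of the same record, which is exactly the cost described above.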
Pros
Cons
The shared-instance solution is based on database-per-service, and the microservices constraints are still mandatory; however, each service doesn’t have its own database instance but uses a shared one.
Using the example above, with this solution we don’t have five database instances; we use a single instance, and the microservices still can’t access each other’s data directly: they still have to use internal APIs or other patterns.
This is a low-cost solution, and it lets you switch easily to the database-per-service pattern if you need to.
This is my preferred solution when I have a low budget.
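One way to picture a shared instance is through connection settings. The sketch below rests on assumptions: the host names, the port, and the `<service>_db` / `<service>_svc` naming are hypothetical, and it assumes each service gets its own logical database and credentials on the common server. Under those assumptions, moving a service to its own instance later only changes the host.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DbSettings:
    host: str
    port: int
    database: str
    user: str


# Hypothetical address of the single shared database server.
SHARED_HOST = "db.internal.example"


def shared_instance_settings(service: str) -> DbSettings:
    """Same server for everyone, but a separate database and user per service."""
    return DbSettings(
        host=SHARED_HOST,
        port=5432,
        database=f"{service}_db",
        user=f"{service}_svc",
    )


def move_to_dedicated_instance(settings: DbSettings, new_host: str) -> DbSettings:
    """Switching to database-per-service: only the host changes."""
    return replace(settings, host=new_host)


orders = shared_instance_settings("orders")
billing = shared_instance_settings("billing")
orders_dedicated = move_to_dedicated_instance(orders, "orders-db.internal.example")
```

Because no service ever relied on reading another service's database, the service code itself stays the same after the move.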
Pros
Cons
I don’t like the shared-database approach very much. In a few words, this solution is used when you need to share data between microservices; it has no boundaries and no constraints.
Each microservice can access the data directly and perform its own transformations.
This solution is very dangerous. For example, if you need to change a table in the database, you must check every microservice that uses it, and it’s very easy to make mistakes and create a lot of side effects. Besides, you have to handle concurrency and versioning of the data.
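The side-effect risk is easy to reproduce. In this minimal sketch, with hypothetical service functions and table names and an in-memory SQLite database standing in for the shared one, one service renames a table it considers its own and another service, which reads the same table directly, breaks at runtime.

```python
import sqlite3

# One database shared by every service, with no boundaries.
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
shared_db.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")


def shipping_label(db: sqlite3.Connection, customer_id: int) -> str:
    """Shipping service: reads the customers table directly."""
    row = db.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return f"Ship to: {row[0]}"


def crm_migration(db: sqlite3.Connection) -> None:
    """CRM service: changes the table without knowing who else uses it."""
    db.execute("ALTER TABLE customers RENAME TO clients")


label_before = shipping_label(shared_db, 1)
crm_migration(shared_db)

try:
    shipping_label(shared_db, 1)
    shipping_broke = False
except sqlite3.OperationalError:
    # The table the shipping service depends on no longer exists.
    shipping_broke = True
```

Nothing in the CRM code hints that the shipping service depends on that table, which is why every change requires checking all the services by hand.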
Pros
Cons
That’s all folks!