A Migration Journey


Amin Khansari

Posted on April 12, 2023


In 2018, I received an amazing job offer to lead a team in a rather tense situation. After conducting a brief code audit, I accepted the offer on the condition that I would have carte blanche to make the necessary changes. Fortunately, my manager already trusted me and gave me the freedom I needed to make progress.

This journey is about the migration of a very critical 12-year-old legacy system written in C# to a Reactive architecture that uses Event Sourcing in a functional programming language, specifically F#.

Context

At first, I had several challenges to overcome.

Knowledge loss

The former lead developer and other team members left abruptly, taking with them all of the business and technical knowledge. I had only four days of handover with the former lead to gain the necessary knowledge and insight. Luckily, our Product Owner was very familiar with the product. However, he was overwhelmed with support requests, and the documentation was almost non-existent. This posed the biggest immediate risk, and I had to find a solution to address it.

Deprecated stack

The core application was a monolith consisting of 600k lines of C# code and over 100k lines of SQL Server stored procedures. The entire application was deployed manually onto Windows Server 2008 machines, without any Continuous Integration or Continuous Deployment (CI/CD). Adding to the complexity, there were additional repositories we were not even aware of, and it sometimes took us up to two years to locate the relevant machines for deployment. This is not uncommon in a fast-growing company with more than 700 developers.

The database was the only thing that kept the application from collapsing, as it was well-designed and served as the foundation of the architecture. Most of the services were scheduled jobs in pull mode, which was pretty good for a database-centric downstream product that was designed in 2005.

Unclear domain

When I first joined the team, all I knew was that we were part of the accounting entity. However, as time passed, I realized that for years the team had been viewed as a catch-all for anything loosely related to finance. This approach may have worked during the company's early digitalization days, but it was no longer viable given the company's 3.5 billion turnover and expansion plans in Europe. The company had acquired several other firms, and our product was selected as the primary target in the IT unification project, being regarded as the most comprehensive solution.

For many people, both business and technical, we were seen as a black box that handled multiple essential tasks. It is important to note that our scope was the subsidiary ledger, not the general ledger; the subsidiary ledger is a very underrated and complex domain.

To make matters worse, we were handed a brand new project that had already been planned and decided upon, and which had nothing to do with our area of responsibility.

Resistance to change

On the one hand, a complete rewrite of the code presented a high risk of failure; on the other hand, there were warning signs everywhere indicating that the code had not seen any continuous improvement since its creation. The dilemma was real, and we had to make a trade-off decision.

Furthermore, the maintenance cost was also quite high. The code lacked unit tests and only had a few unmaintainable integration tests with generic names like "billing test 1" that tested the entire application.

The number of exceptions generated in production was also very high, and we didn't even look at them.

Additionally, there was no back-office system in place; the main tools were SQL Server Management Studio and ad hoc SQL scripts. It was very difficult to provide support without advanced knowledge of the data schemas, flows, and T-SQL.

Journey

While the context may sound negative, that's not the main point. Dwelling on negativity will only hinder progress. To move forward, we must acknowledge the past and learn from it in order to build a better future. I was fortunate to have caring individuals around me who helped along the way. With that said, let me summarize how we were able to tackle and solve the issues one by one.

First year, stabilization and learning

Knowledge is power, but in our case, we felt pretty powerless.
Not to mention the high volume of support tickets we had to handle each day and the frequent incidents that occurred. In addition, we had to continue working on already planned projects.

However, like climbing a mountain, it is important not to look down and instead focus on reaching the top, no matter how difficult or far it may seem. This is the approach we decided to take.

Our first step was to prioritize fixing bugs without any refactoring. This approach helped reduce the number of incidents and stabilize the legacy code with minimal risk. Additionally, we took the opportunity to learn more about the business and the product, which later proved to be a valuable asset. During this phase, we developed our ubiquitous language, identified stakeholders, and gathered information about future needs.

While it was important to understand the legacy system, we knew that solely focusing on it would lead to repeating past mistakes. So, we asked a specialist to conduct an Event Storming session with the business, management, and team to clarify the domain and define the bounded contexts. Through this session, we gained valuable insights and discovered that the accounting sub-domain was just a small part of the overall domain. We also identified other critical sub-domains such as billing, cash flow, and revenues, which helped us understand the domain more comprehensively. This session also helped management and stakeholders provide better support in the future.

Finally, we had to consider how to create value in the short and long term, especially with planned projects that were no longer in our scope. We decided to create a modern stack for two reasons: first, we knew that a modern and well-implemented stack had a higher chance of being accepted by another team, and second, it would serve as the foundation for our future back-office system.

First year, experimentation

As I gained more understanding of the business and product, the future architecture became clearer to me.

In finance and accounting, data must always remain unchanged. Whenever a change is needed, the solution is to cancel the old data by creating a new one with reversed values and then creating new data.
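This reversal discipline can be sketched in a few lines of F#. The `Entry` type, the account code, and the amounts below are purely hypothetical, for illustration only:

```fsharp
// Hypothetical ledger entry: corrections never mutate existing data.
type Entry =
    { Id: int; Account: string; Amount: decimal }

// Cancel an entry by appending a new one with the reversed amount,
// then append a third entry carrying the corrected value.
let correct (original: Entry) (newAmount: decimal) (nextId: int) =
    let reversal  = { original with Id = nextId; Amount = -original.Amount }
    let corrected = { original with Id = nextId + 1; Amount = newAmount }
    [ reversal; corrected ]

// Example: fixing a 100.00 entry that should have been 90.00.
let initial = { Id = 1; Account = "411"; Amount = 100m }
let history = initial :: correct initial 90m 2

// The net balance reflects the correction; no entry was ever modified.
let balance = history |> List.sumBy (fun e -> e.Amount)   // 90.00
```

The full history (100.00, -100.00, 90.00) stays available as evidence, which is exactly what auditors expect.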

As a downstream product, we react to upstream events, and via workflows, we produce other events. Therefore, a gradual migration was the only viable option. It was too risky to refactor the legacy or start from scratch in one go. In finance, partial deliveries are not possible; you have to complete everything to avoid inconsistencies.

For all these reasons, we decided to use event sourcing: it is built on immutable data, it reacts to events to produce other events, it provides traceable evidence for legal purposes, it fits documents and workflows, and it allows partial migrations through projections. We conducted POCs and benchmarks with Marten to evaluate the possibilities.

However, the complexity of the domains and the numerous business rules increased cognitive load, and producing readable, succinct, and maintainable code in C# was challenging. I found Scott Wlaschin's excellent book on domain modeling, Domain Modeling Made Functional, and Jérémie Chassaing's research on the Decider pattern very helpful. After a few tries, we decided to write new code only in F#.
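To give a flavor of the Decider pattern's shape (pure `decide` and `evolve` functions over commands, events, and state), here is a minimal, hypothetical invoice example; it is an illustration, not our production model:

```fsharp
// Illustrative decider: all business logic lives in two pure functions.
type Command = Issue of amount: decimal | Cancel
type Event   = Issued of amount: decimal | Cancelled
type State   = NotIssued | Active of amount: decimal | Voided

// decide: given a command and the current state, emit new events.
let decide command state =
    match command, state with
    | Issue amount, NotIssued -> [ Issued amount ]
    | Cancel, Active _        -> [ Cancelled ]
    | _                       -> []   // invalid transitions emit nothing

// evolve: fold an event into the state.
let evolve _state event =
    match event with
    | Issued amount -> Active amount
    | Cancelled     -> Voided

// Replaying the event stream rebuilds the current state.
let state =
    [ Issued 100m; Cancelled ]
    |> List.fold evolve NotIssued   // Voided
```

Because both functions are pure, they are trivial to unit test and completely decoupled from storage and messaging concerns.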

We faced other challenges. First, at that time, Marten was not F#-friendly enough; second, the company's SREs would not whitelist EventStoreDB. Consequently, we created our own event store on top of Postgres, which turned out to be one of our best decisions. Postgres is a solid and versatile database, and building on it gave us a good understanding of the fundamentals of Event Sourcing, such as streams, idempotency, and optimistic concurrency. The only gap was the absence of built-in subscriptions and of a proper way to handle asynchronous projections.
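As a simplified sketch of what optimistic concurrency means in such a homegrown store: the writer states the stream version it read, and the append is rejected on a mismatch. Types and event names here are illustrative; in Postgres the same guarantee typically comes from a unique constraint on the stream id and version:

```fsharp
// Illustrative in-memory model of a stream with optimistic concurrency.
type Stream = { Version: int; Events: string list }

let append expectedVersion newEvents stream =
    if stream.Version <> expectedVersion then
        Error (sprintf "expected version %d but stream is at %d"
                   expectedVersion stream.Version)
    else
        Ok { Version = stream.Version + List.length newEvents
             Events  = stream.Events @ newEvents }

let v1 = { Version = 1; Events = [ "Opened" ] }

// First writer read version 1 and appends successfully.
let v2 = append 1 [ "Deposited" ] v1

// Second writer also read version 1; by the time it writes,
// the stream has moved on, so the append is rejected.
let conflict =
    match v2 with
    | Ok s    -> append 1 [ "Withdrawn" ] s
    | Error e -> Error e
```

The losing writer then reloads the stream, re-runs its decision against the fresh state, and retries.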

Second year, implementation

Having achieved stability with our legacy system and experimented with some proofs of concept (POCs), it was time to move forward with our plans. The timing was also right, as our product had previously been selected as a target for IT convergence.
While the legacy system efficiently managed the invoicing flow for France, it was not suitable for other countries due to significant differences in how they operate. We had only one year until the deadline, which gave us six months to finalize and begin tests.

Personally, I'm against microservices architecture; however, I firmly believe that bounded contexts are highly valuable. Therefore, I have no concerns about having one service per bounded context.
When in doubt, I prefer a modular monolith to a series of microservices. With a modular monolith, the cost of continuous improvement and refactoring is much lower. Additionally, one can always extract a service from a modular monolith later, whereas the reverse is mostly impossible. It is important to understand that each service requires an independent lifecycle, which may appear beneficial in the short term but can ultimately lead to difficulties (hell) in the long term. Adding multiple lifecycles within a single bounded context increases cognitive load and tends to compound into negative effects over time.

For these reasons, we decided to use a monorepo: it represents our domain and includes several subdomains, each a separately deployable service, surrounded by anti-corruption layers (ACLs) acting as connectors between the outside world and us.

💡 Simple and boring systems are difficult to build but easy to maintain and to scale.

To ensure all of these services work effectively together, we chose the reactive manifesto as the foundation of our architecture, along with RabbitMQ as the message broker. Since this subject is vast and this article is already quite dense, I will not elaborate further on the details here. However, it is important to remember that idempotency and resilience are key factors for success in downstream apps.
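To illustrate what idempotency means for a downstream consumer, here is a deliberately simplified F# sketch. The in-memory set is an assumption for illustration only; a real consumer would persist processed message ids transactionally alongside the handler's side effects:

```fsharp
open System.Collections.Generic

// Illustrative dedup store: tracks which message ids were already handled.
let processed = HashSet<string>()

// Run the side effect only on first delivery; redeliveries are skipped.
let handle (messageId: string) (handler: unit -> unit) =
    if processed.Add messageId then
        handler ()   // first delivery: perform the work
        true
    else
        false        // redelivery (e.g. a broker requeue): safely ignored

// The same message delivered twice has an effect only once.
let mutable counter = 0
handle "msg-1" (fun () -> counter <- counter + 1) |> ignore
handle "msg-1" (fun () -> counter <- counter + 1) |> ignore
// counter = 1
```

With at-least-once delivery, which is what RabbitMQ gives you in practice, this kind of dedup check is what turns redeliveries from incidents into non-events.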

Third year, path to success

As you can probably infer from the title, the first release was, unsurprisingly, a success.
Giving the team free rein, maintaining transparency and open communication, working in iterations, and focusing on outcome (rather than output) were crucial factors in mitigating risks and achieving the desired objective.

Output vs. Outcome

Additionally, we made the decision to shift from Scrum to Kanban and to adopt a no-estimates approach.
We also began using Event Modeling more frequently, often through mob-modelling sessions, instead of relying on Jira tickets.

We began placing more emphasis on BDD instead of TDD. With F#, we were able to create a clean DSL and produce human-readable tests.
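As a flavor of the kind of DSL F# makes cheap to build, here is a toy Given/When/Then sketch over a minimal counter decider. It is illustrative only, not our actual test suite:

```fsharp
// Toy decider: a counter that can only be incremented.
type Command = Increment
type Event   = Incremented

let decide Increment (_state: int) = [ Incremented ]
let evolve (state: int) Incremented = state + 1

// Three tiny functions are enough for a readable BDD-style DSL.
let Given events = events
let When command events = (events, command)
let Then expected (events, command) =
    let state  = List.fold evolve 0 events
    let actual = decide command state
    if actual <> expected then
        failwithf "expected %A but got %A" expected actual

// The specification reads almost like prose:
Given [ Incremented; Incremented ]
|> When Increment
|> Then [ Incremented ]
```

Because the decider is pure, each scenario is just data in, data out: no mocks, no database, and failures print the expected and actual events directly.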

At 90% migration progress, the 600k-line C# codebase had shrunk to 65k lines of F# plus 10k lines of TypeScript (Vue.js).

Architecture involves more than simply creating diagrams and choosing the right technology; it is a socio-technical process that involves modeling, communication, and discovery.
Decisions should flow from that and especially not from an ivory tower.

🖖 Thank you for reading this article. I hope you found it informative and enjoyable. I will be sharing more technical details in future posts, but in the meantime, feel free to check out the following repository which is a POC of the app design.

GitHub: akhansari / EsBankAccount

Bank account kata and Functional Event Sourcing in F#. An F# template/POC about Functional Event Sourcing, Onion Architecture, and WebAssembly.

Want to file an issue or a suggestion? Please feel free to create a new issue and/or a pull request, or start a new discussion for questions, ideas, etc.

Why?

F#

Empowers everyone to write succinct, robust, and performant code.
It enables you to write backend (taking advantage of the .NET ecosystem) as well as frontend (transpiled to JS or compiled to Wasm) applications.

Functional Event Sourcing

Fully embraces immutability and expressions, in addition to other more traditional ES perks.

Onion Architecture

Leads to more maintainable applications, since it emphasizes separation of concerns throughout the system.
It even comes naturally with F#, e.g. through composition and higher-order functions.

WebAssembly

Facilitates the development of powerful UIs and back-office apps with minimal effort.
