10 years ago, Martin Fowler wrote an article titled “StranglerApplication”. In short, a StranglerApplication is a deliberate fade-out of an existing system by gradually letting a new version of the system take over. The article also links to another article by Paul Hammant, where further case studies are described, all using the StranglerApplication approach. I would like to add my experiences/case study here.
We’re in the middle of rolling out a new version of a critical system, using the StranglerApplication tactic. The system is a job management/scheduling system for maritime pilots, where ships can order a pilot, and a team of dispatchers is then responsible for planning the best assignments for all orders each day. But what the system does is really not that important here – the main lesson that I want to share is that if you feel like you have no way to do a strangler approach, you should really think twice.
Think twice before rejecting the idea
Initially, we had a hard time realizing that we could replace the system using a StranglerApplication approach. (That was actually before we knew that Fowler had coined the term/idea).
We were locked into a train of thought along the lines of:
- We have to replace the old system completely before we can go live. Otherwise, users will be disappointed/reject the new system.
- There is no way the users are going to use both systems in parallel – they don’t have the time to do that.
- We cannot get users to keep both systems up-to-date, so data will never be current in the new system.
Now, three weeks after we went “live” with the Strangler, I can say that none of these objections should stop you. Here’s why:
You don’t have to replace the old system entirely
Since you can use shorter release cycles with a strangler, you can avoid a lot of the unnecessary features that cut-over rewrites often generate.
I have already seen this in practice as well: Our users started to come up with new ideas as soon as they saw the 0.1 release in the production environment, next to the old system. In fact, I think one of the risks of a Strangler approach is that managing all these new user requests can be challenging. But then again: I think we should welcome new ideas rather than deal with all the unnecessary features generated by the “but the old system could do that” dogma that I have met before, when trying to do a “big bang” release.
Look for a “continuous migration” solution
One of our excuses for not doing a StranglerApplication at first was that we did not believe that our users were willing to update both systems. They did not want to (or “did not have time to”) do extra work just to keep the new system up-to-date, while still doing the “real” production jobs in the old system. This, I think, is still true. If you can get the users to do that for a limited period of time, fine, but in most cases, resistance will be too high.
In our case, it turned out that there was a migration path that would allow the users to do their jobs in the old system, while getting satisfactory updates into the new system periodically. A “continuous migration” solution, so to speak. From early on in the project, we decided that we wanted to maintain a migration path from the old to the new system. In our case this primarily means a mapping from the old database model to the new model. We are using SQL Server for the database at the backend, and we built an SSIS package for handling the migration from the old schema to the new, along with all data pertaining to the “situational picture” that is the most important part of the system.
Now, initially the SSIS package was set up to run once a day, at night, in order to get good quality test data into the dev/test environment. When we took a look at the work actually done by the package, the core part took approx. 35 seconds to complete each night. After a bit of tweaking and adjustment, the time was cut to 15-20 seconds per run. (Obviously, we’re not migrating huge amounts of data here – it is a matter of getting the past 14 days of work data migrated into the new system, around 100-200MB.) So we began to see an opportunity for running the migration with a higher frequency, which would give us a solution to the problem of users refusing to keep two systems up-to-date. We could live with a 15-minute delay in the new system, so we are now running the migration every 15 minutes. The users know that whatever work they decide to do in the new system alone might be wiped out at the next :00, :15, :30 or :45 minute. They can live with that. It still gives them an opportunity to see how realistic data, taken directly from production, looks and feels in the new system, and they can even perform any operation that has been released in the new system as well, without the risk of breaking anything.
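To make the idea a bit more concrete, here is a minimal sketch of such a “continuous migration” step. It is not our SSIS package: it uses Python with the standard-library sqlite3 module as a stand-in for SQL Server, and the table and column names (old_orders, pilot_order and so on) are made up for illustration. The two points it tries to show are that each run wipes and reloads the new system’s data from the old system’s last 14 days, and that runs are aligned to the quarter hour.

```python
import sqlite3
from datetime import datetime, timedelta

WINDOW_DAYS = 14  # roughly the past 14 days of work data are migrated


def migrate(old_db: sqlite3.Connection, new_db: sqlite3.Connection,
            now: datetime) -> int:
    """Rebuild the new system's table from recent rows in the old system.

    Returns the number of rows copied. Rows that exist only in the new
    system are overwritten on every run.
    Table and column names are hypothetical.
    """
    cutoff = (now - timedelta(days=WINDOW_DAYS)).isoformat()
    rows = old_db.execute(
        "SELECT order_no, ship, ordered_at FROM old_orders "
        "WHERE ordered_at >= ?",
        (cutoff,),
    ).fetchall()
    # One transaction, so the wipe and the reload are committed together.
    with new_db:
        new_db.execute("DELETE FROM pilot_order")
        new_db.executemany(
            "INSERT INTO pilot_order (id, vessel_name, requested_at) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def next_run(now: datetime, interval_minutes: int = 15) -> datetime:
    """Return the next :00, :15, :30 or :45 boundary strictly after now."""
    floored = now.replace(
        minute=now.minute - now.minute % interval_minutes,
        second=0,
        microsecond=0,
    )
    return floored + timedelta(minutes=interval_minutes)
```

A scheduler would call migrate at each time returned by next_run. The wipe-and-reload design is only reasonable because the data set is small and a run takes seconds; with larger volumes an incremental mapping would probably be needed.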
All in all, I am very happy that we came around and found that a StranglerApplication approach was indeed possible here. It has already given us the following benefits:
- The users are comfortable. They can now try things out, and start to get used to the new system, without having to fear a Big Bang release.
- Top management is (a bit) happier. Now we can say that we have released into the production environment, and we can release a new version within 15 minutes. We can start to do shorter and shorter cycles, and increasingly work in an even more agile mode.
- Developers are happier. The pressure has been lifted a bit. Just “a bit” can mean a lot, if everyone around you is asking “when will you be done?” most of the time.
- Ideas are verified by more users, and new ideas are starting to develop. It’s a good thing. Yes, it can be harder to manage, but it is still a good thing. After all, system development is all about putting great ideas to good use in solid systems.