Two weeks ago, I participated in an online panel on the subject of CD for Legacy Applications, as part of Continuous Discussions, a community initiative by Electric Cloud presenting a series of panels about Agile, Continuous Delivery and DevOps.
It gave me the opportunity to gather some learnings from my colleagues at ThoughtWorks, and to reflect on my experience working with a variety of enterprise applications not built for Continuous Delivery.
What is a Legacy Application?
I find it interesting how legacy is a really scary word when talking about software applications. After all, a definition of legacy is “a gift handed from the past”. The issue with old software applications is that they were written at a time when system constraints were very different from now, and in languages designed to run with a small memory footprint, such as Assembler, COBOL, or FORTRAN. As we are losing expert knowledge of those languages, the ability of today’s programmers to understand them well enough to refactor a legacy codebase (or replace it with some Ruby, .NET or GoLang) is reduced.
IBM 3270 terminal screen
But legacy applications are not only those written in the late 80s. Some are actually being developed right now, under our noses. Someone, somewhere, is writing or releasing at this instant an application that is already legacy, because it is written by people whose constraints are different from ours. These people favour:
- Enterprise application servers over lightweight containers
- Large relational databases over small bounded polyglot data-stores
- Complex integration middleware over dumb pipes
- Manual installers over configuration scripts
- UI based development environments over CLIs
- A multi-purpose product suite over many single purpose applications
That is, while there is value in the items on the right, their organisation is prescribing the items on the left. And somehow these systems have ended up right at the core of our entire IT infrastructure (remember ESBs?). So they are as hard to get rid of as an IBM mainframe.
Whether deciphering the codebase is an archeological endeavour, or the application has not been built for Continuous Delivery, the common denominator of legacy applications is that they are hard to change. Here are some of their characteristics that make it so:
- Coupled to (or at the core of) an ecosystem of other applications
- Composed of multiple interdependent modules
- Designed around a large relational database (that may contain business logic)
- Release and deployment processes are manual and tedious
- There are few or no automated tests
- Run on a mainframe or in a large enterprise application server / bus
- Use an ancient programming language
- Only the aficionados dare touch them
What is Continuous Delivery?
Doing due diligence, let’s define Continuous Delivery as well. Continuous Delivery is the set of principles and techniques that enable a team to deliver a feature end-to-end (from idea to production) rapidly, safely, and in a repeatable way. It includes technical practices such as:
- Continuous integration of small code changes
- Incremental database changes (aka migrations)
- A balanced test automation strategy (aka pyramid of tests)
- Separation of configuration and code
- Repeatable and fast environment provisioning
- Automated deployments
- Adequate production logging & monitoring
Implementing Continuous Delivery is not just a matter of good DevOps. It also flows down to application design. For instance, it will be quite hard to script configuration changes if the application itself does not separate configuration from code. Likewise, environments cannot be provisioned quickly if an 80GB database needs to be created for the app to deploy successfully. Designing an application for Continuous Delivery is really important. Some of these good practices are echoed in The Twelve-Factor App specification, which is a good reference for whoever wishes to build web applications for the cloud.
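To make the separation of configuration and code more concrete, here is a minimal Python sketch, assuming an application that reads its settings from environment variables. The `Settings` class, the `load_settings` function, and the `APP_*` variable names are all hypothetical and only serve the illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Everything that varies between environments lives here, not in the code."""
    database_url: str
    log_level: str
    feature_new_checkout: bool


def load_settings(env=None):
    """Build Settings from environment variables (hypothetical APP_* names).

    The defaults are only meant for a developer workstation; every other
    environment is expected to set the variables explicitly.
    """
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("APP_DATABASE_URL", "sqlite:///local.db"),
        log_level=env.get("APP_LOG_LEVEL", "INFO").upper(),
        feature_new_checkout=env.get("APP_FEATURE_NEW_CHECKOUT", "false").lower() == "true",
    )
```

With something like this in place, a deployment script can change the behaviour of the application by setting variables, without rebuilding or repackaging it.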
If you are the lucky winner of the legacy lottery, and have to face the prospect of working with a legacy application, what should you do? The two options are:
- Kill it – by working around it and slowly moving its capabilities to new software
- Improve it – so that you can make changes and it is not so scary anymore
The decision whether to kill it or improve it comes down to one question raised by my colleague John Kordyback: is it fit for purpose? Is it actually delivering the value that your organisation is expecting? I worked for a retail company that used Microsoft SharePoint as a web and security layer. Although it was extremely painful to develop on this platform, none of our applications were using any of the CMS features of SharePoint. Instead, it was used to provide Single Sign On access to new and legacy apps, as well as to integrate easily with Active Directory. It turned out that both of those features were readily available to .NET4.1 web applications (back in 2013), so we moved to plain old MVC apps instead.
Fitness for purpose should also be measured by an organisation’s incentive to invest in the application as a core capability. If program managers are trying to squeeze all development effort into the BAU/Opex part of the budget, that is a sign that end of life should be near.
If, on the other hand, a system is genuinely fit for purpose, and there is a strong drive to keep it for the long term (at an executive level – don’t try to be a hero), then understanding where to invest and make improvements is the next logical step.
How to Improve Legacy?
The main hurdle is notoriously the people. Culture, curiosity, and appetite for change are always key when introducing Agile, Continuous Delivery, Infrastructure as Code, or other possibly disruptive techniques. Teams and organisations that have been doing the same thing for a long time are those that are hard to convince of the benefits of change. Some of their developers probably still think of Continuous Delivery as something that is passing them by. One answer to that could be to start with continuous improvement. Find out what is really hard for the “legacy” team and regularly agree on ways to improve it.
To do that, it is important to have a good idea of where the pain really is. Visualising the current state of your legacy codebase, application, or architecture is key. You could for instance look for parts of the application that change often (e.g. based on commits), or parts of the code that are extremely complex (e.g. using static code analysis). The picture below shows a series of D3.js hierarchical edge bundling graphs drawn from analysing static dependencies of several back-end services. As you can see, the one on the bottom right is the likely candidate for refactoring.
Visualisation of static dependencies of multiple back-end services
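The commit-based view mentioned above can be approximated with very little tooling. The following Python sketch assumes the codebase lives in a Git repository and counts how many commits touched each file. The function names `count_changes` and `hotspots` are hypothetical, and the count is only a rough proxy for pain (renamed files, for instance, are counted under each of their names).

```python
import subprocess
from collections import Counter


def count_changes(log_output):
    """Count how many commits touched each file.

    Expects the output of `git log --name-only --pretty=format:`,
    which lists the files of each commit, separated by blank lines.
    """
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def hotspots(repo_path=".", top=10):
    """Return the `top` most frequently changed files of a Git repository."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_changes(result.stdout).most_common(top)
```

Calling `hotspots("path/to/repo")` returns a list of `(file, number_of_commits)` pairs, which can then be compared with the results of static analysis to decide where to look first.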
If you face a legacy codebase that needs refactoring, reading the book Working Effectively with Legacy Code by Michael Feathers is a must. In his book, Michael provides techniques and patterns for breaking dependencies, which would prove useful if you had to deal with the codebase on the bottom right of the picture above.
But before the team embarks on a large refactor, you will want to encourage them to have good build, test, and deployment practices. These days there is hardly any language that does not have its unit test library (if not, write your own). These days there is hardly any enterprise application server or web server that does not come with a scripting language or command line API (e.g. appcmd for IIS, wlst for WebLogic, wsadmin for WebSphere). These days there is hardly any platform that does not have its UI automation technology (e.g. x3270 for IBM mainframes, Win32 API for Windows Desktop applications).
Enabling your team to build, test, and deploy a code change programmatically is the cornerstone of Continuous Delivery for any system, including legacy, and should be the first thing to aim for.
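As a rough illustration of what "programmatically" can mean, here is a minimal Python sketch of a script that chains build, test, and deploy steps and stops at the first failure. The `run_pipeline` function and the `STEPS` list are hypothetical, and the commands are placeholders: in a real setup each one would be replaced by whatever command line tool your platform offers.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        print(f"--> {name}: {' '.join(command)}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Step '{name}' failed with exit code {result.returncode}")
            break
        completed.append(name)
    return completed


# Placeholder commands: each one stands in for the real build, test, or
# deployment command of the platform at hand.
STEPS = [
    ("build", [sys.executable, "-c", "print('compiling...')"]),
    ("test", [sys.executable, "-c", "print('running tests...')"]),
    ("deploy", [sys.executable, "-c", "print('deploying...')"]),
]
```

Running `run_pipeline(STEPS)` executes the three steps in sequence. Because a failing step stops the run, a broken build or a red test never reaches deployment, which is the behaviour a build server would later enforce on every commit.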