What is legacy code?

28 April 2025

Authors

Cole Stryker

Editorial Lead, AI Models

What is legacy code?

Legacy code refers to software code that still serves its purpose but was developed using now outdated technologies. It encompasses code inherited from another team or an older software version and source code no longer actively supported or maintained. It also includes code written for obsolete hardware or operating systems, or written with phased-out compilers, application programming interfaces (APIs), programming languages or software development environments. As a result, legacy code often no longer complies with new coding standards, current software design principles or the latest computing architecture.

In his 2004 book Working Effectively with Legacy Code, Michael Feathers offered another description: “code without tests.”1 Under this definition, programmers have no reliable way of verifying that the code works as expected. Many legacy systems also lack adequate documentation vital to understanding their behavior, which makes extending or enhancing them a heavy lift for developers.

Challenges of maintaining legacy code

Legacy code contributes to technical debt, which needs to be “repaid” over time through ongoing code maintenance. Here are a few common challenges that organizations might encounter while maintaining legacy code:

    ● Adaptability

    ● Cost

    ● Performance

    ● Scalability

    ● Security and compliance

Adaptability

Because of its outmoded nature, legacy code can be incompatible or difficult to integrate with more modern systems. This lack of adaptability can impede innovation and slow business growth, with enterprises potentially losing their competitive edge.

Cost

Legacy systems can be expensive to maintain. These operational and maintenance costs can add up, with third-party support fees increasing for older software and hardware versions. Moreover, finding developers skilled in outdated computing practices or programming languages can be challenging and comes at a price.

Performance

Clunky, monolithic architectures can lead to high latency, slow response times and frequent downtime. This sluggish performance can negatively affect user experience, lowering customer satisfaction. It can also hamper productivity and efficiency for team members working with and maintaining these systems.

Scalability

Outdated systems can struggle under rising user loads. They have trouble meeting a surge in demand and scaling up or down as required. Their tightly coupled components also make it difficult to upgrade existing functionality or add new features.

Security and compliance

Old code might not be actively updated with security patches or follow the latest security standards, so it becomes vulnerable to cyberattacks and breaches. Legacy systems might also fall out of compliance with current regulations.

How to modernize legacy code

Modernizing legacy applications requires careful planning. Here’s a 5-step methodology to help streamline the process:

    ● Understand the codebase

    ● Divide and conquer

    ● Craft characterization tests

    ● Refactor, migrate or rewrite

    ● Test and document

Understand the codebase

The first step is to understand the codebase, and it’s usually the most difficult part. Start by reviewing any available documentation, be it requirements documents, inline code comments or version control history such as commit logs or change logs.

When documentation is insufficient, try using static code analysis tools that automatically examine code without running it. Additionally, code visualization tools can create a graphical representation of the source code structure, helping map out dependencies and interactions between elements.
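As a rough illustration of what static analysis means in practice, the sketch below uses Python's standard `ast` module to parse a snippet and list which functions call which, without ever running the code. The snippet in `LEGACY_SOURCE` and the helper name `map_calls` are hypothetical, made up for this example. Real analyzers do far more than this.

```python
import ast

# Hypothetical legacy snippet, held as text so it is parsed but never executed.
LEGACY_SOURCE = '''
def load_rates():
    return {"standard": 0.2}

def compute_tax(amount):
    rates = load_rates()
    return round(amount * rates["standard"], 2)

def format_total(total):
    return "TOTAL: %.2f" % total

def print_invoice(amount):
    total = amount + compute_tax(amount)
    print(format_total(total))
'''


def map_calls(source: str) -> dict:
    """Return {function name: set of locally defined functions it calls}."""
    tree = ast.parse(source)
    functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    defined = {f.name for f in functions}
    graph = {}
    for func in functions:
        called = {
            node.func.id
            for node in ast.walk(func)
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        }
        # Keep only calls to functions defined in this source (drops builtins).
        graph[func.name] = called & defined
    return graph


if __name__ == "__main__":
    for name, callees in sorted(map_calls(LEGACY_SOURCE).items()):
        print(name, "->", sorted(callees))
```

A map like this is one simple way to see which functions sit at the center of a module and which have no callers or callees.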

Divide and conquer

Once software development teams have enough of a grasp of the legacy system, they can start tackling it. These sprawling codebases can be overwhelming to deal with, so divide them into smaller, more manageable modules and work on 1 module at a time.

Craft characterization tests

Tests are typically written to validate the correctness of code and its intended behavior. Characterization tests, by contrast, are written to capture what the code actually does today, whether or not that behavior was intended. This makes them useful for understanding legacy code and for detecting unintended changes once modification begins.2
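A minimal sketch of the idea, assuming a hypothetical legacy routine named `legacy_shipping_fee`: each test records a value that was observed by running the existing code, not a value taken from a specification. The functions use plain `assert` statements so a runner such as pytest can collect them.

```python
def legacy_shipping_fee(weight_kg, express):
    # Hypothetical legacy routine, kept exactly as it was found.
    fee = 5.0
    if weight_kg > 10:
        fee = fee + (weight_kg - 10) * 0.5
    if express:
        fee = fee * 2
    if fee > 30:
        fee = 30
    return fee


# Each expected value below was observed from the current code, not designed.

def test_light_parcel_pays_base_fee():
    assert legacy_shipping_fee(2, False) == 5.0


def test_heavy_parcel_pays_surcharge_per_kg_over_ten():
    assert legacy_shipping_fee(20, False) == 10.0


def test_express_doubles_the_fee():
    assert legacy_shipping_fee(20, True) == 20.0


def test_fee_is_capped_at_thirty():
    assert legacy_shipping_fee(100, True) == 30


def test_negative_weight_is_not_rejected():
    # Possibly a bug, but it is current behavior, so it is pinned down too.
    assert legacy_shipping_fee(-5, False) == 5.0
```

If a later change makes one of these tests fail, the team knows the observable behavior has shifted and can decide whether that was intended.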

Refactor, migrate or rewrite

Enterprises generally have 3 options when it comes to modernizing legacy code: refactor, migrate or rewrite. They can also combine any of these approaches. Deciding on which path to pursue requires the involvement of both the software engineering team and the business leadership team.

Code refactoring alters the internal structure of the source code without modifying its external behavior or impacting its functionality. Because refactoring proceeds in small changes, it is less likely to introduce bugs and can result in clear, clean code that’s more maintainable.

For legacy code, teams can begin with minor modifications for each module, including renaming variables, removing duplicate or unused methods and standardizing formatting. They can then proceed with more logic-based restructuring such as breaking down large methods into smaller ones, simplifying complex conditionals and moving features between functions to lessen dependencies and enhance cohesiveness.
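To make these steps concrete, here is a small before-and-after sketch. The pricing rule and all function names are invented for illustration. The refactored version renames a cryptic variable, extracts a helper and flattens the nested conditionals, while returning the same results as the original.

```python
def price_before(customer_type, years, total):
    # Legacy style: nested conditionals and a one-letter variable.
    d = 0
    if customer_type == "member":
        if years > 5:
            d = total * 0.15
        else:
            if years > 1:
                d = total * 0.10
            else:
                d = total * 0.05
    else:
        d = 0
    return total - d


def member_discount_rate(years):
    """Extracted helper: the discount rate for a member of this tenure."""
    if years > 5:
        return 0.15
    if years > 1:
        return 0.10
    return 0.05


def price_after(customer_type, years, total):
    # Guard clause replaces the outer if/else.
    if customer_type != "member":
        return total
    return total - total * member_discount_rate(years)


def behaves_the_same():
    """Check both versions against each other on a grid of sample inputs."""
    return all(
        price_before(kind, years, total) == price_after(kind, years, total)
        for kind in ("member", "guest")
        for years in (0, 1, 2, 5, 6, 20)
        for total in (0, 19.99, 100, 2500.0)
    )
```

Comparing the two versions on sample inputs, as `behaves_the_same` does, is one way to gain confidence that external behavior has not changed.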

Migration is another route toward modernizing legacy code. This entails migrating all or parts of the code to newer platforms or tech stacks, such as transitioning from a monolithic architecture to microservices or shifting from on-premises to the cloud. It’s important to check compatibility with the platform or tech stack and confirm whether providers offer any support during migration.

Rewriting legacy code is often the last resort because it involves creating entirely new code to replace the old code. This is a new project in itself—a huge undertaking that might require a separate development team to handle.

Both migration and rewriting can be daunting tasks for huge legacy codebases, so teams can consider the “strangler fig” strategy.3 A strangler fig grows high on a tree’s surface, its roots descending to the ground, slowly wrapping its host tree in a constricting lattice that eventually causes it to wither away.

In terms of legacy systems, teams can incrementally migrate or rewrite tiny code fragments until the entire codebase has been switched to a modern framework or developed in a current programming language. However, teams must build a transitional architecture for the existing code and the new code to coexist. This transitional architecture will then be decommissioned once the migration or rewrite is complete.3
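One simple form a transitional architecture can take is a routing facade that sits in front of both systems. The sketch below is an assumption about how that might look, not a prescribed design: `StranglerFacade`, `legacy_handle` and `new_invoice_handler` are hypothetical names. Requests go to the new code for routes that have been migrated and fall through to the legacy system for everything else.

```python
def legacy_handle(route, payload):
    # Stand-in for the existing system, which serves every route.
    return f"legacy:{route}"


def new_invoice_handler(payload):
    # Stand-in for a rewritten piece of functionality.
    return "modern:/invoices"


class StranglerFacade:
    """Routes each request to new code if migrated, otherwise to legacy."""

    def __init__(self, legacy):
        self._legacy = legacy
        self._migrated = {}

    def migrate(self, route, handler):
        # Called once per route as each fragment is rewritten.
        self._migrated[route] = handler

    def handle(self, route, payload):
        handler = self._migrated.get(route)
        if handler is not None:
            return handler(payload)
        return self._legacy(route, payload)

    def fully_migrated(self, all_routes):
        # When this is true, the legacy system and the facade can be retired.
        return set(all_routes) <= set(self._migrated)


facade = StranglerFacade(legacy_handle)
before = facade.handle("/invoices", {})   # served by the legacy system
facade.migrate("/invoices", new_invoice_handler)
after = facade.handle("/invoices", {})    # now served by the new code
untouched = facade.handle("/reports", {})  # still served by the legacy system
```

Callers only ever talk to the facade, so each route can be switched over without them noticing.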

Test and document

It’s crucial to thoroughly test refactored, migrated or rewritten code to make sure that no bugs appear. Developers can write their own integration and unit tests, but it’s also essential to involve QA teams who can run functional, regression and end-to-end testing to check that features and behaviors are intact.
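For rewritten code, one basic regression check is to run the old and new implementations side by side on the same inputs and report any differences. The sketch below assumes two hypothetical functions, `legacy_format_name` and `modern_format_name`, and a helper `find_regressions`. It does not replace functional or end-to-end testing.

```python
import itertools


def legacy_format_name(first, last):
    # Hypothetical legacy implementation.
    result = ""
    if last:
        result = last.upper()
    if first:
        if result:
            result = result + ", " + first
        else:
            result = first
    return result


def modern_format_name(first, last):
    # Hypothetical rewrite that should behave identically.
    parts = [last.upper() if last else "", first]
    return ", ".join(part for part in parts if part)


def find_regressions(old, new, cases):
    """Return (args, old result, new result) for every case that differs."""
    return [
        (args, old(*args), new(*args))
        for args in cases
        if old(*args) != new(*args)
    ]


# Every combination of present and missing first and last names.
CASES = list(itertools.product(["Ada", ""], ["Lovelace", ""]))
regressions = find_regressions(legacy_format_name, modern_format_name, CASES)
```

An empty `regressions` list means the two versions agree on every case tried. It says nothing about inputs that were not included.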

Documentation is another critical part of the modernization workflow. Document source code changes, whether by annotating code through inline comments, creating detailed change logs or writing comprehensive architecture and design documents and other technical documentation.


Tools for modernizing legacy code

Several tools can help speed up and automate the process of modernizing legacy code. Here are some common ones:

    ● Static code analyzers

    ● Code visualization applications

    ● Test automation frameworks

    ● Migration platforms and toolkits

    ● Document generators

Static code analyzers

Static analyzers can help check legacy code for programming flaws, quality issues and even security vulnerabilities. Many static code analysis tools support legacy programming languages such as C, COBOL, PL/I and RPG. Examples of static code analyzers include CodeSonar, Klocwork, the open source PMD and SonarQube.

Code visualization applications

Code visualizers represent source code graphically to provide a better picture of how it works, especially for large or complex codebases. These graphical representations come in different formats such as code maps, flowcharts and unified modeling language (UML) diagrams. Examples of code visualization apps are CodeScene, CodeSee and Understand, among others.
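The tools named above are full products. As a much smaller illustration of the underlying idea, the sketch below turns a module dependency map into Graphviz DOT text, which the Graphviz `dot` command can render as a diagram. The `DEPENDENCIES` data and the function name `to_dot` are hypothetical.

```python
# Hypothetical dependency map: module name -> modules it depends on.
DEPENDENCIES = {
    "billing": ["db", "tax"],
    "tax": ["db"],
    "reports": ["billing"],
    "db": [],
}


def to_dot(dependencies, name="legacy"):
    """Render a dependency map as Graphviz DOT text for a directed graph."""
    lines = [f"digraph {name} {{"]
    for module in sorted(dependencies):
        lines.append(f'    "{module}";')
        for target in sorted(dependencies[module]):
            lines.append(f'    "{module}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(DEPENDENCIES))
```

Saving the output to a file and running it through Graphviz, for example with `dot -Tsvg`, produces a picture in which heavily depended-on modules such as `db` stand out.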

Test automation frameworks

These frameworks help teams create and run automated tests and produce reports on those tests. Popular test automation frameworks include Cypress and Selenium for web applications and Appium for mobile apps.

Migration platforms and toolkits

These platforms and toolkits help simplify and automate migration workflows for legacy systems. Some major migration platforms are AWS Application Migration Service, Azure Migrate, Google Cloud migration toolkits, IBM Cloud Transformation Advisor and Red Hat Migration Toolkit for Applications.

Document generators

These tools automatically generate documentation from source code and other input files. Examples of document generation tools are Doxygen, Sphinx and Swimm, among others.
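The core idea behind such tools can be sketched in a few lines: read the source, pull out the documentation embedded in it, and emit a reference listing. The example below uses Python's standard `ast` module on a hypothetical snippet. The name `generate_reference` is made up here and is not how any of the tools above are implemented.

```python
import ast

# Hypothetical source file contents, parsed as text and never executed.
SOURCE = '''
"""Billing helpers inherited from the old invoicing system."""

def compute_tax(amount):
    """Return the tax owed on amount at the standard rate."""
    return amount * 0.2

def legacy_helper(x):
    return x
'''


def generate_reference(source):
    """Build a plain-text reference from module and function docstrings."""
    tree = ast.parse(source)
    lines = [ast.get_docstring(tree) or "(module undocumented)", ""]
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node) or "(undocumented)"
            lines.append(f"{node.name}: {doc}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_reference(SOURCE))
```

Entries marked as undocumented also give teams a quick list of where documentation is missing.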

AI for modernizing legacy code

Artificial intelligence (AI) can aid in legacy code modernization. Generative AI applications are backed by large language models (LLMs) that can analyze complex or huge legacy codebases.

Generative AI can be employed to assist with these legacy code modernization tasks:

    ● Code explanation

    ● Code refactoring

    ● Code transformation

    ● Test generation and documentation

Code explanation

Generative AI can understand the context and semantics underpinning legacy codebases. This makes it capable of outlining the logic and function behind that code and explaining it in a way programmers can understand.

Code refactoring

AI-powered tools can offer real-time refactoring recommendations. For instance, IBM® watsonx Code Assistant™ harnesses IBM Granite® models to identify bugs and optimizations. It then suggests targeted fixes that align with a team’s established coding conventions, helping simplify and accelerate code refactoring.

Code transformation

AI systems can suggest ways to translate source code from a legacy programming language to a more modern one. For example, IBM watsonx Code Assistant for Z blends automation and generative AI to help developers modernize mainframe applications. Its generative AI capabilities include code explanation for COBOL, JCL and PL/I and conversion of COBOL to Java code.

Test generation and documentation

Like test automation frameworks, AI coding assistants can also generate tests automatically. Also, they can create inline comments to document what certain code fragments or snippets do.

As with any AI-powered application, programmers must still exercise caution when using AI for modernizing legacy code. They must review the outputs for accuracy and test any suggested changes or fixes.

Footnotes

1 #195 - Working Effectively with Legacy Code and AI Coding Assistant - Michael Feathers, Tech Lead Journal, 14 October 2024

2 Characterization Testing, Michael Feathers, 8 August 2016

3 Strangler Fig, Martin Fowler, 22 August 2024