What is code quality?

Code quality, and how to improve it

Code quality refers to the robustness of code beyond whether or not it simply runs and performs its desired function. High-quality code is differentiated by its efficiency, maintainability, readability and reusability, whereas low-quality code is brittle, difficult to parse and prone to piling up technical debt over time.

High coding standards are to the software development process what proper mise en place and “working clean” are to the operation of a commercial kitchen. Practices that elevate the quality of your code can yield better functionality in the short term, but their more important benefits are fewer issues, faster progress and lower maintenance costs in the long term.

The long-term benefits of higher-quality code can sometimes be difficult for programmers to communicate to management less versed in the minutiae of the software development lifecycle. Balancing the holistic benefits of optimal code with the immediate pressures of business priorities often entails complex tradeoffs. That said, a 2022 study of 39 proprietary production codebases asserted, among other findings,1 that: 

  • Technical debt from rushed code wastes up to 42% of developers’ time.

  • Low-quality code leads to 15 times more defects than high-quality code.

  • Resolving issues in low-quality code takes (on average) 124% more time than resolving issues in high-quality code.

Higher-quality code increases the ease and speed of understanding, refactoring, debugging and adding new features to a codebase. Clear, consistent, well-written code facilitates smoother coordination across development teams and reduces the complexity and complications of code changes. It drives not only strong software quality, but also strong developer experience and user experience.

Whether a piece of code compiles and successfully executes its purpose at runtime is not enough to determine its overall quality. Writing code is not like completing a crossword puzzle, in which there exists one single way to correctly complete your task: there are often countless solutions to a given coding problem. Functionality therefore represents the mere baseline for acceptable code. The value of high-quality code is manifested in its secondary effects on the context around it.

What defines high code quality?

High-quality code is a relatively abstract concept. The overall quality of a piece of code is defined as much by how it’s made and how it interacts with the greater codebase it exists within as by any specific set of discrete, objective code quality metrics (though many such metrics exist).

Common traits of high-quality code include:

  • Readability: Code readability is essential to maintenance, debugging and coordination across teams and time. Can another team member easily understand your code? Can another programmer, working years from now, accurately interpret your code without you being around to provide context?

  • Maintainability: Code complexity is often inversely correlated with code maintainability. Is your code easily testable, enabling your team to efficiently evaluate it for security vulnerabilities and optimization opportunities? Is it robust enough to add new features without breaking core functionality, or is it too brittle to adjust without requiring major refactoring? Prioritizing maintainable code might entail additional upfront hassle, but saves significant time and energy moving forward.

  • Efficiency: Well-written code can reduce a system’s latency and resource consumption. For instance, carefully chosen data structures can minimize the number of CPU operations required for a given function. Thoughtful data caching strategies cut down on costly input/output (I/O) operations by eliminating redundant database queries and network requests, while promptly releasing unused memory avoids unnecessary RAM bloat.

  • Reliability: Fewer defects and structures that are robust to code changes mean fewer failures and less downtime. Reliability is essential to user experience and trust, as well as to the health of critical systems on which your company relies.
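
To make the efficiency trait above more concrete, here is a minimal Python sketch of two of the ideas it mentions: choosing a data structure that suits the operation, and caching the result of a costly call. The data, the function names and the "expensive" lookup are all hypothetical stand-ins chosen for illustration, not a recommended design.

```python
from functools import lru_cache

# Hypothetical data: the same 100,000 IDs stored two different ways.
ALLOWED_IDS_LIST = list(range(100_000))
ALLOWED_IDS_SET = set(ALLOWED_IDS_LIST)


def is_allowed_list(user_id: int) -> bool:
    # A list membership test scans elements one at a time (O(n)).
    return user_id in ALLOWED_IDS_LIST


def is_allowed_set(user_id: int) -> bool:
    # A set membership test is a hash lookup (O(1) on average).
    return user_id in ALLOWED_IDS_SET


@lru_cache(maxsize=1024)
def fetch_display_name(user_id: int) -> str:
    # Stand-in for a costly database query or network request.
    return f"user-{user_id}"


if __name__ == "__main__":
    print(is_allowed_list(99_999), is_allowed_set(99_999))
    fetch_display_name(42)  # first call does the "expensive" work
    fetch_display_name(42)  # second call is answered from the cache
    print(fetch_display_name.cache_info())
```

Both membership functions return the same answers; only the amount of work per call differs. Whether a cache is appropriate depends on how often the underlying data changes, so treat the cache size here as an arbitrary placeholder.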

Writing in Harvard Data Science Review in 2023, researchers from Calvin University, Amherst College and Columbia proposed a prescriptive framework for good code: “The Four C’s.”2

  • Correctness: The code does what it’s supposed to do. The authors emphasized two corollaries of this obvious inclusion: “First, correctness is a necessary but insufficient metric for good code. Second, the other goals support and promote correctness.”

  • Clarity: Anyone reading and writing the code can tell what it’s intended to do and intuitively make modifications as necessary.

  • Containment: Avoid sprawl, redundancy and unnecessary dependencies. Proper containment entails, among other things, “using functions to contain reusable code, and keeping code used across files or projects in a module or package.”

  • Consistency: A codebase should maintain internal consistency of style, naming conventions, commentary, indentation and other practices.

As modern software development is increasingly driven by agentic coding assistants such as IBM Bob, containment and consistency are particularly useful for maximizing the efficacy and accuracy of automated tools. That said, the ability of next-generation platforms like IBM Bob to identify AI refactoring opportunities in real time can reduce the effort needed to enforce that level of stylistic discipline.

Best practices for high-quality code

Though each programming language and use case has its own specific nuances and granular considerations, there exist some universal best practices for quality code in any scenario.

One approach to conceptualizing high code quality is to simply think of it as code that avoids as many markers of bad code—which will be explored later in this article—as possible.

For a more normative approach, the aforementioned Harvard Data Science Review (HDSR) paper outlines a series of guidelines for ensuring code quality. Though these guidelines are ostensibly optimized for the needs of data scientists, most are applicable across any coding discipline.

Choose good names

Strong naming conventions are essential to code readability and consistency. The authors suggest the following practices:

  • The length of names should be proportional to their scope. The greater the distance—whether in terms of time, lines of code or organizational structure—between a term’s initial definition and its use, the more essential it is that its name clearly communicates its role.

  • Keep a digest of abbreviations. Short or even single-character variable names can help declutter code, but they can also become inscrutable to people less familiar with the project. Maintaining a “glossary” of sorts mitigates that tradeoff.

  • Use capitalization consistently. It’s also wise to avoid instances in which two names are differentiated solely by their capitalization.

  • Avoid nondescript names. They significantly elevate the difficulty of code interpretation.

  • Choose file naming conventions that sort naturally. For instance, you can adopt the ISO 8601 standard for date and time, or pad numbers with 0s to ensure that they all have the same number of digits.
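
The last point in the list above is easy to demonstrate. The short Python sketch below uses hypothetical file names (the `report_` and `_export.csv` patterns are assumptions made for illustration) to show why unpadded numbers sort badly, and how zero-padding and ISO 8601 dates avoid the problem.

```python
from datetime import date

# Unpadded numbers sort as text, so "10" lands before "2".
unpadded = ["report_10.csv", "report_2.csv", "report_1.csv"]
print(sorted(unpadded))
# ['report_1.csv', 'report_10.csv', 'report_2.csv']


def padded_name(index: int, width: int = 3) -> str:
    # Zero-padding gives every number the same number of digits.
    return f"report_{index:0{width}d}.csv"


def dated_name(day: date) -> str:
    # ISO 8601 (YYYY-MM-DD) puts the most significant unit first,
    # so alphabetical order matches chronological order.
    return f"{day.isoformat()}_export.csv"


padded = sorted(padded_name(i) for i in (10, 2, 1))
print(padded)
# ['report_001.csv', 'report_002.csv', 'report_010.csv']

days = [date(2024, 12, 1), date(2024, 2, 15), date(2023, 11, 30)]
print(sorted(dated_name(d) for d in days))
# ['2023-11-30_export.csv', '2024-02-15_export.csv', '2024-12-01_export.csv']
```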

Clear and consistent naming conventions can also help simplify the act of prompting AI coding assistants and increase the accuracy of their output. For instance, rather than prompting an agent to inspect certain types of variables or explore all files from a certain date range—which might require your AI agent to probabilistically deduce from context which variables or files meet the criteria—you can explicitly prompt an agent to explore all files beginning with a specific number, or all variables with a specific name.

Follow a style guide consistently

In addition to codifying strong naming conventions, a comprehensive style guide should ideally standardize formatting elements including use of white space and indentation, commenting and data types, as well as “coding dialects.” A wide array of proven style guides for various programming languages can be found on GitHub, such as those included in this curated list.

A style guide is also an invaluable reference for an AI coding assistant, serving as context for specific tasks or even as part of your AI agent’s system prompt.

Select a coherent and minimal (but adequate) toolkit

Leveraging toolkits and libraries is a natural boon for code reusability and efficiency, helping to standardize workflows and outputs across different teams and broadly accelerate code creation. They’re particularly useful when code must mediate interactions with complex third-party systems or handle repetitive, “solved” problems.

But over-reliance on toolkits can add unnecessary bloat and introduce external dependencies, reducing code’s robustness and maintainability. They also have a tendency to abstract away the code’s underlying logic, decreasing its readability. The HDSR authors articulate the ideal balance in simple terms: “We want our toolkit to be as simple as possible, but no simpler.”

Don't repeat yourself (DRY)

If you find yourself frequently copying, pasting and modifying the same code blocks, you might be better served by a function that encapsulates the repeated code in one place. The parameters of that function can reflect the elements that change from instance to instance. This cleans up code and simplifies maintenance, as it allows you to adjust all instances of that function in a single step and place (rather than individually adjusting every duplicate of a code block).
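
As a rough illustration of the refactoring described above, the Python sketch below replaces a block that might otherwise be pasted once per metric with a single function. The function name `summarize`, its parameters and the metric labels are hypothetical; the parameters capture the parts that varied between the pasted copies.

```python
from typing import Sequence

# Before: the same lines pasted and tweaked for every metric, e.g.
#   latency_mean = sum(latencies) / len(latencies)
#   print(f"latency_ms: mean={latency_mean:.2f}, n={len(latencies)}")
#   cpu_mean = sum(cpu_loads) / len(cpu_loads)
#   print(f"cpu_pct: mean={cpu_mean:.2f}, n={len(cpu_loads)}")


def summarize(label: str, values: Sequence[float], precision: int = 2) -> str:
    # After: one definition; the label, data and precision are parameters.
    if not values:
        raise ValueError(f"no values to summarize for {label!r}")
    mean = sum(values) / len(values)
    return f"{label}: mean={mean:.{precision}f}, n={len(values)}"


if __name__ == "__main__":
    print(summarize("latency_ms", [1.0, 2.0, 3.0]))
    print(summarize("cpu_pct", [10, 20], precision=1))
```

A change to the output format, or a fix such as the empty-input check, now happens in one place rather than in every copy.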

Employ consistency checks

Consistency checks are automated validations that verify whether potential data or system states adhere to predefined logical rules and conventions, helping to avoid and account for unforeseen conflicts and contradictions in code. These automated tests are typically a standard, critical component of CI/CD pipelines (continuous integration/continuous deployment).

This is a quintessential example of the importance of code testability. It’s difficult or impossible to design unit tests that exhaustively validate every function if your code is too complex or contains too many tightly coupled dependencies.
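
One way to picture a consistency check is as a small, pure function that takes a record and reports every rule it violates. The sketch below is written under that assumption; the `Order` type, its fields and the three rules are hypothetical examples, not rules taken from any particular system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Order:
    order_id: str
    placed_on: date
    shipped_on: Optional[date]
    item_prices_cents: List[int]
    total_cents: int


def check_order(order: Order) -> List[str]:
    """Return a list of rule violations; an empty list means consistent."""
    problems = []
    if any(price < 0 for price in order.item_prices_cents):
        problems.append("negative item price")
    if sum(order.item_prices_cents) != order.total_cents:
        problems.append("total does not match item prices")
    if order.shipped_on is not None and order.shipped_on < order.placed_on:
        problems.append("shipped before it was placed")
    return problems


if __name__ == "__main__":
    good = Order("A1", date(2024, 1, 5), date(2024, 1, 7), [500, 250], 750)
    bad = Order("A2", date(2024, 1, 5), date(2024, 1, 3), [500, -100], 900)
    print(check_order(good))
    print(check_order(bad))
```

Because `check_order` has no dependencies beyond its input, it is straightforward to cover with unit tests and to run automatically in a pipeline, which is the testability point made above.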

Enforce version control

Version control systems help foster consistency, quality control, coordination and code review processes across teams. When using AI-driven coding frameworks—especially those that might autonomously adjust your codebase—ensure that you have a means to easily roll back any undesired or adverse changes. IBM Bob, for instance, automatically versions your workspace files as checkpoints to allow easy rollback of code changes as needed.

What defines bad code?

Broadly speaking, bad code is difficult to read and maintain, brittle to changes and new features, inefficient and unreliable. It often features unnecessary dependencies, wherein different modules are intertwined with one another and any change to one requires extra work to avoid breaking the other. It lacks proper documentation and is poorly organized, devoid of coherent, logical structure, a state of affairs often referred to as “spaghetti code.”

Bad code is often the product not (only) of poor coding skills, but of poor incentives and organizational structure: aggressively prioritizing feature launches at the expense of code quality typically yields faster time-to-market but greater future complications and technical debt.

It’s important to remember that bad code often works—at least temporarily. Technical debt wouldn’t pile up if this weren’t the case, because undeniably broken code would have to be addressed. Refactoring: Improving the Design of Existing Code, the seminal book by Martin Fowler and Kent Beck first published in 1999 (and updated many times since), therefore used the term code smells to describe bad code. They’re not usually bugs and don’t inherently prevent a program from functioning, but they indicate design weaknesses and code quality issues that might slow development or cause bugs in the future.

Fowler and Beck’s list of code smells to be wary of includes examples such as:

  • Long function (long method): A method containing too many lines of code.

  • Large class: A class trying to do too much and containing too many variables, lacking cohesion.

  • Primitive obsession: Using primitive data types instead of specialized small objects.

  • Mysterious name: Functions or variables named poorly, hiding their actual intent.

  • Data clumps: Groups of variables that frequently appear together everywhere.

  • Lazy elements: Classes or functions that do too little to exist.

  • Long parameter list: Functions requiring too many arguments to operate correctly.

  • Shotgun surgery: One change requires modifying many scattered modules simultaneously (which is essentially the opposite of divergent change, in which a single module must be modified for many unrelated reasons).

  • Duplicated code: Identical or very similar code structures in multiple places.

A comprehensive list of code smells, complete with explanations, examples and citations, can be found here. Their presence is usually a sign that refactoring is needed. A thorough, organization-wide understanding of these issues and the complications that arise from them is helpful to establishing a shared conception of quality standards across development teams.
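
To show what removing a smell can look like in practice, here is a small Python sketch that touches three items from the list above: a long parameter list, a data clump and primitive obsession. The shipping-label scenario and all names in it are hypothetical, and this is only one of several reasonable ways to refactor such code.

```python
from dataclasses import dataclass


def shipping_label_before(name, street, city, postal_code, country):
    # Smelly version: the same four address strings travel together
    # through every function that needs an address.
    return f"{name}\n{street}\n{postal_code} {city}\n{country}"


@dataclass(frozen=True)
class Address:
    # The clump of related strings becomes one small, named object.
    street: str
    city: str
    postal_code: str
    country: str

    def lines(self) -> str:
        return f"{self.street}\n{self.postal_code} {self.city}\n{self.country}"


def shipping_label_after(name: str, address: Address) -> str:
    # Two parameters instead of five; address formatting lives in one place.
    return f"{name}\n{address.lines()}"


if __name__ == "__main__":
    home = Address("1 Main St", "Springfield", "12345", "USA")
    print(shipping_label_after("Ada Example", home))
```

The behavior is unchanged, which is the defining property of a refactoring: both functions produce the same label for the same inputs.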

Measuring code quality

Measuring code quality should always entail both qualitative and quantitative evaluation. While objective metrics such as cyclomatic complexity can be useful, they can also be misleading without proper context.

For instance, your team might write an automated test suite and your code might achieve 100% code coverage across a battery of unit tests. But if the test suite lacks some of the meaningful assertions needed to genuinely validate that your code fully works as needed, the false confidence from that test coverage might do more harm than good.
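
The gap between coverage and validation can be shown in a few lines. In the hypothetical Python sketch below, `apply_discount` contains a deliberate bug. The first check executes every line of the function, so a coverage tool would count the function as fully covered, yet it asserts nothing. Only the second check notices the bug. In a real suite these would be test functions; they are named `check_...` here purely for illustration.

```python
def apply_discount(price: float, percent: float) -> float:
    # Deliberate bug for illustration: the discount is added, not subtracted.
    return price + price * percent / 100


def check_without_assertion() -> None:
    # Runs every line of apply_discount, but verifies nothing.
    apply_discount(100.0, 10.0)


def check_with_assertion() -> None:
    # Verifies the result, so the bug is caught.
    assert apply_discount(100.0, 10.0) == 90.0


if __name__ == "__main__":
    check_without_assertion()
    print("check without assertion: passed")
    try:
        check_with_assertion()
        print("check with assertion: passed")
    except AssertionError:
        print("check with assertion: failed, bug detected")
```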

Likewise, robust review structure should include both manual code review and AI code review. A modern agentic coding tool like IBM Bob can perform extensive static code analysis and refactoring in real time, but benefits greatly from custom rules and custom modes that convey the developer’s specific needs and intent. Humans are not exhaustive and AI is not infallible, but buttressing one with the other is the surest way to confirm that all potential issues have been investigated with proper context.

Always remember that code quality is context dependent. Imagine a programmer on your team has written an eloquent, efficient, flawlessly articulated algorithm or code block that neatly achieves its intended function. If the problem could have been effectively solved using a standard, built-in library function with which everyone is already familiar, that eloquent code is actually a quality issue because it adds unnecessary complexity and mental overhead.
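
A small Python sketch of that scenario follows. Both functions below find the most frequent word in a list, and the names and the task itself are hypothetical. The hand-rolled version is correct, but the second version leans on `collections.Counter` from the standard library, which most Python developers will recognize at a glance.

```python
from collections import Counter
from typing import List, Optional


def most_common_word_custom(words: List[str]) -> Optional[str]:
    # Hand-rolled counting: correct, but more code to read and maintain.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    best, best_count = None, 0
    for word, count in counts.items():
        if count > best_count:
            best, best_count = word, count
    return best


def most_common_word_builtin(words: List[str]) -> Optional[str]:
    # The same result using a familiar standard-library tool.
    if not words:
        return None
    return Counter(words).most_common(1)[0][0]


if __name__ == "__main__":
    sample = ["a", "b", "a", "c"]
    print(most_common_word_custom(sample), most_common_word_builtin(sample))
```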

Author

Dave Bergmann

Senior Staff Writer, AI Models

IBM Think

Footnotes

1. “Code red: the business impact of code quality – a quantitative study of 39 proprietary production codebases,” Proceedings of the International Conference on Technical Debt (accessed through the Association for Computing Machinery Digital Library), 16 August 2022

2. “Fostering Better Coding Practices for Data Scientists,” Harvard Data Science Review, 27 July 2023