AI can revolutionize application development by generating, optimizing, and translating code across the entire software development lifecycle. Adopting generative AI can lead to more consistent software, better use of developer creativity, and enhanced developer skills.
Applying generative AI to application development offers several major potential benefits:
Applying generative AI to review, refactor, and apply enterprise coding standards results in more consistent software regardless of which developer wrote the code: common approaches to recurring problems, a common code structure, and self-documenting code. This consistency makes the resulting application easier to troubleshoot and maintain, because maintainers spend less time deciphering the structure and idiosyncrasies of different sections of code.
As with other domains, generative AI has the potential to free application developers from low-value tasks such as writing simple, rote code or identifying the source of a troublesome bug. With more time to focus on higher-value tasks, developers can deliver shorter development cycles, more functionality per software release, and smaller, more frequent changes.
Finally, applying generative AI to application development can amplify developers' skills, enabling junior developers to perform at a senior or even expert level. Senior developers can build model training into their release cycles so that leading practices are captured as code is improved. Generative AI can then act as an expert mentor to junior staff, freeing senior developers to focus on other tasks and raising the skill level of the overall development team.
Generative AI can be applied to application development in several broad use cases. Many general models, such as Llama 2, are trained on application code written in multiple contemporary programming languages, and tuned models for code generation are also available.
The use cases we see benefiting from generative AI include:
- Code generation
- Code optimization and refactoring
- Applying enterprise coding standards
- Code conversion
- Code understanding
- API and library selection

Each of these use cases is described below.
The native text generation capability of Large Language Models (LLMs) can be used to generate new code from natural language prompts. For example, a developer can submit the prompt, "Write a SQL query to retrieve a customer's first and last name from the customer table," and receive a SQL query in return.
Using LLMs for code generation can significantly augment the application development skills of junior or even non-developers, but it can quickly reach a point of diminishing returns as the required outputs become more complex, or as the level of detail required in the prompts approaches the code that will be generated.
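The round trip described above can be sketched in a few lines. The SQL string below stands in for the model's response (in practice it would come back from an LLM API call); the point is that even trivially simple generated code should be executed against a test schema before it is trusted.

```python
import sqlite3

# The natural-language prompt a developer might submit to an LLM.
prompt = ("Write a SQL query to retrieve a customer's first and last name "
          "from the customer table")

# Placeholder for the model's output; a real system would obtain this
# from an LLM API call using the prompt above.
generated_sql = "SELECT first_name, last_name FROM customer;"

# Validate the generated query against a throwaway in-memory schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO customer VALUES ('Ada', 'Lovelace')")
rows = conn.execute(generated_sql).fetchall()
print(rows)  # [('Ada', 'Lovelace')]
```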
Code optimization and refactoring, the process of improving code to be more performant and better structured, can be thought of as a combination of two LLM capabilities: text generation and text summarization. Using a general or tuned LLM, a developer can prompt the model to optimize or restructure a piece of code, making it more performant and/or eliminating duplication.
LLMs alone work well for optimizing and refactoring smaller pieces of code that fit within the model's context window. Larger pieces of code, or complete software systems, require larger solutions that maintain metadata about the whole application to achieve acceptable results.
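The context-window constraint above is commonly handled by splitting a large source file into chunks that each fit the model's limit. A minimal sketch, under the simplifying assumption that word count approximates token count (real solutions use the model's own tokenizer):

```python
def chunk_source(source: str, max_tokens: int = 512) -> list[str]:
    """Split source into line-aligned chunks of at most max_tokens words."""
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for line in source.splitlines():
        line_tokens = max(len(line.split()), 1)
        # Flush the current chunk before it would exceed the budget.
        if count + line_tokens > max_tokens and current:
            chunks.append("\n".join(current))
            current, count = [], 0
        current.append(line)
        count += line_tokens
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk can then be sent to the model separately; reassembling the results coherently is exactly the "larger solution" problem the text describes, since per-chunk requests lose whole-application context.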
Similar to code optimization, LLMs can be used to apply and enforce enterprise coding standards around topics such as function and variable naming, code structure, and enterprise coding conventions. Typically applied at the repository level as part of the code review and commit process, LLMs tuned on the enterprise's coding standards can translate submitted code to comply with enterprise standards. These standards can also include hardening conventions that help the enterprise comply with regulatory standards.
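For simple, mechanical conventions, a deterministic lint-style rule can complement (or substitute for) a tuned model at commit time. A sketch, assuming a hypothetical enterprise convention that function names use snake_case:

```python
import re

# Assumed enterprise convention for illustration: snake_case function names.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")

def check_function_names(source: str) -> list[str]:
    """Return the names of Python functions violating the naming convention."""
    violations = []
    for match in re.finditer(r"def\s+(\w+)\s*\(", source):
        name = match.group(1)
        if not SNAKE_CASE.match(name):
            violations.append(name)
    return violations

sample = "def GetCustomer():\n    pass\n\ndef get_order():\n    pass\n"
print(check_function_names(sample))  # ['GetCustomer']
```

Rules like this catch surface-level violations cheaply; an LLM tuned on enterprise standards adds value for the conventions that cannot be expressed as patterns, such as preferred code structure.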
Because programming languages can be treated like any other language, the native text-translation ability of LLMs can convert software written in one programming language into another, for example from C# to Java.
As with code optimization, LLMs alone work well for converting small pieces of code that fit within the model's context window; larger solutions that maintain metadata and other important contextual information are necessary to convert larger pieces of code or complete software systems.
This capability is particularly valuable for legacy modernization, such as translating COBOL code to Java, as well as in multi-language environments or during system migrations, saving developers the time and effort of manually rewriting code.
Code understanding is the analogue of code generation. Instead of converting natural language prompts to code, code understanding takes a piece of code as input and generates a natural language explanation of the code's functionality. For example, a prompt like "Explain the function of this piece of Python code" followed by a section of Python can generate a line-by-line and overall summary of the code's purpose.
This capability can also be used to detect errors in code, also known as bug hunting, by prompting the model to "Identify why this piece of code is failing."
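The two prompting patterns above differ only in their instruction. A hypothetical helper (the wording is illustrative, not a recommended template) that wraps a code snippet for either use:

```python
# Hypothetical prompt builder for code understanding and bug hunting.
# The instruction strings are illustrative assumptions, not a standard API.
def build_prompt(code: str, mode: str = "explain") -> str:
    instructions = {
        "explain": "Explain the function of this piece of Python code:",
        "bughunt": "Identify why this piece of code is failing:",
    }
    return f"{instructions[mode]}\n{code}"

snippet = "total = sum(n for n in range(10) if n % 2)"
print(build_prompt(snippet, "bughunt"))
```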
API and Library Selection is the application of retrieval augmented generation (RAG) to enterprise API and software library management. A developer searching for an API to use with an application could compose a RAG prompt that queries an enterprise database of API names, descriptions, endpoints, etc., to answer questions like, "Do we have an API that does xyz?" To the extent that API and code library descriptions are maintained with high quality and keywords, such an application could be tuned to provide consistent responses that speed application development as well as developer onboarding.
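The RAG flow described above can be sketched end to end. The catalog entries are invented for illustration, and keyword overlap stands in for the vector-embedding search a production system would use; the shape of the flow (retrieve a matching description, then ground the prompt in it) is the point.

```python
# Toy enterprise API catalog; entries are illustrative assumptions.
API_CATALOG = [
    {"name": "customer-lookup",
     "description": "retrieve customer name and address by id"},
    {"name": "order-status",
     "description": "check shipping status of an order"},
]

def retrieve(question: str, catalog: list[dict]) -> dict:
    """Return the catalog entry whose description best matches the question.

    Plain keyword overlap stands in for embedding similarity here.
    """
    q_words = set(question.lower().split())
    return max(catalog,
               key=lambda api: len(q_words & set(api["description"].split())))

def build_rag_prompt(question: str) -> str:
    """Ground the model's answer in the retrieved catalog entry."""
    hit = retrieve(question, API_CATALOG)
    return (f"Context: API '{hit['name']}': {hit['description']}\n"
            f"Question: {question}\n"
            f"Answer using only the context above.")

question = "Do we have an API that can retrieve a customer address?"
print(retrieve(question, API_CATALOG)["name"])  # customer-lookup
```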
Architects must make a number of significant architecture decisions when designing application development solutions using LLMs.
Does the model offer indemnification or copyright protection, and how can you identify whether generated code is encumbered by licensing terms? Even models trained on permissively licensed code may produce output subject to clauses such as attribution to the original copyright holder.
Architects creating solutions to enforce and apply enterprise coding standards must weigh the effort necessary to tune an LLM to 'understand' those standards, and make an informed decision on whether other methods, such as linting tools, are better suited to achieve a similar capability.
Auto-complete-style code assistance must respond quickly so that it does not interrupt a developer's train of thought. Architects must consider the placement and connectivity of developer-assistance models to ensure they are helpful rather than intrusive.
Large language models are not guaranteed to produce functionally correct code, particularly when the generated or revised code must fit into a larger software system. There is no direct solution to this problem, although it is diminishing as LLMs evolve; architects must therefore ensure that LLM-generated code is subjected to the same quality assurance and security controls as code produced by human developers.
Generally available LLMs are typically trained on a small number of contemporary programming languages such as Python, JavaScript, and C#. Solution architects needing to support older or niche languages may find few suitable models available, or may be required to aggressively tune a general model to meet their specific needs.
The complete IBM Generative AI Architecture is available in IBM IT Architect Assistant (IIAA), an architecture development and management tool. Using IIAA, architects can elaborate and customize the architecture to create their own generative AI solutions.