A compiler is a type of computer program that converts code from one programming language (the source language) into another programming language (the target language).
Compilers are used to transform high-level source code into low-level target code (such as assembly language, object code or machine code) while preserving the program functionality.
A critical tool for modern, practical computer programming, compilers enable programmers to work in human-readable high-level code and then convert that source code into executable target code. Compilers also help software developers create efficient executable programs with improved security, stability and portability. They do so by flagging many errors before a program ever runs, by applying optimizations and by generating code for different target platforms from the same source.
Although most compilers convert high-level code into lower-level code, different types of compilers are used for different programming languages and applications. For example, a cross-compiler is used to produce code for a different type of CPU or operating system than the one on which it is running.
When the ideal compiler isn’t available, or hasn’t yet been built, a temporary bootstrap compiler can be used to compile a more permanent compiler that’s better optimized for the programming language in question.
A brief list of other related software includes:
In practice, using a compiler can be as simple as entering a command at the command line of a Linux (or equivalent) system, specifying the compiler executable and the source files to be compiled. This command instructs the system to process the source code, compiling it into target machine code in the form of object files, which are then linked to produce an executable program.
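As a rough illustration of the command-line workflow described above, the following Python sketch builds the two commands a typical GCC build might use and runs them through the standard `subprocess` module. The file names (`hello.c`, `hello`) and the helper names `build_commands` and `build` are hypothetical. The sketch assumes a `gcc` executable on the PATH; `-c` (compile without linking) and `-o` (name the output file) are standard GCC options.

```python
import shutil
import subprocess


def build_commands(source, executable, compiler="gcc"):
    """Return two command lines: compile to an object file, then link it."""
    object_file = source.rsplit(".", 1)[0] + ".o"
    return [
        [compiler, "-c", source, "-o", object_file],  # source code -> object file
        [compiler, object_file, "-o", executable],    # object file -> executable
    ]


def build(source, executable, compiler="gcc"):
    """Run the build if the compiler is installed; return True when it ran."""
    if shutil.which(compiler) is None:
        return False  # compiler not found on this machine
    for command in build_commands(source, executable, compiler):
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return True
```

Calling `build("hello.c", "hello")` would be roughly equivalent to typing `gcc -c hello.c -o hello.o` followed by `gcc hello.o -o hello` in a shell. In everyday use the two steps are often combined into a single command.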
Open-source compilers such as the GNU Compiler Collection (GCC), a robust compiler collection commonly used to compile C and C++ code into executable programs, and the alternative Clang are freely available, with their source code published in public repositories such as GitHub. Other compilers can be freely installed or purchased from a wide array of distributors. They can also be built into popular integrated development environments (IDEs), which bundle various utilities for software development, including text editors, API documentation and debugging tools.
Regardless of the specific compiler being employed, the process of compiling code involves passing the source code through various levels of analysis, optimization and ultimately code generation. Source code passes through the different analytical layers sequentially and is evaluated through each step in the process.
If the compiler recognizes any issues with the original source code, it might return an error message, prompting developers to address identified errors before proceeding with compiling the rest of the code. Generally, compilers proceed through the following steps:
Some compilers might not adhere strictly to the preceding structure. However, while some compilers might contain more or fewer steps, all phases of compilation can be ascribed to one of three stages: a front end, a middle end and a back end.
This three-stage structure enables compilers to take a modular approach. It allows combining multiple front ends for different languages with back ends for different CPUs, all while sharing the optimization capabilities of various applicable middle ends.
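The three-stage split can be sketched with a toy compiler for arithmetic expressions. This is a minimal illustration under stated assumptions, not how any production compiler is organized. The front end borrows Python's own parser through the standard `ast` module, the middle end performs a single optimization (constant folding), and the back end emits instructions for a hypothetical stack machine whose mnemonics (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`) are invented for this example. Only `+`, `-` and `*` over numbers and variable names are supported.

```python
import ast
import operator

_FOLD = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
_MNEMONIC = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def front_end(source):
    """Front end: turn source text into a syntax tree."""
    return ast.parse(source, mode="eval").body


def middle_end(node):
    """Middle end: fold operations whose operands are both constants."""
    if isinstance(node, ast.BinOp):
        left = middle_end(node.left)
        right = middle_end(node.right)
        fold = _FOLD.get(type(node.op))
        if fold and isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            return ast.Constant(value=fold(left.value, right.value))
        return ast.BinOp(left=left, op=node.op, right=right)
    return node


def back_end(node):
    """Back end: emit instructions for a made-up stack machine."""
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp) and type(node.op) in _MNEMONIC:
        return back_end(node.left) + back_end(node.right) + [_MNEMONIC[type(node.op)]]
    raise NotImplementedError(type(node).__name__)


def compile_expression(source):
    return back_end(middle_end(front_end(source)))


print(compile_expression("x * (2 + 3)"))  # ['LOAD x', 'PUSH 5', 'MUL']
```

Because each stage only consumes the output of the previous one, a different `back_end` (for another instruction set) or a different `front_end` (for another source syntax) could be swapped in without touching the other two stages. That independence is the modularity the three-stage structure is meant to provide.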
The three stages of a compiler entail the following distribution:
While compilers are not strictly necessary for producing workable code, the wide variety and complexity of both coding languages and machine environments make compilers a practical necessity for creating executable software. These are the four main benefits of using software compilers.
High-level programming languages use syntax and keywords that are closer to spoken languages, making them much easier for developers to use. Compilers convert this human-readable code into the more complex machine code needed to run optimized software applications.
Some examples of high-level languages include the following languages:
Compilers help improve efficiency by converting high-level code into executable machine code. The compiler's output is stored as an executable file (on Windows, typically with a .exe file extension), which the computer can then execute directly. With a compiler, translating a program into executable form becomes a one-time task.
Once completed, the compiled code can be executed as many times as necessary. Compiled programs generally run faster and more efficiently because the translation work is done ahead of time instead of being repeated each time the program runs.
Not all systems can run all types of programming code. Compilers are used to convert the types of code developers prefer to use into the types of code that systems require to operate. In this way, compilers improve program portability by converting software into a wide variety of compatible languages that can be easily stored, transferred and executed in various operating systems and hardware architectures.
During the compiling process, compilers can be used to identify and address software errors and flaws, resulting in more stable and better-optimized programs. Compilers can also help improve software security by detecting some memory-related errors, such as certain buffer overflows, and generating warnings if potential memory issues are detected.
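To make the idea of catching errors at compile time concrete, the sketch below uses Python's built-in `compile()` function, which translates source text without executing it. The helper name `check_source` and the sample snippets are hypothetical. Note the limits of the illustration: Python's compiler reports only syntax errors at this stage, whereas compilers for statically typed languages such as C also report type errors and can emit warnings about suspicious memory use.

```python
def check_source(source, filename="<example>"):
    """Compile without running; return an error description, or None if clean."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as error:
        return f"{filename}:{error.lineno}: {error.msg}"
    return None


# A well-formed program compiles cleanly, and nothing in it is executed.
print(check_source("raise RuntimeError('never runs')"))

# An unclosed parenthesis is reported before any code runs.
print(check_source("total = (1 + 2\n"))
```

The exact wording of the second message depends on the Python version, so it is not reproduced here.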
While compilers are used to convert source code into executable machine code, interpreters are another type of program that can provide similar functionality, but through a different mechanism.
Instead of converting the source code, interpreters either directly execute source code or use an intermediate code known as bytecode, a low-level, platform-independent representation of the source code. Bytecode serves as an intermediary between human-readable source code and machine code, designed for execution by a virtual machine (VM) instead of directly on a computer’s hardware.
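CPython, the reference implementation of Python, is a convenient place to see bytecode in action: the built-in `compile()` function turns source text into a code object holding bytecode, the standard `dis` module lists the instructions, and `exec()` hands the code object to the virtual machine. The snippet and the variable names below are made up for illustration, and instruction names vary between CPython versions, so none are assumed here.

```python
import dis

source = "result = (a + b) * 2"

# Source text -> code object containing platform-independent bytecode.
code_object = compile(source, "<example>", "exec")

# The raw bytecode is a sequence of bytes; dis decodes it into named instructions.
raw_bytes = code_object.co_code
instruction_names = [instr.opname for instr in dis.get_instructions(code_object)]

# The virtual machine executes the bytecode against a namespace of variables.
namespace = {"a": 3, "b": 4}
exec(code_object, namespace)
print(namespace["result"])  # 14
```

Calling `dis.dis(code_object)` prints the full human-readable listing for whichever interpreter version is in use.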
Theoretically, any programming language can be executed with either a compiler or an interpreter. However, individual programming languages tend to be better suited to either compilation or interpretation.
In practice, the distinction between compiler languages and interpreter languages can sometimes blur—just as the distinction between compilers and interpreters themselves—as both types of programs can feature overlapping functionalities. While some languages are more commonly compiled and some more commonly interpreted, it is possible to write a compiler for a language that is commonly interpreted and vice versa.
High-level languages are typically created with a type of conversion—either compilation or interpretation—in mind, but these are more suggestions than hard limitations. For example, BASIC is often referred to as an interpreted language and C a compiled language, but there exist compilers for BASIC just as there are C interpreters.
The primary difference between interpreters and compilers lies in timing and optimization. Both types of programs attempt to convert source code into target code that is first functional and then optimized.
Depending on the operating environment, compiled or interpreted code might be better suited to efficiently run with considerations made for hardware capability, memory and storage capacity. Depending on the constraints of any specific program, application and hardware, either compilation, interpretation or a combination of both might yield the best results.
As such, interpretation cannot stand in for compilation entirely, but it can move compilation duties to the background through a gradual conversion process. Compilers employ an ahead-of-time (AOT) conversion strategy that converts source code into target code entirely before creating an executable file.
Interpreters, alternatively, either run code directly as an application requires it or use bytecode as an intermediary that is executed by a virtual machine. In this way, interpreters might provide some speedups or flexibility, but at some point, a set of directly executed machine instructions must be provided toward the end of the execution stack.
In some instances, when lightweight efficiency is a priority, special interpreters can be preferable to compilers for their ability to perform just-in-time (JIT) conversion. JIT is a strategy that compiles pieces of source code or bytecode into target code held in a memory buffer for immediate execution. JIT compilation translates code on demand, combining the one-time compilation efficiency of a traditional compiler with the flexibility to repeatedly execute code, often faster than standard bytecode interpreters.
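The on-demand idea behind JIT can be sketched as a toy two-tier evaluator: an expression is interpreted by walking its syntax tree until it has been run often enough to count as "hot", at which point it is compiled once and the compiled form is reused. Everything here is an assumption made for illustration: the class name `TinyJit`, the `hot_threshold` parameter and the restriction to `+`, `-` and `*`. Most importantly, Python bytecode stands in for the machine code a real JIT would generate.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def interpret(node, env):
    """Slow path: walk the syntax tree on every call."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        apply = _OPS[type(node.op)]
        return apply(interpret(node.left, env), interpret(node.right, env))
    raise NotImplementedError(type(node).__name__)


class TinyJit:
    """Interpret an expression until it is hot, then compile once and reuse."""

    def __init__(self, source, hot_threshold=3):
        self.tree = ast.parse(source, mode="eval")
        self.hot_threshold = hot_threshold  # hypothetical tuning knob
        self.calls = 0
        self.compiled = None  # filled in on demand

    def run(self, **env):
        self.calls += 1
        if self.compiled is None and self.calls >= self.hot_threshold:
            # On-demand compilation; bytecode stands in for machine code here.
            self.compiled = compile(self.tree, "<jit>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {"__builtins__": {}}, env)
        return interpret(self.tree.body, env)


jit = TinyJit("x * 2 + 1")
print([jit.run(x=i) for i in range(5)])  # [1, 3, 5, 7, 9]
```

The first two calls take the interpreted path and the third triggers compilation. Code that runs only once or twice never pays the compilation cost, which is the trade-off JIT strategies are designed around.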
However, as modern trends toward JIT compilation increase along with situationally dependent bytecode interpretation, many compilers are being designed to offer both compilation and interpretation features. This overlap further blurs the lines between these two categories.