Make your software behave: Learning the basics of buffer overflows
Get reacquainted with the single biggest threat to software security

Buffer overflows have been causing serious security problems for decades. In the most famous example, the Internet worm of 1988 used a buffer overflow in fingerd to exploit tens of thousands of machines on the Internet and cause massive headaches for server administrators around the country; see Resources later in this column. But the buffer overflow problem is far from ancient history. Buffer overflows accounted for over 50 percent of all major security bugs leading to CERT/CC advisories last year. (The CERT/Coordination Center is part of the Software Engineering Institute in Pittsburgh; see Resources.) And the data show that the problem is growing instead of shrinking; see "Buffer overflow: Dejavu all over again".

Clearly, you would think by now that buffer overflow errors would be obsolete. So why are buffer overflow vulnerabilities still being produced? Because the recipe for disaster is surprisingly simple. Take one part bad language design (usually in C and C++), mix in two parts poor programmer practice, and you have a recipe for big problems.

Buffer overflows can happen in languages other than C and C++, though without some incredibly unusual programming, modern "safe" languages like Java are immune to the problem. In any case, legitimate reasons often justify the use of languages like C and C++, and so learning their pitfalls is important.

The root cause of buffer overflow problems is that C (and its red-headed stepchild, C++) is inherently unsafe. There are no bounds checks on array and pointer references, meaning a developer has to check the bounds (an activity that is often ignored) or risk encountering problems. A number of unsafe string operations also exist in the standard C library, including:
For these reasons, it is imperative that C and C++ programmers who are writing security-critical code educate themselves about the buffer overflow problem. The best defense is a good education on the issues. This is why our next four columns will deal with buffer overflow. This column gives an overview of the buffer overflow problem. The next column covers defensive programming techniques (in C), and explains why certain system calls are problematic and what to do instead. In the final two columns in this series, we'll examine the engine's workings and explain how a buffer overflow attack does its dirty work on particular architectures.

What is a buffer overflow?

When writing to buffers, C programmers must take care not to store more data in the buffer than it can hold. Just as a glass can only hold so much water, a buffer can only hold so many bits. If you put too much water in a glass, the extra water has to go somewhere. Similarly, if you try to put more data in a buffer than fits, the extra data have to go somewhere, and you might not always like where they go!

When a program writes past the bounds of a buffer, this is called a buffer overflow. When this happens, the next contiguous chunk of memory is overwritten. Since the C language is inherently unsafe, it allows programs to overflow buffers at will (or, more accurately, completely by accident). There are no run-time checks that prevent writing past the end of a buffer, so a programmer has to perform the check in his or her own code, or run into problems down the road.

Reading or writing past the end of a buffer can cause a number of diverse (and often unanticipated) behaviors: 1) programs can act in strange ways, 2) programs can fail completely, or 3) programs can proceed without any noticeable difference in execution. The side effects of overrunning a buffer depend on:
The indeterminate behavior of programs that have overrun a buffer makes them particularly tricky to debug. In the worst cases, a program may be overflowing a buffer and not showing any adverse side effects at all. As a result, buffer overflow problems are often invisible during standard testing. The important thing to realize about buffer overflows is that any data that happen to be allocated near the buffer can potentially be modified when the overflow occurs.

Why are buffer overflows a security problem?

In the simplest case, consider a Boolean flag allocated in memory directly after a buffer. Say that the flag determines whether or not the user running the program can access private files. If a malicious user can overwrite the buffer, then the value of the flag can be changed, thus providing the attacker with illegal access to private files.

Another way in which buffer overflows cause security problems is through stack-smashing attacks. Stack-smashing attacks target a specific programming fault: careless use of data buffers allocated on the program's run-time stack, namely local variables and function arguments. The results of a successful stack-smashing attack can be far more serious than just flipping a Boolean access control flag as in the previous example. A creative attacker can take advantage of a buffer overflow vulnerability through stack-smashing and then run arbitrary code (anything at all). The idea is pretty straightforward: Insert some attack code (for example, code that invokes a shell) somewhere and overwrite the stack in such a way that control gets passed to the attack code. (We'll go into the details of stack smashing in our third and fourth columns on buffer overflows.)

Commonly, attackers exploit buffer overflows to get an interactive session (shell) on the machine. If the program being exploited runs with a high privilege level (such as root or administrator), then the attacker gets that privilege in the interactive session.
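
The Boolean-flag scenario described above can be sketched in C. This is only an illustration under stated assumptions: the struct `session`, its fields, and all three helper functions are hypothetical names, not code from this column. Because a genuine out-of-bounds write is undefined behavior in C, the sketch keeps the buffer and the flag in one struct and copies into the struct as a whole, which models the effect of an overflow without ever leaving valid memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a small buffer with an access flag stored after it. */
struct session {
    char name[8];           /* fixed-size buffer filled from user input */
    int  can_read_private;  /* nonzero means access to private files */
};

/*
 * Models an unchecked copy. The length is clamped to the size of the struct
 * so the demonstration stays well defined, but nothing stops the bytes from
 * running past name[] and landing on the flag that follows it.
 */
void unchecked_copy(struct session *s, const char *input, size_t len)
{
    assert(s != NULL && input != NULL);
    if (len > sizeof *s)
        len = sizeof *s;
    memcpy(s, input, len);
}

/* Checked copy: refuses input that does not fit in name[] with its terminator. */
int checked_copy(struct session *s, const char *input, size_t len)
{
    assert(s != NULL && input != NULL);
    if (len >= sizeof s->name)
        return -1;          /* too long: reject instead of overflowing */
    memcpy(s->name, input, len);
    s->name[len] = '\0';
    return 0;
}

/* Returns 1 if over-long input changed the flag through the unchecked copy. */
int flag_flipped_by_long_input(void)
{
    struct session s = { "", 0 };
    char input[sizeof s];

    memset(input, 'A', sizeof input);   /* attacker-style input: all 'A' bytes */
    unchecked_copy(&s, input, sizeof input);
    return s.can_read_private != 0;
}
```

On typical platforms `flag_flipped_by_long_input()` returns 1, because the bytes that do not fit in `name` overwrite `can_read_private`. The same input passed to `checked_copy` is rejected and the flag is left untouched, which is the kind of explicit bounds check that C leaves to the programmer.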
The most spectacular buffer overflows are stack smashes that result in a superuser, or root, shell. Many exploit scripts that can be found on the Net (see Resources) carry out stack-smashing attacks on particular architectures.

Buffer overflow: Dejavu all over again

Chart: Number of vulnerabilities resulting in CERT/CC advisories for the last eleven years

The chart above displays the number of vulnerabilities that can be directly attributed to buffer overflows. As the data show, the problem is not getting any better. In fact, buffer overflows are becoming more common.

Heap overflows versus stack overflows

Let's dig deeper into why some kinds of buffer overflows have big security implications. A number of interesting UNIX applications need special privileges to accomplish their jobs. They may need to write to a privileged location like a mail queue directory, or open a privileged network socket. Such programs are generally suid (set uid) root, meaning that the system extends special privileges to the application upon request, even if a regular old user runs the program. In security, anytime privilege is being granted (even temporarily) there is potential for privilege escalation to occur. Successful buffer overflow attacks can thus be said to be carrying out the ultimate in privilege escalation. Many well-used UNIX applications, including lpr, xterm and eject, have been abused into giving up root through exploitation of buffer overflows in suid regions of the code.

A common cracking technique is to find a buffer overflow in an suid root program, and then exploit the buffer overflow to snag an interactive shell. If the exploit is run while the program is running as root, then the attacker will get a root shell. With a root shell, the attacker could do pretty much anything, including viewing private data, deleting files, setting up a monitoring station, installing back doors (with a root kit), editing logs to hide tracks, masquerading as someone else, breaking stuff accidentally, and so on. Very bad.

Meanwhile, many people believe that if their program is not running suid root, they don't have to worry about security problems in their code, since the program can't be leveraged to achieve greater access levels. That idea has some merit, but is still a risky proposition.
For one thing, you never know who is going to take your program and set the suid bit on the binary. When people can't get something to work properly, they get desperate. We've seen this sort of situation lead to entire directories of programs needlessly set suid root. Once again, very bad.

There can also be users of your software with no privileges at all. That means any successful buffer overflow attack will give them more privileges than they previously had. Usually, such attacks involve the network. For example, a buffer overflow in a network server program that can be tickled by outside users may provide an attacker with a login on the machine. The resulting session has the privileges of the process running the compromised network service. This type of attack happens all the time. Often, such services run as root (and generally for no good reason other than to make use of a privileged low port). Even when such services don't run as root, as soon as a cracker gets an interactive shell on a machine, it is usually only a matter of time before the machine is "owned" -- that is, the attacker gains complete control over the machine, such as root access on a UNIX box or administrator access on a Windows NT box. Such control is typically garnered by running a different exploit through the interactive shell to escalate privileges.

The extent of the real-world problem

For example, Sendmail, an email server and one of the most widely used programs on the Net, is notorious for having security problems. Many people use this software, and many have scrutinized it carefully, but serious security problems (including buffer overflows) continue to be unearthed. In September of 1996, after several new exploitable buffer overflows were found and fixed in Sendmail, an extensive manual security audit was performed. But not four months later, yet another exploitable buffer overflow was discovered that the manual audit missed.
To fast-forward a bit, David Wagner, a graduate student at the University of California, Berkeley, found a handful of new buffer overflows in Sendmail as recently as this year (see Resources). However, the overflows he found do not seem to be exploitable. In other words, it appears that none of the buffers could be overflowed arbitrarily by carefully crafted attacker input (something that is generally a requirement). Nonetheless, at least one of the overflows Wagner found is known to have survived the manual audit from 1996. And just because a bug "doesn't seem to be exploitable" by trained security experts doesn't mean that it isn't exploitable.

In 1998, researchers at Reliable Software Technologies (not the authors) found three buffer overflow conditions in the Washington University ftp daemon (wu-ftpd) version 2.4 using a fault injection tool called FIST. (We'll revisit FIST later in the series when we talk about dynamic analysis.) This in itself was interesting, because wu-ftpd had recently been heavily scrutinized after four CERT Advisories on wu-ftpd were issued between 1993 and 1995. Experts at Reliable Software Technologies examined the potentially vulnerable code and were unable to find a way to exploit any of the conditions. They declared the likelihood was very low that user input could be successfully manipulated to reach a vulnerable buffer in such a way as to cause a security violation. Then about a year later, a new CERT advisory on wu-ftpd appeared. One of the three overflows they had detected turned out to be a vulnerability after all!

These anecdotes show how difficult manual analysis can be. By its very nature, code tends to be complex. Analyzing large programs like Sendmail (approximately 50,000 lines of code) is no easy task. Problems often slip by unwary developers, but problems slip by the experts, too.

Windows versus UNIX: Is security by obscurity good?

Some people believe that it's harder to find buffer overflows in Windows programs than in UNIX programs. There is some truth to this, because Windows programs tend to ship without source code, and UNIX programs tend to ship with source code. It's a lot easier to find potential buffer overflows (or most security flaws, for that matter) when you have the source code for a program. When the source code is available, an attacker can look for all suspect function calls, and then try to determine which ones might end up being vulnerabilities. Looking at the program, it is also easier for an attacker to figure out how to cause a buffer overflow with real inputs. It takes a lot more skill (and probably a measure of luck) to do these things without source code.

The security you get for free when you don't release your source code is commonly known as "security by obscurity." It is a bad idea to rely on this kind of security, however, because skilled attackers are still going to be able to break your code if it wasn't designed properly in the first place. A much better solution is to create something solid and let others see what you've done so they can help you keep it secure.

Many Windows developers rely on the security by obscurity model, whether they know it or not. Often, they don't know it, mainly because they never really give a thought to security when developing their applications. In such a case, they are implicitly relying on a flawed approach.

Security experts generally believe that Windows code has more exploitable buffer overflow problems than UNIX code. This belief can be attributed to the "many eyeballs" phenomenon, which holds that open source software tends to be better scrutinized, and therefore is more likely to be secure. Of course, there are caveats. First, having people look at your source code doesn't guarantee that bugs will be found. Second, plenty of open source software exists for Windows, and at least as much closed source software exists for UNIX machines. For these reasons, the phenomenon really isn't architecture dependent.

Conclusion