How vulnerable are you?

A crash course in software security

Level: Introductory

Bob Breznak, Candidate for Computer Science Degree, Worcester Polytechnic Institute

15 May 2008

From The Rational Edge: Read this primer on security vulnerabilities that plague any software system connected to the Internet.

Software security remains a hot topic. Everyone from grandmothers to Fortune 500 companies has heard the stories of identity theft, data loss, and general mayhem caused by viruses and attackers on the Internet. In the first quarter of 2008 alone, 1,474 different software vulnerabilities were reported with only 64 of them having posted solutions. 1 That's a resolution rate of about 4%. With all the buzz about software and system security, the computer world may seem to be in total chaos, leading many to ask, "How vulnerable am I?"

In this article I present some of the results of a recent security project at Worcester Polytechnic Institute (WPI), along with additional research. My intention is to demonstrate what software security is by demystifying common terminology and providing realistic examples of typical security exploits.

This article is not intended to provide a comprehensive computer security education, but rather to serve as an introduction to some of the key topics in the vast and expanding field of information security. If you would like to engage in further discussion, please feel free to contact me at rbreznak@wpi.edu.

Background

My early interest in system security is what sparked my interest in computer science, and it has been a focus of mine ever since. Needless to say, I was a bit disappointed by the lack of undergraduate software or system security courses offered at WPI. Without any formal courses dedicated to security, two other computer science students and I decided to pursue an independent study project to learn more about software security, and earlier this year we talked to Prof. Kathryn Fisler, a WPI computer science professor, about undertaking it. We planned to develop a secure software engineering course that will first be offered during the 2008-09 academic year. CS4400x, as it is currently known, will be the first undergraduate computer science security course offered at WPI.

While developing this course, we discovered several facts about software vulnerability and prevention that we were never exposed to in our standard course work. We found that the majority of vulnerabilities are usually caused by a small logic error or a case that the software developer hadn't uncovered or accounted for. Some vulnerabilities are not overly complex, and are thus easy for hackers to exploit. They can be found using only a few simple resources, like a Web browser and a text editor. In other cases where the software development team has not made a mistake, the security breach can be caused by the end user improperly configuring or using the software.

We concluded that the biggest security flaws stem from developers assuming incorrectly that security will be handled elsewhere in a system -- for example, they assume that incoming data can be trusted. Lesson: We have to realize that security is everyone's responsibility, not the other person's.



Vulnerability 101

What is a vulnerability? "Exploits," "attacks," "vulnerabilities," and other terms are commonly used to describe what proper software security aims to correct: flaws in a system that allow an attacker to gain privileged access to information or to damage a system. The Mitre Corporation defines a vulnerability as "a state in a system or network that allows an attacker to execute commands as another user, access data that they shouldn't, pose as someone else, and/or conduct a denial of service." 2 Using this definition, being vulnerable means existing in a state where an attacker, be it a person or a malicious program like a virus or spyware, has access to a greater portion of a system than they should.

Vulnerabilities occur across a wide spectrum, from the obvious -- like using a weak password or storing unprotected private data -- to the more nuanced -- like unchecked input.

Overflow attacks

Some of the earliest attacks, which are still prevalent today, rely on a developer's assumption that data entered by the end user can be trusted. Most programmers do not expect to get 40,000 lines of text in a username field or to have obscure characters that aren't even on the keyboard entered in a password field, so the entered data is never validated. This assumption gives rise to the overflow family of attacks. For instance, using a text editor and some knowledge of the Microsoft PowerPoint file format, 3 one can manually edit a PowerPoint file. Editing a PowerPoint file 4 to have more data in an internal field than the format allows causes Microsoft PowerPoint XP to crash, then execute any program that the attacker wants. In one familiar example of this vulnerability, the built-in Windows calculator program is executed; however, the executed program could just as easily have been something of a more malicious nature. This is just one example of the innumerable exploits of this type.

Essentially, what happens in an overflow style of attack is that too much data is pushed into what the original programmer thought was enough space. The extra data spills over into memory near the intended storage area and overwrites data that may have nothing to do with that area's original purpose. As the rest of the program is executed, it uses the newly overwritten data. If the attacker is able to fill in just enough space with dummy data (e.g., NOPs) and then add a bit of malicious code or a value, the program will execute the malicious code or use that new value. This could lead to a number of different results. The attacker may be able to override a login 5 and obtain administrator privileges for a program. If the attacked program was started by a system administrator, the malicious code will then run as a part of the original program, giving the attacker administrator privileges on the system. Overflow vulnerabilities, although not always apparent, can in some cases be easily remedied by using a "safe" library 6 when developing an application, using stack protection 7 (e.g., StackGuard 8 ), or running checks on incoming data to be sure it's the proper size and type. Exploits in this family all work in a similar manner, but vary in the type of memory affected and the intended effect.
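The spill-over described above can be sketched with a toy model. Python itself does not overflow buffers this way, so the following only simulates a fixed block of memory; the layout (an 8-byte name buffer followed by a one-byte administrator flag) and every name in it are assumptions made up for illustration.

```python
# Toy model of an overflow: a fixed "memory" region where an 8-byte name
# buffer sits directly in front of a 1-byte is_admin flag.
# The layout and names are hypothetical, chosen only for illustration.

NAME_SIZE = 8            # bytes the programmer reserved for the name
FLAG_OFFSET = NAME_SIZE  # the flag lives right after the buffer


def new_memory() -> bytearray:
    """Return zeroed memory: 8 bytes of name buffer plus 1 flag byte."""
    return bytearray(NAME_SIZE + 1)


def unsafe_copy(memory: bytearray, data: bytes) -> None:
    """Copy input without checking it against the size of the name buffer."""
    for i, byte in enumerate(data):
        if i < len(memory):   # stays inside the toy memory as a whole...
            memory[i] = byte  # ...but not inside the name buffer


def safe_copy(memory: bytearray, data: bytes) -> None:
    """Reject input that does not fit the space reserved for it."""
    if len(data) > NAME_SIZE:
        raise ValueError("input longer than the name buffer")
    memory[:len(data)] = data


def is_admin(memory: bytearray) -> bool:
    """Read the flag byte that sits just past the name buffer."""
    return memory[FLAG_OFFSET] != 0
```

Calling unsafe_copy with eight filler bytes followed by a single non-zero byte sets the flag even though no code ever granted administrator rights, while safe_copy refuses the same input. This mirrors the "check the size of incoming data" remedy mentioned above, not any particular safe library.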

Buffer overflow attacks

In a buffer overflow attack, an internal value in a program is overflowed to alter how the program runs. 9 During normal operation, when a function is called, its arguments and a pointer to the return location are placed onto the stack. After the function has completed, the return pointer is used to go back to the original location, and the program continues. A buffer overflow can change this process and allow an attacker to execute any function they wish. The attacker enters just enough dummy data to fill the space up to the return pointer, followed by a new return pointer that refers to a different function; when the current function returns, that function is executed instead. 10
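The return-pointer overwrite can also be simulated. This is a minimal sketch, not a real exploit: the "stack frame" is a Python list, the "return pointer" is a key into a small table of functions, and the slot layout and function names are all hypothetical.

```python
# Toy stack frame. Hypothetical layout:
#   slots 0..3  local buffer of the called function
#   slot  4     return pointer (a key into FUNCTIONS below)

BUFFER_SLOTS = 4
RETURN_SLOT = 4


def resume_caller() -> str:
    return "back in the caller"


def attacker_choice() -> str:
    return "attacker-chosen function ran"


FUNCTIONS = {0: resume_caller, 1: attacker_choice}


def call_with_input(values: list, check_length: bool) -> str:
    """Copy values into the local buffer, then 'return' via the pointer slot."""
    frame = [0] * BUFFER_SLOTS + [0]  # return pointer 0 means resume_caller
    if check_length and len(values) > BUFFER_SLOTS:
        raise ValueError("input does not fit the local buffer")
    for i, value in enumerate(values[:len(frame)]):
        frame[i] = value              # may run past the buffer into slot 4
    return FUNCTIONS[frame[RETURN_SLOT]]()
```

With three values the call returns to the caller as intended. With four filler values followed by a 1, the fifth value lands in the return slot and the other function runs. Passing check_length=True rejects the oversized input before anything is copied.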

SQL injection

In addition to the overflow exploits, SQL injection is another type of attack that relies on a developer's failure to check incoming data. Most people have passwords that are alphanumeric or, in the case of a security-conscious individual, alphanumeric with the addition of other keyboard symbols. With this in mind, a developer may want to let any character be fair game for a password. This is usually fine, unless they forget to sanitize, or check, the incoming data. This occurs more often than it should. A password system that uses an SQL database (a very common scenario on many Websites) may run a query along the lines of:

SELECT * FROM users WHERE username = '$USER' AND password = '$PASS';

where $USER and $PASS are replaced with the username and password provided by the user. So if a user enters bob and 1234, the resulting query would look like:

SELECT * FROM users WHERE username = 'bob' AND password = '1234';

and the database would return any rows that have both a username of bob and a password of 1234. Now if an attacker enters admin as the username and hi' OR 1=1-- as the password, the query looks like:

SELECT * FROM users WHERE username = 'admin' AND password = 'hi' OR 1=1--';

Notice how the quote that the attacker entered closes the string opened by the third quote in the original query. The password check no longer matters: the condition now reads username = 'admin' AND password = 'hi' OR 1=1, and since AND is evaluated before OR and 1 always equals 1, the condition is true for every row, so every row is a candidate for return. The --, the SQL comment marker, cancels the closing quote originally in the query and would also cancel any additional checks that follow it, so if there were an additional credential (e.g., a keyfob 11 or captcha 12 ) checked in the query, it would be ignored. If the application treats a returned row as a successful login, the attacker can now enter the system as the administrator without giving a legitimate password. By using more and more complex queries, an attacker can change, add, or look up data. 13 This gives the attacker the same privileges on the database as the application.
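The attack above can be reproduced against a throwaway database. The sketch below uses Python's built-in sqlite3 module with an in-memory table; the table contents, the function names, and the choice of SQLite are assumptions for illustration, and the article's original system is not specified.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory users table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("admin", "s3cret"), ("bob", "1234")],
    )
    return conn


def vulnerable_login(conn: sqlite3.Connection, user: str, password: str) -> list:
    """UNSAFE: pastes user input directly into the SQL text."""
    query = (
        "SELECT username FROM users WHERE username = '" + user
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchall()
```

A legitimate login such as vulnerable_login(conn, "bob", "1234") returns only bob's row, and a wrong password for admin returns nothing. Supplying hi' OR 1=1-- as the password makes the same function return every row in the table.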

This type of vulnerability has proven to be one of the most effective types of attacks on Web applications, and as the reliance on Web applications grows, the power of this exploit will become even more daunting. The fortunate news is that, like the overflow type of attacks, this vulnerability can in large part be prevented by sanitizing incoming data and never trusting user input without checking it first.
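One common way to keep user input from being read as SQL is a parameterized query, a technique the text above does not name explicitly. The following is a minimal sketch using Python's sqlite3 module; the schema, data, and function names are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory users table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    # Plaintext passwords keep the sketch short; do not store them this way.
    conn.execute("INSERT INTO users VALUES (?, ?)", ("admin", "s3cret"))
    return conn


def safe_login(conn: sqlite3.Connection, user: str, password: str) -> bool:
    """Check credentials with placeholders instead of string concatenation."""
    # The ? placeholders pass the input as data; it is never parsed as SQL.
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? AND password = ?",
        (user, password),
    ).fetchone()
    return row is not None
```

With this version, the injection string is compared literally against the stored password, so it fails like any other wrong password.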



Hash hacking

Anyone who's dealt with storing user credentials in a database will tell you that one of the first rules is to never store a password or other private data directly without encrypting it first. Using an encryption scheme helps prevent exposing a user's password if the database is ever compromised. For added benefit, using a one-way algorithm, or hash, like MD5 14 or a Blowfish-based password hash, 15 makes recovering the password by decryption impractical. This is because hashing converts an input value into a new value that cannot feasibly be reversed to produce the original. Traditionally, the way to attack passwords stored in this manner involves trying as many different potential passwords as possible until one finally works, a technique known as "brute forcing." Although quite simple to initiate, this type of attack usually proves fruitless because of the sheer amount of time needed to try enough combinations to be statistically successful, sometimes well into the range of thousands of years, or so it is generally assumed.
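Storing and checking a hashed password can be sketched in a few lines. MD5 is used here only because the text names it; the in-memory store, the sample user, and the function names are hypothetical.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the MD5 digest of a password as 32 hex characters."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical credential store: only the hash is kept, never the password.
stored_hashes = {"bob": md5_hex("1234")}


def check_password(user: str, attempt: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    expected = stored_hashes.get(user)
    return expected is not None and md5_hex(attempt) == expected


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords a brute-force search may have to try."""
    return alphabet_size ** length
```

The login check never needs the original password, only its hash. The keyspace helper hints at why brute forcing is slow: keyspace(94, 8), for eight characters drawn from the 94 printable ASCII symbols, is roughly 6 x 10^15 candidates.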

You probably use a hash hacking tool every day and don't even know it. Google is really good at what it does: finding links between information. For example, Googling "Bob Breznak" will turn up results all about me: the last book review I wrote for The Rational Edge, personal sites that I started (and abandoned), etc. Now, if you abstract what is going on here a bit, you'll end up with the idea of a big table that quickly returns results. Apply the idea to looking up a hash. For example, take the MD5 hash of "foobar": 3858f62230ac3c915f300c664312c63f. Now put that into Google and... voila! In 0.21 seconds, the first result is "Google Hash: md5(foobar) = 3858f62230ac3c915f300c664312c63f". Most of the results you'll find are hash-indexing sites -- sites intentionally built to connect hash values and their corresponding keys. Trying more complex strings (e.g., "bobby," "crayon," "rational") will yield mixed results. If you were to try a password (or better yet a pass-phrase) that had any complexity to it, odds are that you'd get nothing back.
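The "big table" idea is easy to sketch: hash a list of candidate strings once, then answer any later lookup with a single dictionary access. The word list and function names below are hypothetical; a real indexing site would hold vastly more entries.

```python
import hashlib


def build_table(words: list) -> dict:
    """Map each word's MD5 hex digest back to the word that produced it."""
    return {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}


# Hypothetical word list, precomputed once and reused for every lookup.
table = build_table(["foobar", "bobby", "crayon", "rational"])


def lookup(hash_hex: str):
    """Return the known input for a hash, or None if it is not indexed."""
    return table.get(hash_hex)
```

Looking up the hash quoted above for "foobar" should return foobar immediately, while the hash of a string that was never indexed returns None, which matches the mixed results described for more complex passwords.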

This might not seem to be the biggest security hole you need to worry about, but what gives this exploit its teeth is the fact that it's not well known and there isn't an easy patch for it. As more and more complex keys are matched to hashes, the likelihood of coming up with a match for any given password increases. Changing to a more complex algorithm, say from MD5 to SHA-2, 16 does little to restore password integrity. Although completely different hashes are returned for the same string, it is just a matter of time before keys are linked to the new hashes as well. And although cryptography has almost always relied on the time required for a brute-force attack as a safety net, searching reduces the time required to successfully compromise the hash from millennia to years or possibly even months.

Now, two major arguments pull hash lookup off the list of contenders for the biggest security flaw of the year. First, this type of attack is only possible if the database or password store is already compromised. The database, shadow file (where UNIX or Linux systems keep their user passwords), or I/O stream needs to be read. In some cases, this may be as easy as going to the right URL or executing an SQL injection, while in other cases it may require much more effort on the part of an attacker. The other argument mitigating the potential severity of this attack is derived from the use of hashes. Since multiple strings can generate the same hash, the original string (the user's real password) cannot necessarily be determined. However, finding any string that hashes to the same value is enough to reproduce that hash value. This means that although an attacker may not get the actual password, since the system uses the stored hash and the attacker has a string that produces an equivalent hash, the system responds just as though the attacker had the original password.
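The point that any string with a matching hash works as well as the real password can be demonstrated with a deliberately weak toy hash. The sketch below keeps only the first four hex digits of MD5 (16 bits) so that a second matching input turns up quickly; this truncation, the sample password, and the candidate naming scheme are assumptions for illustration, and the search says nothing about how hard collisions are for a full-length hash.

```python
import hashlib
from itertools import count


def tiny_hash(text: str) -> str:
    """Toy 16-bit hash: the first 4 hex digits of MD5. Not a real scheme."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:4]


def find_other_input(target_hash: str, real_password: str) -> str:
    """Try candidate strings until one other than the real password matches."""
    for n in count():
        candidate = f"guess{n}"
        if candidate != real_password and tiny_hash(candidate) == target_hash:
            return candidate


stored_hash = tiny_hash("correct horse")          # what the system keeps
impostor = find_other_input(stored_hash, "correct horse")
```

The impostor string is not the user's password, yet a system that compares only tiny_hash values would accept it.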



End-user issues

Even with the most thoroughly tested and secured software, once it's packaged and given to the end user, all bets are off. How software is configured and deployed can play just as pivotal a security role as any of the steps that lead to that point.

The fallacy of defaults

One of the biggest mistakes a user can make when implementing a new piece of software is being content with default values. Many pieces of software will have default values defined for various options so that the user can get running as quickly as possible. While sometimes this can be very beneficial, like having Web servers default to the standard Web server communications port, this can also lead to a number of security issues. For instance, many routers and other networking equipment come with a default username and password for logging in. The problem arises when a user doesn't change these defaults to unique values. Searching for "default router passwords" will turn up a number of sites that list the default usernames and passwords for most routers on the market. If an attacker is looking to get into a network, then usernames, passwords, IP addresses, and other default values that remain untouched can be the set of keys left in the door.
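A simple audit for untouched defaults might look like the sketch below. The list of default logins is a made-up sample rather than real vendor data, and the device names and function name are hypothetical.

```python
# Hypothetical sample of widely published factory logins, not real vendor data.
KNOWN_DEFAULTS = {
    ("admin", "admin"),
    ("admin", "password"),
    ("root", "root"),
    ("admin", ""),
}


def find_default_logins(devices: dict) -> list:
    """Return the names of devices whose (username, password) is a known default."""
    return sorted(
        name for name, login in devices.items() if tuple(login) in KNOWN_DEFAULTS
    )
```

Given an inventory such as {"router": ("admin", "admin"), "camera": ("viewer", "T7#qpLz9")}, the function flags only the router, which is the device an attacker armed with a published default list would try first.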

Unsecured systems

Another problem arises when the end user is uncertain how their added software fits with the rest of their system and what additional security measures need to be taken. Take the Web search example again. Searching for "view/index.shtml axis" will return lists of unprotected network cameras. In some cases, the end user has successfully set up a network camera they want to be viewable by the entire world. In other cases, a network camera was added to some network and never configured to restrict public access. This unprotected camera is now viewable by anyone with an Internet connection and can quickly become an invasion of privacy. There are books and Websites devoted to providing search queries that return data that the owner may never have thought twice about securing and that is now publicly available. 17

There are a few lessons to draw from these novel searches. First, default values are universally known, so before leaving a default value untouched, consider how harmful it would be if an attacker knew that information. Second, you should know how parts of a system are accessible and consider whether that level of accessibility is appropriate. Third, remember that security through obscurity does not work. No matter how insignificant or unpublicized something publicly available may be, anyone interested in finding that data will get it, either through an automated search or through some manual labor. 18



Conclusion

These threats are just a drop in the bucket, as more and more vulnerabilities are being discovered each day. As with these examples, most, if not all, vulnerabilities stem from an oversight, be it the developer who neglects to implement a secure practice or the end user who overlooks changing a configuration. Granted, the examples I've provided may be superficial and may seem quite simple to avoid, but as a project grows, so does the chance of a simple unchecked case. An attacker may need only the slightest opportunity to wreak havoc on a system and cause permanent damage. Developers should fully understand the impact of their assumptions and test beyond the limits of normal usage. By fully understanding how the system works and taking on the role of an attacker against their own products, developers should regularly audit their work and release patches whenever necessary.

Individuals may have no say in how flawed software may be, and although there is no such thing as a completely secure system, one can minimize vulnerability by using multiple layers of security. Using strong passwords, keeping software up-to-date and properly patched, and knowing the impact of each piece of the software puzzle as it is added are small ways the end user can add layers of protection. In the end, it is the responsibility of everyone associated with a piece of software to keep security a focus and to try to mitigate vulnerabilities.



Notes

  1. http://www.cert.org/stats/vulnerability_remediation.html
  2. http://cve.mitre.org/about/terminology.html#vulnerability
  3. http://poi.apache.org/hslf/ppt-file-format.html
  4. http://www.astalavista.com/index.php?section=exploits&cmd=details&id=4818
  5. http://nsfsecurity.pr.erau.edu/bom/Spock.html
  6. http://www.openbsd.org/cgi-bin/man.cgi?query=strlcpy&sektion=3
  7. http://nsfsecurity.pr.erau.edu/bom/StackGuard.html
  8. http://www.usenix.org/publications/library/proceedings/sec98/full_papers/cowan/cowan_html/cowan.html
  9. http://nsfsecurity.pr.erau.edu/bom/Smasher.html
  10. http://insecure.org/stf/smashstack.html
  11. http://en.wikipedia.org/wiki/SecurID
  12. http://captcha.net/
  13. http://www.youtube.com/watch?v=MJNJjh4jORY
  14. http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html
  15. http://en.wikipedia.org/wiki/Blowfish_(cipher)
  16. http://en.wikipedia.org/wiki/SHA-1
  17. http://www.oreilly.com/catalog/googlehks
  18. http://www.linux.com/articles/23313


About the author

Bob Breznak is studying computer science, with an emphasis in software engineering and robotics, at Worcester Polytechnic Institute, in Worcester, Massachusetts. He is currently employed by the computer science department as a Senior Assistant and also by Sun Microsystems. On campus, he's heavily involved with the Association for Computing Machinery (ACM), the IEEE, and the honor society for the computing sciences (Upsilon Pi Epsilon, or UPE), of which he was recently elected president. He can usually be found in the sub-basement of the computer science building tinkering with projects ranging from system security to robotics.



