Kerberos is a security protocol that originated at the Massachusetts Institute of Technology (MIT) as part of Project Athena in the 1980s. It grew out of a research paper by Needham and Schroeder, published in 1978 (see references), that discussed using encryption for authentication over computer networks. Authentication is simply proving your identity. Encryption is simply scrambling data in such a way that it can be reconstituted only by the appropriate recipient.
So what is a "kerberos"? Kerberos was the mythological three-headed dog that guarded the entrance to the underworld. Unless you could get past Kerberos, you could not enter (or leave!) the underworld. In much the same way, Kerberos guards the entrance to services on the network.
Mutual authentication: Kerberos is an authentication protocol that identifies principals (users and services) by requiring them to present proof of identity. A useful facet of Kerberos is that it provides for mutual authentication: not only do users of a service identify themselves to the service, but users can also challenge the service to prove its identity. In a world of open networks like the Internet, this is a comforting feature.
To accomplish the feat of mutual authentication, Kerberos makes use of what's called "trusted third-party authentication". This means that the Kerberos server must know who all the parties are and how they prove their identity (their passwords, or more precisely the secret keys derived from them). This concentration of secret information makes the Kerberos server (more technically known as the Key Distribution Center or KDC) a very important resource to protect and monitor. Of course, its importance also makes it an attractive target for hackers. (Recall the words of the notorious bank robber, who when asked why on earth he robbed banks, replied sagely, "Because that's where the money is.") More importantly, the physical security of the KDC must be guaranteed, as the actual computer(s) housing the KDC may be more vulnerable than the network protocols that shield the KDC.
The Kerberos protocol assumes that all network traffic is vulnerable to capture, examination and substitution, and it is designed to work correctly even under those conditions. These environmental assumptions match quite well with today's open networks and keep Kerberos in the game despite its relative (for the Internet) antiquity.
So in an environment where all network traffic might be captured, how does one authenticate (or "log in") to Kerberos? One cannot simply wrap the password up and send it over to the server (which is essentially what happens with schemes such as HTTP/1.0 "basic authentication"). Anyone snooping packets off the network would know your password. So how does Kerberos fix this problem?
At a very simple level, Kerberos uses encryption technology. The user's password is used (while still on the user's workstation) to generate an encryption key. The key encrypts certain pieces of information that are exchanged with the KDC. After a few exchanges, the KDC returns information to the user that is usable only by software on the workstation that knows the temporary encryption key derived from the password. Now when users wish to contact a Kerberos-protected service, they first contact the Kerberos ticket-granting service and ask for a ticket to the service. A ticket is a chunk of information that proves the user's identity to the service, but it is encrypted in the service's long-term key, so it is unintelligible to the user.
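The first step described above, turning a password into an encryption key without the password ever leaving the workstation, can be sketched roughly as follows. This is a minimal illustration only: the function name `derive_key`, the choice of PBKDF2, the salt made from realm plus principal name, and the iteration count are assumptions made for the example. Real Kerberos encryption types define their own string-to-key functions with additional steps.

```python
import hashlib


def derive_key(password: str, principal: str, realm: str,
               iterations: int = 4096) -> bytes:
    """Derive a 32-byte key from a password, locally on the workstation.

    Assumption: PBKDF2-HMAC-SHA1 salted with realm + principal. This is
    an illustrative stand-in, not an interoperable Kerberos string-to-key.
    """
    salt = (realm + principal).encode("utf-8")
    return hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The same password, principal and realm always yield the same key, so the
# workstation and the KDC can agree on a key without sending the password.
key_a = derive_key("correct horse", "alice", "EXAMPLE.COM")
key_b = derive_key("correct horse", "alice", "EXAMPLE.COM")
# A wrong password yields a different key, so KDC replies stay unreadable.
key_c = derive_key("wrong guess", "alice", "EXAMPLE.COM")

print(key_a == key_b)  # True
print(key_a == key_c)  # False
```

Salting with the realm and principal name means two users who happen to pick the same password still end up with different keys.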
Without getting terribly bogged down in the details of which key is used when, Kerberos is going to return information to users and services that is useful to them if they can decrypt it. Users are able to decrypt the data if they are who they told Kerberos they were. Furthermore, Kerberos can make use of temporary keys wherever possible, to make it harder for hackers to break in. When a user and a service are interacting, they are doing so with a key that was specially generated just for this particular interaction and that expires within a relatively short period of time. The key lifetime is configurable, but it is usually good for hours, not days.
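The idea of a short-lived key generated for one interaction can be sketched as below. The `SessionKey` class, the `issue_session_key` helper and the eight-hour default are hypothetical names and values chosen for illustration; the text above says only that lifetimes are configurable and usually measured in hours.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionKey:
    key: bytes               # random key used for this interaction only
    issued_at: float         # issue time, in seconds
    lifetime_seconds: float  # how long the key remains usable

    def is_valid(self, now: float) -> bool:
        """Return True while the key is inside its lifetime window."""
        return self.issued_at <= now < self.issued_at + self.lifetime_seconds


def issue_session_key(now: float,
                      lifetime_seconds: float = 8 * 3600) -> SessionKey:
    # Assumption: an 8-hour default lifetime, purely for illustration.
    return SessionKey(secrets.token_bytes(32), now, lifetime_seconds)


issued = issue_session_key(now=1_000_000.0)
print(issued.is_valid(now=1_000_000.0 + 3600))      # True: one hour later
print(issued.is_valid(now=1_000_000.0 + 9 * 3600))  # False: past the lifetime
```

Because each key is random and expires quickly, a stolen session key is useful to an attacker only for a limited time and only for that one interaction.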
Data integrity: Assuming that packets may have been tampered with on their way either from the client to the service or from the service to the client, does it do any good at all to have authentication? Is there anything that can be done to prove not only who or what is on the other end of the wire, but also that the data is authentic? This authenticity is more commonly referred to as "data integrity", and Kerberos once more applies encryption technology to offer this service. Assuming that the client and service have authenticated as above and now each knows the key for the current interaction (or session), we have all the pieces necessary to guarantee either data integrity or data confidentiality.
Since encryption is a costly operation in terms of time and CPU power, and we are only looking to ensure that the data is authentic, we need not encrypt all the data that is transmitted. Instead, an encrypted one-way hash is computed and transmitted with the plaintext data. Well, that is what happens, and Kerberos provides services to do it, but let's go through that statement a little more slowly and explain the terms used.
A "one-way hash" is a cryptographic operation that quickly transforms any arbitrary message into a very short sequence of bytes. "Quickly" here means typically faster than encrypting the same message. A good hashing algorithm is very sensitive to the contents of the original message, such that even a small change in the message should yield a very different hash value. And "one-way" means that one cannot reverse-engineer the hash to learn anything about the original contents of the message.
Kerberos encrypts the much-shorter one-way hash, and bundles that together with the "plaintext" data, which is the original, unmodified message. The sender can then transmit this package to the receiver, who can look at the package, see what algorithm was used for the one-way hash and quickly compute the hash. Then the receiver can decrypt the received encrypted one-way hash and compare it with the hash that was just computed. If the two hashes match, the receiver knows that the message was transmitted without modification and, because only a holder of the session key could have produced the encrypted hash, knows who sent it.
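The send-and-verify flow above can be approximated with Python's standard library. Note the substitution: instead of hashing and then encrypting the hash, this sketch uses HMAC-SHA256, a keyed hash that serves the same purpose of binding the message to the session key. The function names `protect` and `verify` and the choice of SHA-256 are assumptions for the example, not part of Kerberos.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def protect(session_key: bytes, plaintext: bytes) -> Tuple[bytes, bytes]:
    """Sender side: return the plaintext together with a keyed hash of it."""
    tag = hmac.new(session_key, plaintext, hashlib.sha256).digest()
    return plaintext, tag


def verify(session_key: bytes, plaintext: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the keyed hash and compare in constant time."""
    expected = hmac.new(session_key, plaintext, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


session_key = secrets.token_bytes(32)  # stands in for the shared session key
message, tag = protect(session_key, b"Transfer 100 dollars to account 12345")

print(verify(session_key, message, tag))  # True: message arrived unmodified
tampered = b"Transfer 900 dollars to account 12345"
print(verify(session_key, tampered, tag))  # False: a one-digit change is detected
```

The message itself travels in the clear; only the short tag depends on the key, which is what keeps the cost low compared with encrypting everything.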
Data confidentiality: There are always occasions when it is insufficient merely to know with whom you're talking and that no one can successfully change the conversation without being detected. Sometimes, you need to know that the conversation is completely private. A more technical term for privacy is "data confidentiality", and once again Kerberos addresses this need. Kerberos provides services that encrypt the entire plaintext message and (optionally) compute a one-way hash of the ciphertext (the output from the encryption engine). The sender transmits the package to the receiver, who decrypts the ciphertext and (optionally) verifies the authenticity of the data. Although the data integrity feature is optional when one is using data confidentiality, it is usually enabled, as the cost of computing and encrypting the one-way hash is minor compared to the cost of encrypting the whole message.
Encryption is a costly operation, both in terms of processing power and time. If it were not, then data confidentiality would always be used. But it is, so it is important to allow applications to pick the level of protection that they need at the point where it is needed, and Kerberos provides this.
We have described only the high-level mechanics of Kerberos. One may wonder how hackers might attack a Kerberos-secured network and what some of its vulnerabilities might be. One of the main vulnerabilities (outside of physical attacks on the machine housing the KDC) is the human element. The only entrance into the Kerberos protocol is where a user specifies a password to start the process. Users are apt to use a short, easy-to-remember password, like a word one might find in a dictionary. A hacker might capture packets from the initial authentication flows, generate encryption keys based on words found in dictionaries, encrypt some packets containing the fairly predictable initial Kerberos data exchanges, and look for a match. Once a match is found, the hacker has the password to another user's Kerberos account and can access the network and do whatever that user could do.
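Why a dictionary word makes a poor password can be seen in a toy model of the guessing loop just described. Everything here is a simplification made for illustration: `derive_key`, `keyed_check` and `guess_password` are hypothetical helpers, PBKDF2 stands in for the real password-to-key function, and a keyed hash over known data stands in for the encrypted, fairly predictable initial exchange.

```python
import hashlib
import hmac


def derive_key(password: str, salt: bytes) -> bytes:
    # Assumption: PBKDF2 as a stand-in for the real password-to-key function.
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                               1000, dklen=32)


def keyed_check(key: bytes, known_data: bytes) -> bytes:
    # Assumption: a keyed hash models "predictable data protected by the key".
    return hmac.new(key, known_data, hashlib.sha256).digest()


def guess_password(captured: bytes, known_data: bytes, salt: bytes, wordlist):
    """Try each word; return the one whose key reproduces the captured value."""
    for word in wordlist:
        candidate = keyed_check(derive_key(word, salt), known_data)
        if hmac.compare_digest(candidate, captured):
            return word
    return None


salt = b"EXAMPLE.COMalice"
known_data = b"predictable-initial-exchange"
# What an eavesdropper would see if the user's password were "sunshine".
captured = keyed_check(derive_key("sunshine", salt), known_data)

print(guess_password(captured, known_data, salt,
                     ["password", "dragon", "sunshine"]))  # sunshine
print(guess_password(captured, known_data, salt,
                     ["password", "dragon"]))              # None
```

The search happens entirely offline against captured data, so the KDC never sees the failed guesses; the only real defenses are passwords that are not in any wordlist, or removing passwords from the initial exchange altogether.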
One counter to this particular attack is to disallow common words as passwords. The end-user impact is that passwords become harder to remember. Another counter is to use Public-Key Cryptography or smartcards for the initial authentication (see "Further Reading"), which takes simple passwords completely out of the picture.
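A password policy that refuses common words might look something like the sketch below. The function name `is_acceptable_password`, the 12-character minimum, and the rule that strips leading and trailing digits and punctuation are all assumptions chosen for the example; a real deployment would use a much larger wordlist and its own site policy.

```python
def is_acceptable_password(candidate: str, common_words,
                           min_length: int = 12) -> bool:
    """Reject passwords that are too short or are thinly disguised common words.

    common_words is expected to be a set of lowercase words.
    """
    if len(candidate) < min_length:
        return False
    lowered = candidate.lower()
    if lowered in common_words:
        return False
    # Also reject a common word with digits or punctuation tacked on the ends.
    stripped = lowered.strip("0123456789!@#$%^&*._-")
    if stripped in common_words:
        return False
    return True


common = {"password", "sunshine", "dragon"}

print(is_acceptable_password("sunshine", common))            # False: too short
print(is_acceptable_password("Sunshine1234", common))        # False: common word plus digits
print(is_acceptable_password("tr4vel-l4mp-quartz", common))  # True
```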
Kerberos has been incorporated into many different product offerings for several reasons. First, it was designed for an environment that closely resembles today's Internet. Second, it has withstood the test of time very well. Another pragmatic point is that the specification for Kerberos only deals with the network protocols; i.e., what the control flows might be and the format of the packets over the network. There are no published standards for application programming interfaces (APIs). This lack of API specification makes it relatively easy to incorporate Kerberos into a product and claim Kerberos compliance, since all that has to be done is obey the network protocol.
Although base Kerberos itself is still undergoing some changes, the direction that the industry seems to be taking is to offer up the services that Kerberos supplies, but to generalize the interfaces a bit so that they are not so Kerberos-specific and to standardize on these more generic APIs. This approach is an acknowledgement of the great work that was originally done in the design of Kerberos. Even 15-20 years later, not much is being added to base Kerberos except to generalize and then standardize services akin to those originally envisioned by Kerberos' creators. These efforts to create a more generic security API are known as the "Generic Security Service Application Program Interface" or GSSAPI. There are even proposed standards for implementations of GSSAPI in Java that are close to acceptance, which would make Kerberos services that much more accessible to modern programming environments like Java (see next section for references).
Bruce Rich is the team lead for the IBM Java Security project. He has been involved in software for 21 years, first in operating systems development, then in secure distributed file systems, and recently in secure Web server applications and Java. He has filed a number of patents and contributed to a book on distributed computing. You can contact him at rbruce@us.ibm.com.
Anthony Nadalin is the lead architect for the IBM Java Security project. As senior architect, he is responsible for infrastructure design and development across IBM. He serves as the primary security liaison to JavaSoft for security design and development collaboration. You can contact him at drsecure@us.ibm.com.
Theodore Shrader is a feature lead in the IBM Java Security project. He has written numerous patents and articles dealing with Internet and Java development, distributed computing, object-oriented design, and database architecture and programming. He is a co-author of an operating systems programming guide published by John Wiley and Sons. You can contact him at tshrader@us.ibm.com. Viva Java Security!
