The documentation to the OpenSSL API is a little vague. Not many tutorials on the use of OpenSSL exist either, so getting it to work in applications can be a little troublesome for beginners. So how can you implement a basic secure connection using OpenSSL? This guide will help to solve that problem.
Part of the problem with learning how to implement OpenSSL is the fact that the documentation is not complete. Incomplete API documentation normally keeps developers from using an API, which normally spells doom for it. Yet OpenSSL is still around and going strong. Why?
OpenSSL is the best-known open library for secure communication. A Google search for "SSL library" returns OpenSSL at the top of the list. It started life in 1998, derived from the SSLeay library developed by Eric Young and Tim Hudson. Other SSL toolkits include GNU TLS, distributed under the GNU General Public License, and Mozilla Network Security Services (NSS) (see Resources later in this article for additional information).
So what makes OpenSSL better than GNU TLS, Mozilla NSS, or any other library? Licensing is one issue (see Resources). In addition, GNU TLS (thus far) supports only the TLS v1.0 and SSL v3.0 protocols, and not much more.
Mozilla NSS is distributed under both the Mozilla Public License and the GNU GPL, allowing the developer to pick. But Mozilla NSS is larger than OpenSSL and requires other external libraries to build the library, whereas OpenSSL is entirely self-contained. And like OpenSSL, much of the NSS API is not documented. Mozilla NSS has PKCS #11 support, which is used for cryptographic tokens, such as Smart Cards. OpenSSL lacks this support.
To get the most out of this article, you should:
- Be proficient in C programming
- Be familiar with Internet communication and writing Internet-enabled applications
A familiarity with SSL is not absolutely required, as a short explanation of SSL will be given later; however, look in the Resources section if you want to find links to articles discussing SSL in detail. A knowledge of cryptography is a plus as well, but not required.
SSL is an acronym that stands for Secure Sockets Layer. It is the standard behind secure communication on the Internet, integrating data cryptography into the protocol. The data is encrypted before it even leaves your computer, and is decrypted only once it reaches its intended destination. Certificates and cryptographic algorithms are behind how it all works, and with OpenSSL, you have the opportunity to play around with both.
In theory, if the encrypted data were intercepted or eavesdropped upon before reaching its destination, there would be no hope of cracking it. But as computers become faster each year and new advances in cryptanalysis are made, the chance of cracking the cryptographic protocols used in SSL is starting to increase.
SSL and secure connections can be used for any kind of protocol on the Internet, whether it be HTTP, POP3, or FTP. SSL can also be used to secure Telnet sessions. While any connection can be secured using SSL, it is not necessary to use SSL on every kind of connection. It should be used if the connection will carry sensitive information.
OpenSSL is more than just SSL. It is capable of message digests, encryption and decryption of files, digital certificates, digital signatures, and random numbers. There is quite a bit to the OpenSSL library, much more than can be put into one article.
OpenSSL is more than just the API; it is also a command-line tool. The command-line tool can do the same things as the API, but goes a step further, letting you test SSL servers and clients. It also gives a developer an idea of OpenSSL's capabilities. For information on how to use the OpenSSL command-line tool, look in the Resources section.
First, you're going to need the latest version of OpenSSL. See the Resources section to find out where you can get the latest source code to compile yourself or a binary library of the latest version if you don't feel like spending time to compile. For the sake of security, however, I would recommend downloading the latest source code and compiling it yourself. Binary distributions are typically compiled and distributed by third parties, not by the OpenSSL developers.
Some Linux distributions come with a binary version of OpenSSL, which will work fine for learning how to use the library; but be sure to get the latest version and keep it up to date if you're going to do anything real-world.
For Linux distributions that install from RPMs (Red Hat, Mandrake, and so on), it is recommended that you update your OpenSSL distribution through an RPM package available from the maker of your distribution. For reasons of security, it is also recommended that you have the latest version of your distribution. If the latest version of OpenSSL is not available for your distribution, then it is recommended that the only files you overwrite are the libraries, not the executable. Details for this are included in the FAQ document that comes with OpenSSL.
It should also be noted here that OpenSSL is not officially supported on all platforms. While efforts have been made to make it as cross-platform-compatible as possible, it is possible that OpenSSL may not work on your computer and/or operating system. See the OpenSSL web site (linked from Resources) for information on which platforms are supported.
If you will be using OpenSSL to make certificate requests and digital
certificates, then a configuration file must be created. A template file
called openssl.cnf is available in the
apps folder of the OpenSSL package.
I won't be discussing this, as the file is not required for the scope of
this article. However, the template file is very well annotated and
an Internet search will lead you to many tutorials which discuss
modification of this file.
There are only three headers that will be used by this tutorial: ssl.h, bio.h, and err.h. All are in the openssl subdirectory, and all three will be required for developing your project. There are also only three lines necessary to initialize the OpenSSL library. All are listed in Listing 1. Other headers and/or initialization functions may be required for other features.
Listing 1. Required headers
/* OpenSSL headers */
#include "openssl/bio.h"
#include "openssl/ssl.h"
#include "openssl/err.h"

/* Initializing OpenSSL */
SSL_load_error_strings();
ERR_load_BIO_strings();
OpenSSL_add_all_algorithms();
Setting up an unsecured connection
OpenSSL uses an abstraction library called BIO to handle communication of various kinds, including files and sockets, both secure and not. It can also be set up as a filter, such as for UU or Base64 encoding.
The BIO library is a little complicated to fully explain here, so I will introduce bits and pieces of it as it becomes necessary. First, I will show you how to set up a standard socket connection. It takes fewer lines than using the BSD socket library.
Prior to setting up a connection, whether secure or not, a pointer for a BIO object needs to be created. This is similar to the FILE pointer for a file stream in standard C.
Listing 2. Pointer
BIO * bio;
Creating a new connection requires a call to BIO_new_connect. You can specify both the hostname
and port in the same call, as shown in Listing 3. You can also separate
this into two calls: one to BIO_new_connect to
create the BIO and set the hostname, and one to BIO_set_conn_port (or BIO_set_conn_int_port on older OpenSSL releases) to set the port number.
If there was a problem creating the BIO object, the pointer
will be NULL. Creating the BIO does not by itself open the connection.
The connection is attempted when BIO_do_connect is called
(or, failing that, on the first read or write), so a call to BIO_do_connect must be
made, and its return value checked, to verify that the connection was successful.
Listing 3. Creating and opening a connection
bio = BIO_new_connect("hostname:port");
if(bio == NULL)
{
    /* Handle the failure */
}

if(BIO_do_connect(bio) <= 0)
{
    /* Handle failed connection */
}
Here, the first line creates a new BIO object with the specified
hostname and port, formatted in the fashion shown. For example, if you
were going to connect to port 80 at www.ibm.com, the string would be www.ibm.com:80. The call to BIO_do_connect checks to see if the connection
succeeded. It returns 0 or -1 on error.
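When the hostname and port arrive as separate values, the "hostname:port" string has to be assembled first. The following is a minimal sketch in plain C of one way to do that; format_target is a hypothetical helper name, not an OpenSSL function, and the buffer sizes are up to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "host:port" into out.
 * Returns 0 on success, -1 if the port is out of range
 * or the result would not fit in cap bytes. */
int format_target(char *out, size_t cap, const char *host, unsigned port)
{
    int n;

    assert(out != NULL && host != NULL);
    if (port > 65535u)
        return -1;

    n = snprintf(out, cap, "%s:%u", host, port);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The resulting buffer can then be passed to BIO_new_connect in place of the literal string in Listing 3.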
Reading and writing to the BIO object, regardless of whether it is a
socket or file, will always be performed using two functions: BIO_read and BIO_write.
Simple, right? And the good part is that it stays that way.
BIO_read will attempt to read a certain
number of bytes from the server. It returns the number of bytes read, or
0 or -1. On a blocking connection, a return of 0 means that the
connection was closed, while -1 indicates that an error occurred.
On a non-blocking connection, a return of 0 means no data was available,
and -1 indicates an error. To determine if the error is recoverable, call
BIO_should_retry.
Listing 4. Reading from the connection
int x = BIO_read(bio, buf, len);
if(x == 0)
{
    /* Handle closed connection */
}
else if(x < 0)
{
    if(! BIO_should_retry(bio))
    {
        /* Handle failed read here */
    }

    /* Do something to handle the retry */
}
BIO_write will attempt to write bytes to
the socket. It returns the number of bytes actually written, or 0 or -1.
As with BIO_read, 0 or -1 does not
necessarily indicate an error. BIO_should_retry is the way to find out. If the
write operation is to be retried, it must be with the exact same
parameters as before.
Listing 5. Writing to the connection
if(BIO_write(bio, buf, len) <= 0)
{
    if(! BIO_should_retry(bio))
    {
        /* Handle failed write here */
    }

    /* Do something to handle the retry */
}
Closing the connection is simple as well. You can close the
connection in one of two fashions: BIO_reset or BIO_free_all.
If you're going to reuse the
object, use the first. If you won't be reusing it, use the second.
BIO_reset closes the connection and resets
the internal state of the BIO object so that the connection can be reused.
This is good if you're going to be using the same object throughout the
application, such as with a secure chat client. For a connection BIO, it
returns 1 on success and 0 or -1 on failure.
BIO_free_all does just what it says: it
frees the internal structure and releases all associated memory, including
closing the associated socket. If the BIO is embedded in a class, this
would be used in the class' destructor.
Listing 6. Closing the connection
/* To reuse the connection, use this line */
BIO_reset(bio);

/* To free it from memory, use this line */
BIO_free_all(bio);
Setting up a secure connection
Now it's time to throw on what is needed to set up a secure connection. The only part that changes is setting up and making the connection. Everything else is the same.
Secure connections require a handshake after the connection is established. During the handshake, the server sends a certificate to the client, which the client then verifies against a set of trust certificates. It also checks the certificate to make sure that it has not expired. Verifying that the certificate is trusted requires that a trust certificate store be loaded prior to establishing the connection.
The client will send a certificate to the server only if the server requests one. This is known as client authentication. Using the certificate(s), cipher parameters are passed between the client and server to set up the secure connection. Even though the handshake is performed after the connection is established, the client or server can request a new handshake at any point in time.
Handshakes and other aspects of setting up a secure connection are discussed in detail in the Netscape articles and RFC 2246 listed in the Resources section.
Setting up for a secure connection
Setting up for a secure connection requires a couple more lines of code.
Another pointer is required of the type SSL_CTX. This is a structure to
hold the SSL information. It is also used to set up the SSL connection
through the BIO library. This structure is created by calling SSL_CTX_new
with an SSL method function, typically SSLv23_client_method (OpenSSL 1.1.0
and later call this TLS_client_method and deprecate the older name).
Another pointer of type SSL is also needed to hold the SSL connection structure (this is required for something that will be done shortly). This SSL pointer can also be used later to examine the connection information or to set up additional SSL parameters.
Listing 7. Setting up the SSL pointers
SSL_CTX * ctx = SSL_CTX_new(SSLv23_client_method());
SSL * ssl;
Loading the trust certificate store
After the context structure is created, a trust certificate store must be loaded. This is absolutely necessary for verification of the peer certificate to succeed. If the certificate cannot be verified for trust, OpenSSL flags the certificate as invalid (but the connection can still continue).
OpenSSL comes with a set of trust certificates. They are in the
certs directory of the source tree. Each certificate
is a separate file, though -- meaning that each one must be loaded separately.
There is also a subfolder under certs with expired
certificates. Attempting to load these will cause errors.
You can load each file individually if you like, but for the sake of simplicity, the trust certificates from the latest OpenSSL distribution are included in the source code archive in a single file called "TrustStore.pem." If you already have a trust store file that will be used for your particular project, simply replace "TrustStore.pem" in Listing 8 with your file (or load both of them with separate function calls).
Call SSL_CTX_load_verify_locations to load the
trust store file. This takes three parameters: the context pointer, the
path and filename of the trust store file, and a path to a directory of
certificates. At least one of the trust store file or the directory of
certificates must be specified. It returns 1 on success, or 0 if there was a problem.
Listing 8. Loading a trust store
if(! SSL_CTX_load_verify_locations(ctx, "/path/to/TrustStore.pem", NULL))
{
    /* Handle failed load here */
}
If you are going to use a directory to store the trust store, the
files must be named in a certain way. The OpenSSL documentation
spells out what this is, but there is a tool that comes with OpenSSL
called c_rehash that prepares a folder for use
as the path parameter to SSL_CTX_load_verify_locations.
Listing 9. Preparing a certificate folder and using it
/* Use this at the command line */
c_rehash /path/to/certfolder

/* Then call this from within the application */
if(! SSL_CTX_load_verify_locations(ctx, NULL, "/path/to/certfolder"))
{
    /* Handle error here */
}
You can name as many separate files or folders as necessary to specify all of the verification certificates you may need. You can also specify a file and a folder at the same time.
The BIO object is created using BIO_new_ssl_connect, taking the pointer to the SSL
context as its only parameter. The pointer to the SSL structure also
needs to be retrieved. In this article, this pointer is only used with
the SSL_set_mode function. That function is used
to set the SSL_MODE_AUTO_RETRY flag. With this option set, if the server
suddenly wants a new handshake, OpenSSL handles it in the background. Without
this option, any read or write operation will return an error if the server
wants a new handshake, setting the retry flag in the process.
Listing 10. Setting up the BIO object
bio = BIO_new_ssl_connect(ctx);
BIO_get_ssl(bio, & ssl);
SSL_set_mode(ssl, SSL_MODE_AUTO_RETRY);
With the SSL context structure set up, the connection can be created.
The hostname is set using the BIO_set_conn_hostname function. The hostname and
port are specified in the same format as above. This function only records
where to connect. The connection itself is opened by the call to BIO_do_connect,
whose return value tells you whether it succeeded. That same call also performs the
handshake to set up the secure communication.
Listing 11. Opening a secure connection
/* Set the host and port to connect to */
BIO_set_conn_hostname(bio, "hostname:port");

/* Open the connection and perform the handshake */
if(BIO_do_connect(bio) <= 0)
{
    /* Handle failed connection */
}
Once the connection is established, the certificate should be checked to see that it is valid. Actually, OpenSSL does this for us. If there are fatal problems with the certificate -- for instance, if the hash values are not valid -- then the connection simply won't happen. But if there are non-fatal problems with the certificate -- as when it has expired or is not yet valid -- the connection can still be used.
To find out if the certificate checked out okay with OpenSSL, call SSL_get_verify_result with the SSL structure as the only
parameter. If the certificate passed OpenSSL's internal checks, including
checking for trust, then it returns X509_V_OK. If something was wrong, it
returns an error code that is documented under the
verify option for the command-line tool.
It should be noted that a failed verification does not mean the connection cannot be used. Whether or not the connection should be used is dependent upon the verification result and security considerations. For example, a failed trust verification could simply mean that the trust certificate is not available. The connection can still be used, just with heightened security in mind.
Listing 12. Checking if a certificate is valid
if(SSL_get_verify_result(ssl) != X509_V_OK)
{
    /* Handle the failed verification */
}
And that is all that is required. Any communication with the server
is as normal using BIO_read and BIO_write. Closing the connection requires a simple
call to BIO_free_all or BIO_reset, depending on whether the BIO will be
reused.
At some point before the end of the application, the SSL context
structure must be released. Call SSL_CTX_free
to free the structure.
Listing 13. Cleaning up the SSL context
SSL_CTX_free(ctx);
So OpenSSL has thrown an error of some kind. What does it mean?
First you need to get the error code itself; ERR_get_error does this. Then you need to turn that
code into an error string, which is a pointer to a string permanently
loaded into memory by SSL_load_error_strings or
ERR_load_BIO_strings. This can be done in a
nested call.
Table 1 outlines the ways to retrieve an error from the error stack. Listing 14 shows how to print out the last error message in a text string.
Table 1. Retrieving errors from the stack
Function | Description
ERR_reason_error_string | Returns a pointer to a static string, which can then be displayed on the screen, written to a file, or whatever you wish to do with it.
ERR_lib_error_string | Tells in which library the error occurred.
ERR_func_error_string | Returns the OpenSSL function that caused the error.
Listing 14. Printing out the last error
printf("Error: %s\n", ERR_reason_error_string(ERR_get_error()));
You can also have the library give you a preformatted error string.
Call ERR_error_string to achieve this. It
takes the error code and a pre-allocated buffer as its parameters. The
buffer must be at least 256 bytes long. If this parameter is NULL, OpenSSL writes
the string to a static buffer that is 256 bytes in length, and returns a
pointer to that buffer. Otherwise, it will return the pointer you
provided. If you choose the static buffer option, that buffer will be
overwritten with the next call to ERR_error_string.
Listing 15. Retrieving a preformatted error string
printf("%s\n", ERR_error_string(ERR_get_error(), NULL));
You can also dump the entire error queue into either a file or BIO.
This is achieved through ERR_print_errors or ERR_print_errors_fp. The
queue is dumped in a readable format. The first sends the queue to a
BIO, while the second sends it to a FILE. The string is formatted in this
manner (from the OpenSSL documentation):
[pid]:error:[error code]:[library name]:[function name]:[reason string]:[file name]:[line]:[optional text message]
where [pid] is the process ID, [error code] is an 8-digit hexadecimal
code, [file name] is the source code file in the OpenSSL library, and
[line] is the line number in that source file.
Listing 16. Dumping the error queue
ERR_print_errors_fp(FILE *);
ERR_print_errors(BIO *);
Creating a basic connection with OpenSSL is not difficult, but the documentation can be a little intimidating when trying to figure out how to do it. This article introduced you to the basics, but there is quite a bit of flexibility with OpenSSL yet to be discovered, and advanced settings that you may need to adequately implement SSL functionality for your project.
There are two samples included with this article. One shows an unsecured connection to http://www.verisign.com/, while the other shows a secured SSL connection to https://www.verisign.com/. Both connect to the server and download the home page. There aren't any security checks, and all settings within the library are left at their defaults, so the samples should be used for educational purposes only as a part of this article.
The source code should readily compile on any supported system, but it is recommended that you have the latest version of OpenSSL. At the time of this writing, the latest version is 0.9.7d.
Resources
- Download the source code used in this article.
- Read the other articles in this OpenSSL series.
- You can download OpenSSL sources from the OpenSSL Project; be sure also to check on the current state of documentation. You can also learn a lot from the mailing lists (scroll down to the bottom for a link to the archives), and -- of course, as always -- do take the time to read the FAQ!
- The ancestor of OpenSSL was SSLeay (it even had quite nice documentation).
- Gain more insight from "Introduction to SSL," and the SSL 3.0 protocol specification.
- See also the two-part article "An Introduction to OpenSSL Programming" (Linux Journal, 2001) (and Part II), and another two-parter, "Securing Sockets with OpenSSL," from Sams via informIT (2001), and its Part 2.
- Read BIO library documentation and sample chapters from Network Security with OpenSSL (O'Reilly & Associates, 2002) online. The Sams book is Linux Socket Programming (Sams, 2001).
- OpenSSL is released under a BSD/Apache-type license. If you're a Free Software fan (or a fan of good documentation), you may also want to check out The GNU Transport Layer Security Library (note that GPL'd software cannot link against OpenSSL without an exception clause). Mozilla Network Security Services (NSS) is dual-licensed under the Mozilla Public License (MPL) and under the GNU General Public License (GNU GPL) and also has decent documentation. For more on TLS, see the Wikipedia article Transport Layer Security.
- Find out the minute, technical details about Transport Layer Security in RFC 2246, which defined the standard. It was updated by RFC 3546, which defined extensions to the TLS protocol.
- In "Network programming with the Twisted framework, Part 4" (developerWorks, September 2003) David Mertz discusses SSL programming with the Python twisted framework.
- To learn more about socket programming, see Programming Linux sockets, Part 1 (developerWorks, October 2003) and Part 2, a tutorial series also by David Mertz (developerWorks, January 2004). Beej's Guide to Network Programming Using Internet Sockets is also a good resource for those just starting out with socket programming.
- But if you are completely new to sockets, first read "Understanding Sockets in Unix, NT, and Java" (developerWorks, June 1998), which offers a nice intro-level overview of what they are and what they're good for.
- See also the IBM docs on Sockets from Communications Programming Concepts Sockets and Programming sockets on AIX from the Technical Reference: Communications, Volume 2.
- Gain an introductory understanding of "Encryption using OpenSSL's crypto libraries" (Linux Gazette, 2003), and a greater understanding of encryption in general with "Introduction to cryptography" (developerWorks, March 2001). The Handbook of Applied Cryptography (CRC Press, 1996) is available online in both Postscript and PDF formats (an updated, 2001 version is available for purchase).
- Find more resources for Linux developers in the developerWorks Linux zone.
Kenneth is a Software Engineer working for the MediNotes Corp. in West Des Moines, Iowa. He graduated from Peru State College in Peru, Nebraska, with a Bachelor of Science in Business Administration. He also has an Associate of Science in Computer Programming from Southwestern Community College in Creston, Iowa. Kenneth has written several applications and programming libraries.