Thread-specific data

Many applications require that certain data be maintained on a per-thread basis across function calls.

For example, a multithreaded grep command using one thread for each file must have thread-specific file handles and a thread-specific list of found strings. The threads library provides the thread-specific data interface to meet these needs.

Thread-specific data may be viewed as a two-dimensional array of values, with keys serving as the row index and thread IDs as the column index. A thread-specific data key is an opaque object, of the pthread_key_t data type. The same key can be used by all threads in a process. Although all threads use the same key, they set and access different thread-specific data values associated with that key. Thread-specific data are void pointers, which allows referencing any kind of data, such as dynamically allocated strings or structures.

In the following table, thread T2 has a thread-specific data value of 12 associated with the key K3. Thread T4 has the value of 2 associated with the same key.

Keys	T1 Thread	T2 Thread	T3 Thread	T4 Thread
K1	6	56	4	1
K2	87	21	0	9
K3	23	12	61	2
K4	11	76	47	88
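The row-and-column view above can be sketched in code. The following is a minimal illustration, not part of the threads library: the names k3, bind_and_read, and same_key_different_values are hypothetical, and the values 12 and 2 are taken from row K3 of the table. Two threads use the same key, yet each reads back only the value it set itself.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t k3;          /* one key, shared by every thread */

/* Bind this thread's own value to the shared key, then read it back. */
static void *bind_and_read(void *value)
{
    if (pthread_setspecific(k3, value) != 0)
        return NULL;
    return pthread_getspecific(k3);   /* the calling thread's value only */
}

/* Returns 1 if each thread read back its own value, 0 otherwise. */
int same_key_different_values(void)
{
    static int t2_value = 12, t4_value = 2;   /* as in row K3 of the table */
    pthread_t t2, t4;
    void *seen_by_t2 = NULL, *seen_by_t4 = NULL;
    int ok = 0;

    if (pthread_key_create(&k3, NULL) != 0)
        return 0;
    if (pthread_create(&t2, NULL, bind_and_read, &t2_value) == 0) {
        if (pthread_create(&t4, NULL, bind_and_read, &t4_value) == 0) {
            pthread_join(t4, &seen_by_t4);
            ok = 1;
        }
        pthread_join(t2, &seen_by_t2);
    }
    pthread_key_delete(k3);
    return ok && seen_by_t2 == &t2_value && seen_by_t4 == &t4_value;
}
```

The stored values are pointers to the integers rather than the integers themselves, in keeping with the void pointer type of thread-specific data.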

Creating and destroying keys

Thread-specific data keys must be created before being used. Their values can be automatically destroyed when the corresponding threads terminate. A key can also be destroyed upon request to reclaim its storage.

Key creation

A thread-specific data key is created by calling the pthread_key_create subroutine. This subroutine stores the new key in the location passed as its first argument and returns 0 on success, or an error code otherwise. The thread-specific data is set to a value of NULL for all threads, including threads not yet created.

For example, consider two threads A and B. Thread A performs the following operations in chronological order:

Create a thread-specific data key K.
Threads A and B can use the key K. The value for both threads is NULL.
Create a thread C.
Thread C can also use the key K. The value for thread C is NULL.
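The sequence above can be checked with a short sketch. The names used here (key_k, check_initial, initial_value_is_null) are hypothetical; the creating thread plays the role of thread A and the new thread plays the role of thread C.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key_k;       /* plays the role of key K */

/* Thread C: record whether its initial value for K is NULL. */
static void *check_initial(void *arg)
{
    int *is_null = arg;

    *is_null = (pthread_getspecific(key_k) == NULL);
    return NULL;
}

/* Returns 1 if both the creating thread and a later thread start
 * with NULL, 0 if not, and -1 if a library call failed. */
int initial_value_is_null(void)
{
    pthread_t c;
    int creator_null, child_null = 0;

    if (pthread_key_create(&key_k, NULL) != 0)
        return -1;
    creator_null = (pthread_getspecific(key_k) == NULL);

    /* Thread C is created after the key and can use it immediately. */
    if (pthread_create(&c, NULL, check_initial, &child_null) != 0) {
        pthread_key_delete(key_k);
        return -1;
    }
    pthread_join(c, NULL);
    pthread_key_delete(key_k);
    return creator_null && child_null;
}
```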

The number of thread-specific data keys is limited to 450 per process. This limit is given by the PTHREAD_KEYS_MAX symbolic constant.
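A program can query the limit rather than hard-code it. The sketch below assumes that PTHREAD_KEYS_MAX, when defined, comes from the limits.h header, and falls back to the POSIX sysconf subroutine with _SC_THREAD_KEYS_MAX otherwise; the function name thread_key_limit is hypothetical. The value returned depends on the system where the program is compiled and run.

```c
#include <limits.h>
#include <unistd.h>

/* Returns the per-process limit on thread-specific data keys. */
long thread_key_limit(void)
{
#ifdef PTHREAD_KEYS_MAX
    return PTHREAD_KEYS_MAX;                /* compile-time constant */
#else
    return sysconf(_SC_THREAD_KEYS_MAX);    /* run-time query; -1 if unknown */
#endif
}
```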

The pthread_key_create subroutine must be called only once for each key. Otherwise, two different keys are created. For example, consider the following code fragment:

/* a global variable */
static pthread_key_t theKey;
 
/* thread A */
...
pthread_key_create(&theKey, NULL);   /* call 1 */
...
 
/* thread B */
...
pthread_key_create(&theKey, NULL);   /* call 2 */
...

In this example, threads A and B run concurrently, but call 1 happens before call 2. Call 1 creates a key K1 and stores it in the theKey variable. Call 2 creates another key K2 and also stores it in the theKey variable, overwriting K1. As a result, thread A will use K2, assuming it is K1. This situation should be avoided for the following reasons:

Key K1 is lost, so its storage is not reclaimed until the process terminates. Because the number of keys is limited, you may run out of keys.
If thread A stores thread-specific data using the theKey variable before call 2, the data is bound to key K1. After call 2, the theKey variable contains K2; if thread A then tries to fetch its thread-specific data, it always gets NULL.

Ensuring that keys are created uniquely can be done in the following ways:

Using the one-time initialization facility.
Creating the key before the threads that will use it. This is often possible, for example, when using a pool of threads with thread-specific data to perform similar operations. This pool of threads is usually created by one thread, the initial (or another "driver") thread.

It is the programmer's responsibility to ensure the uniqueness of key creation. The threads library provides no way to check if a key has been created more than once.
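The first approach, one-time initialization, might look like the following sketch. It uses the pthread_once subroutine so that the key is created by whichever thread arrives first and by no other. The helper names (make_key, worker, create_key_once_demo) and the counter are hypothetical and exist only to show that the creation routine runs once.

```c
#include <pthread.h>

static pthread_key_t theKey;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static int key_status = -1;      /* result of pthread_key_create */
static int creations;            /* how many times make_key ran */

/* Runs exactly once, no matter how many threads call pthread_once. */
static void make_key(void)
{
    creations++;
    key_status = pthread_key_create(&theKey, NULL);
}

static void *worker(void *arg)
{
    pthread_once(&key_once, make_key);   /* safe from every thread */
    if (key_status == 0)
        pthread_setspecific(theKey, arg);
    return NULL;
}

/* Starts up to four threads; returns the number of key creations. */
int create_key_once_demo(void)
{
    pthread_t t[4];
    int started = 0, i;

    while (started < 4 &&
           pthread_create(&t[started], NULL, worker, &t[started]) == 0)
        started++;
    for (i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    if (key_status == 0)
        pthread_key_delete(theKey);
    return creations;
}
```

Every thread calls pthread_once before touching the key, so no thread needs to know whether it is the first one.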

Destructor routine

A destructor routine may be associated with each thread-specific data key. Whenever a thread terminates, if non-NULL thread-specific data is bound to any key for that thread, the destructor routine associated with that key is called. This allows dynamically allocated thread-specific data to be automatically freed when the thread terminates. The destructor routine has one parameter, the value of the thread-specific data.

For example, a thread-specific data key may be used for dynamically allocated buffers. A destructor routine should be provided to ensure that the buffer is freed when the thread terminates. The free subroutine can be used as follows:

pthread_key_create(&key, free);

More complex destructors may be used. If a multithreaded grep command, using one thread per file to scan, has thread-specific data that stores a structure containing a work buffer and the thread's file stream, the destructor routine may be as follows:

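#include <stdio.h>      /* FILE, fclose */
#include <stdlib.h>     /* free */

typedef struct {
        FILE *stream;
        char *buffer;
} data_t;
...

/* Called at thread termination with the thread's non-NULL value.
 * The library resets the bound value to NULL before this call,
 * so the destructor only has to release the resources. */
void destructor(void *data)
{
        data_t *d = data;

        fclose(d->stream);
        free(d->buffer);
        free(d);
}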

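If some thread-specific data values are still non-NULL after all destructors have been called (for example, because a destructor set a new value), the destructor calls are repeated, up to four times.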
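The following sketch shows a destructor running when a thread terminates. The names release_buffer, fill_buffer, and destructor_demo are hypothetical, and the counter is there only to make the call observable. In the ordinary case, where the destructor does not bind a new value, it runs once for the exiting thread.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t buf_key;
static int destructor_runs;   /* written by the exiting thread, read after join */

/* Destructor: receives the value that was bound to buf_key. */
static void release_buffer(void *value)
{
    free(value);
    destructor_runs++;
}

static void *fill_buffer(void *arg)
{
    char *buf = malloc(64);

    (void)arg;
    if (buf != NULL && pthread_setspecific(buf_key, buf) != 0)
        free(buf);            /* binding failed: release it ourselves */
    return NULL;              /* a bound, non-NULL value triggers the destructor */
}

/* Returns the number of destructor calls, or -1 on a library error. */
int destructor_demo(void)
{
    pthread_t t;

    if (pthread_key_create(&buf_key, release_buffer) != 0)
        return -1;
    if (pthread_create(&t, NULL, fill_buffer, NULL) != 0) {
        pthread_key_delete(buf_key);
        return -1;
    }
    pthread_join(t, NULL);
    pthread_key_delete(buf_key);
    return destructor_runs;
}
```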

Key destruction

A thread-specific data key can be destroyed by calling the pthread_key_delete subroutine. The pthread_key_delete subroutine does not call the destructor routine for threads that have data bound to the key; the application must reclaim that data itself. After a data key is destroyed, it can be reused by another call to the pthread_key_create subroutine. Thus, the pthread_key_delete subroutine is useful especially when using many data keys. For example, in the following code fragment, the loop would never end, because each deleted key becomes available again:

/* bad example - do not write such code! */
pthread_key_t key;
 
/* pthread_key_create returns 0 on success */
while (pthread_key_create(&key, NULL) == 0)
        pthread_key_delete(key);
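Because deleting a key does not run destructors, any data still bound to the key must be released by the application. The sketch below, with the hypothetical names count_and_free and delete_skips_destructor, deletes a key while a value is still bound, observes that the destructor was not called, and then frees the value by hand.

```c
#include <pthread.h>
#include <stdlib.h>

static int destructor_calls;

static void count_and_free(void *value)
{
    destructor_calls++;
    free(value);
}

/* Returns 1 if deleting the key left the destructor uncalled,
 * 0 if it was called, and -1 on a library error. */
int delete_skips_destructor(void)
{
    pthread_key_t key;
    void *value = malloc(16);
    int skipped;

    if (value == NULL)
        return -1;
    if (pthread_key_create(&key, count_and_free) != 0) {
        free(value);
        return -1;
    }
    if (pthread_setspecific(key, value) != 0) {
        pthread_key_delete(key);
        free(value);
        return -1;
    }

    pthread_key_delete(key);            /* the destructor is not invoked */
    skipped = (destructor_calls == 0);
    free(value);                        /* so the caller reclaims the data */
    return skipped;
}
```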

Using thread-specific data

Thread-specific data is accessed using the pthread_getspecific and pthread_setspecific subroutines. The pthread_getspecific subroutine returns the value bound to the specified key for the calling thread; the pthread_setspecific subroutine binds a new value to the key for the calling thread.

Setting successive values

The value bound to a specific key should be a pointer, which can point to any kind of data. Thread-specific data is typically used for dynamically allocated storage, as in the following code fragment:

private_data = malloc(...);
pthread_setspecific(key, private_data);

When setting a value, the previous value is lost. For example, in the following code fragment, the value of the old pointer is lost, and the storage it pointed to may not be recoverable:

pthread_setspecific(key, old);
...
pthread_setspecific(key, new);

It is the programmer's responsibility to retrieve the old thread-specific data value to reclaim storage before setting the new value. For example, it is possible to implement a swap_specific routine in the following manner:

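/* Fetch the calling thread's current value into *old_pt, then bind new.
 * Returns -1, without binding new, if no previous value was bound. */
int swap_specific(pthread_key_t key, void **old_pt, void *new)
{
        *old_pt = pthread_getspecific(key);
        if (*old_pt == NULL)
                return -1;
        else
                return pthread_setspecific(key, new);
}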

Such a routine does not exist in the threads library because it is not always necessary to retrieve the previous value of thread-specific data. Such a case occurs, for example, when thread-specific data are pointers to specific locations in a memory pool allocated by the initial thread.
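When the old value is dynamically allocated and owned by the thread, the retrieve-then-set pattern can be folded into one helper. The following is a sketch under that assumption; replace_buffer and replace_buffer_demo are hypothetical names, not library routines. The old buffer is freed only after the new one has been bound, so a failure leaves the previous value intact.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Bind a fresh buffer of the given size, then free the previous one.
 * Returns 0 on success or an error number. */
int replace_buffer(pthread_key_t key, size_t size)
{
    void *old = pthread_getspecific(key);
    void *fresh = malloc(size);
    int rc;

    if (fresh == NULL)
        return ENOMEM;
    rc = pthread_setspecific(key, fresh);
    if (rc != 0) {
        free(fresh);        /* the old value is still bound */
        return rc;
    }
    free(old);              /* free(NULL) is a no-op */
    return 0;
}

/* Returns 1 if two successive replacements succeed, 0 otherwise. */
int replace_buffer_demo(void)
{
    pthread_key_t key;
    int ok;

    if (pthread_key_create(&key, free) != 0)
        return 0;
    ok = replace_buffer(key, 8) == 0 && replace_buffer(key, 16) == 0
         && pthread_getspecific(key) != NULL;
    free(pthread_getspecific(key));     /* release the last buffer */
    pthread_setspecific(key, NULL);
    pthread_key_delete(key);
    return ok;
}
```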

Using destructor routines

When using dynamically allocated thread-specific data, the programmer must provide a destructor routine when calling the pthread_key_create subroutine. The programmer must also ensure that, when releasing the storage allocated for thread-specific data, the value bound to the key is set to NULL. Otherwise, the destructor routine might be called with an invalid parameter. For example:

pthread_key_create(&key, free);
...

...
private_data = malloc(...);
pthread_setspecific(key, private_data);
...

/* bad example! */
...
data = pthread_getspecific(key);
free(data);
...

When the thread terminates, the destructor routine is called for its thread-specific data. Because the value is a pointer to memory that has already been released, the destructor (here, the free subroutine) releases it a second time, and an error can occur. To correct this, the following code fragment should be substituted:

/* better example! */
...
data = pthread_getspecific(key);
free(data);
pthread_setspecific(key, NULL);
...

When the thread terminates, the destructor routine is not called, because the value bound to the key is NULL.
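The corrected pattern can be exercised end to end. In this sketch, the names counting_free, release_early, and early_release_demo are hypothetical, and a counting wrapper stands in for the free subroutine so that the absence of a destructor call can be observed.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t key;
static int destructor_calls;

/* Same job as free, plus a counter to make calls visible. */
static void counting_free(void *value)
{
    destructor_calls++;
    free(value);
}

static void *release_early(void *arg)
{
    void *data = malloc(32);

    (void)arg;
    if (data == NULL)
        return NULL;
    if (pthread_setspecific(key, data) != 0) {
        free(data);
        return NULL;
    }

    /* Release the storage before the thread ends ... */
    data = pthread_getspecific(key);
    free(data);
    /* ... and clear the binding so the destructor has nothing to do. */
    pthread_setspecific(key, NULL);
    return NULL;
}

/* Returns the number of destructor calls, or -1 on a library error. */
int early_release_demo(void)
{
    pthread_t t;

    if (pthread_key_create(&key, counting_free) != 0)
        return -1;
    if (pthread_create(&t, NULL, release_early, NULL) != 0) {
        pthread_key_delete(key);
        return -1;
    }
    pthread_join(t, NULL);
    pthread_key_delete(key);
    return destructor_calls;
}
```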

Using non-pointer values

Although it is possible to store values that are not pointers, it is not recommended for the following reasons:

Casting between pointer types and scalar types may not be portable.
The representation of the NULL pointer is implementation-dependent; some systems represent the NULL pointer with a non-zero value.

If you are sure that your program will never be ported to another system, you may use integer values for thread-specific data.
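If integer values are used anyway, going through the intptr_t type from the stdint.h header keeps the casts explicit. The sketch below rests on two assumptions: that the platform provides intptr_t, and that an integer converted to a pointer and back yields the same integer, which is common but not guaranteed everywhere. The helper names are hypothetical. The key is created without a destructor, because the stored values are not pointers to releasable storage, and a stored 0 cannot be told apart from a value that was never set.

```c
#include <pthread.h>
#include <stdint.h>

/* Store an integer in the slot that normally holds a pointer. */
int set_specific_int(pthread_key_t key, intptr_t value)
{
    return pthread_setspecific(key, (void *)value);
}

/* Read it back; yields 0 if nothing was stored. */
intptr_t get_specific_int(pthread_key_t key)
{
    return (intptr_t)pthread_getspecific(key);
}

/* Returns the value read back (42 expected), or -1 on a library error. */
int int_value_demo(void)
{
    pthread_key_t key;
    intptr_t seen = 0;

    if (pthread_key_create(&key, NULL) != 0)   /* no destructor */
        return -1;
    if (set_specific_int(key, 42) == 0)
        seen = get_specific_int(key);
    pthread_key_delete(key);
    return (int)seen;
}
```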