C++0x, scoped enums
Introduction and good usage patterns for scoped enumerations
Enumerations, commonly called enums, are constructs in a programming language that allow users to group a set of values under one group and assign a name to each value. Sometimes, you want to represent several states or values that are static and constant. In that case, it's better to enumerate those states, assign them some integral values to make them comparable, and establish relationships between their values if necessary, such as a = 1 and b = a+2.
That's what C enums offer you: a better, shorter way of creating a set of #define-style constants that, unlike macros, obey scope rules instead of being visible throughout the rest of the file. However, you can further improve on this enumeration representation, and you'll see in later sections how C++ enums and scoped enums achieve that. But first, it will help to review the history of enums.
History of enums in C and C++
It all started, as most of C++ did, in C, as C enums. And before C enums came into existence, enumerating a set of numeric values was accomplished with plain #define directives. Figure 1 shows the time line of enumerations in C/C++ and an example of enumerating four values (top left, top right, bottom left, bottom right).
Figure 1. C enumerations time line
Notice that we added the direction enum but renamed its enumerators so that they differ from those of the windowCorner enum, to avoid conflicting names. Enumerators are injected into the enclosing scope, so TOP_LEFT and TOPLEFT live in the same scope and therefore cannot share a name.
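The progression from macros to enums, and the renaming that injected enumerators force on you, can be sketched as follows. This is a minimal illustration, not the article's original figure: the macro names and the BOTTOM enumerators are assumptions, while windowCorner, direction, TOP_LEFT, and TOPLEFT follow the names used in the text.

```cpp
// Pre-enum style: macros ignore scope and carry no type (hypothetical names).
#define WC_TOP_LEFT  0
#define WC_TOP_RIGHT 1

// Unscoped enum: its enumerators are injected into the enclosing scope.
enum windowCorner { TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT };

// A second enum in the same scope cannot reuse those names,
// so its enumerators are spelled without the underscore.
enum direction { TOPLEFT, TOPRIGHT, BOTTOMLEFT, BOTTOMRIGHT };

// enum clash { TOP_LEFT };  // error: TOP_LEFT is already declared in this scope

// Enumerators count up from 0 unless given explicit values,
// and an unscoped enum converts implicitly to int.
int valueOf(windowCorner c) { return c; }
```

Uncommenting the clash line is a quick way to see the name conflict reported by the compiler.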
Enums in C and C++
A C/C++ enumeration is like a struct with static constant integral members (Figure 2), except that the members are injected into the scope enclosing the struct. In C++, you cannot initialize or assign to an enum variable any value other than an enumerator or a variable of the same enumeration type. C allows it, which makes C enums less type-safe than C++ enums.
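As a rough sketch of that analogy (the struct name WindowCornerLike and the helper fromInt are hypothetical, chosen only for illustration), the struct members must be qualified, whereas the enumerators are visible directly. The commented-out line shows the assignment that C accepts and C++ rejects.

```cpp
// A struct whose static constants mimic enumerators. Unlike real
// enumerators, they stay inside the struct and must be qualified.
struct WindowCornerLike {
    static const int TOP_LEFT = 0;
    static const int TOP_RIGHT = 1;
    static const int BOTTOM_LEFT = 2;
    static const int BOTTOM_RIGHT = 3;
};

// The real enum: the same values, but injected into the enclosing scope.
enum windowCorner { TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT };

windowCorner fromInt(int n) {
    windowCorner c = TOP_LEFT;        // fine: initialized from an enumerator
    // c = n;                         // error in C++ (accepted in C): no int -> enum conversion
    c = static_cast<windowCorner>(n); // C++ requires an explicit cast
    return c;
}
```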
Figure 2. Struct with static const data members simulating enums
It is preferable not to allow members of one struct to be compared to members of a different struct. This is because when you enumerate a set of values, you're creating a unique type with members that are comparable to each other but not comparable to external values, no matter how similar their representation is. Of course, in many situations, we care about the integral value, but it should never make sense to compare two enumerators of different enumerations. For example, one enum can represent colors and another can represent days of the week. Even though both enums have integral values that make them mathematically comparable, the comparison is semantically invalid. It is preferable that enums reject such comparisons.
The same argument could be made for using enum variables or constants in a context where a different enum or an integral type is expected: it makes no sense. For example, a foo function taking an int should not be called with an argument of the enum type. It might be desirable in some cases that this is allowed, but in general, you want your enum to hold certain semantics, not just get converted into a plain boring integer.
Unfortunately, none of those enumeration mechanisms offers these two guarantees. With #defines you are obviously dealing with integers directly, and there are no types other than ints, so no restrictions exist. For C and C++ enums, you are allowed to compare, for example, an enumerator of one enumeration with TOPRIGHT of the other (with reference to Figure 3), because the two enumerators are converted to an integral type and compared as integers. You are also allowed to use an enum in any integral context, that is, wherever an integral type is expected (see Figure 3).
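Both loopholes can be sketched in a few lines. The enumerator lists and the body of foo are assumptions made for illustration. Compilers may warn about the mixed comparison, and newer standards (C++20 onward) deprecate it, but C++03 and C++11 accept it.

```cpp
enum windowCorner { TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT };
enum direction { TOPLEFT, TOPRIGHT, BOTTOMLEFT, BOTTOMRIGHT };

// A function that expects a plain integer (hypothetical body).
int foo(int n) { return n + 1; }

// Loophole 1: enumerators of unrelated enums are both promoted to an
// integral type, so this semantically meaningless comparison compiles.
bool cornerEqualsDirection() { return TOP_RIGHT == TOPRIGHT; }

// Loophole 2: an unscoped enumerator is accepted wherever an int is expected.
int enumAsInt() { return foo(BOTTOM_LEFT); }
```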
Figure 3. Sample code for non type-safe use of C/C++ enums
Notice that we had to rename the enumerators of enum direction so that they are different from the enumerators of enum windowCorner. The reason for this is that all enumerators belong to the scope enclosing the enumeration. So if the enum is defined inside a class, the enumerators become class members, and if it is defined in global scope, the enumerators become globally visible. This is an inconvenience, especially if the enums are declared in namespace or global scope, where you might have many of them.
There is one more issue with our enums: size and signedness. Are they consistent across different compilers? No, they're not. According to the C++ standard, the underlying type of an enum is only partly specified: the compiler chooses which integral type represents the enum, with the restriction that the type is no larger than int if all enumerator values can be represented by int or unsigned int. Otherwise, it can be any other, bigger type.
The lack of a well-defined underlying type makes it impossible to forward declare an enum, which seems odd, because structs and unions can be forward declared without knowing anything about them. The issue lies in the way enums are handled and passed: they are passed by value, so the compiler has to know their size and representation.
Forward declaration of enums helps further separate the interface from the implementation (as do all other forms of forward declaration). Other benefits of forward declaration are decoupling compilation units that were coupled by an enum definition, which reduces total compilation time, and hiding implementation details from the user. Having the enum definition in every compilation unit means that the whole enumeration is visible when it really shouldn't be. It would be beneficial if we could forward declare enums.
Let's analyze the code example in Figure 4. The program consists of three files: interface.h, implementation.C, and usage.C.
There are two versions of the implementation and interface:
- Version 1 is how the code looks with C++03.
- Version 2 is what we would like it to look like, and it is similar to what it looks like with C++0x.
In Version 1, the definition of enum E is part of the interface directly (or indirectly, if we wanted to use #includes), and there's no escaping that. This means that if the definition of enum E changes, usage.C has to be recompiled, but it really shouldn't, because it doesn't depend on the definition of enum E. Now, if we were able to write code similar to Version 2, the definition of E would be independent of interface.h, and thus of usage.C. In addition to decoupling implementation.C, we also have hidden the definition of enum E from the user, which is an ability that library developers want very much.
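A Version 2 style layout might look like the following sketch, which collapses the three files into one listing separated by comments. The function names and the enumerators are hypothetical; only E and the file names come from the text.

```cpp
// --- interface.h (hypothetical contents) ---
// Opaque declaration: the underlying type is fixed, no enumerators are exposed.
enum class E : int;
E    makeDefault();
bool isDefault(E e);

// --- usage.C: compiles against the declarations above only ---
bool roundTrip() { return isDefault(makeDefault()); }

// --- implementation.C: the only place that sees the enumerators ---
enum class E : int { first, second };
E    makeDefault()   { return E::first; }
bool isDefault(E e)  { return e == E::first; }
```

In this arrangement, adding or renaming an enumerator touches only the implementation part, so the usage part would not need to be recompiled.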
Figure 4. Decoupling interface from implementation by using forward declaration
Features of scoped enums
Scoped enums address most of the limitations of regular enums: they provide complete type safety, a well-defined underlying type, their own scope, and forward declaration. The syntax for scoped enums is similar to that of regular enums (which are now called unscoped enums), but you need to specify the class or struct keyword after the enum keyword. You can also specify the underlying type using a colon followed by an integral type.
Enums are not integral types, so you cannot specify another enum as the underlying type of another enum.
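The four declaration forms can be sketched as follows. All names here are placeholders chosen for illustration.

```cpp
enum Unscoped { U1, U2 };                        // regular (unscoped) enum
enum UnscopedFixed : unsigned char { F1, F2 };   // unscoped, explicit underlying type
enum class Scoped { S1, S2 };                    // scoped, 'class' keyword
enum struct ScopedFixed : long { T1, T2 };       // scoped, 'struct' keyword, explicit type

// enum class Bad : Unscoped { B1 };  // error: the underlying type must be an integral type

// Scoped enumerators need qualification and an explicit cast to reach an integer.
long second() { return static_cast<long>(ScopedFixed::T2); }
```

The class and struct keywords are interchangeable in this position; both declare a scoped enum.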
Figure 5 shows enum examples.
Figure 5. Scoped and unscoped enums
You get type safety by disallowing all implicit conversions of scoped enums to other types, thereby restricting any kind of non-arithmetic operation (assignment, comparison) on enums to just the set of enumerators and enum variables of the same enumeration, and disallowing any arithmetic operation on them. You can still use scoped enums in places such as switch statements, but you would be limited to maintaining a uniform enum type in the switch condition and case labels. See the code example in Figure 6 for a better understanding of this.
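The conversion rules can be illustrated with a small sketch. Color and Day are hypothetical enums (echoing the colors and days of the week mentioned earlier), and the commented-out lines are the ones a C++0x compiler is expected to reject.

```cpp
#include <string>

enum class Color { Red, Green, Blue };
enum class Day { Monday, Tuesday };

std::string name(Color c) {
    switch (c) {                          // condition and labels share one enum type
        case Color::Red:   return "red";
        case Color::Green: return "green";
        case Color::Blue:  return "blue";
        // case Day::Monday: return "?";  // error: Day does not convert to Color
        // case 0:           return "?";  // error: int does not convert to Color
    }
    return "unknown";
}

// int n = Color::Red;                    // error: no implicit conversion to int
// bool b = (Color::Red == Day::Monday);  // error: different enumerations
// Color c = 1;                           // error: no implicit conversion from int
// Color d = Color::Red + 1;              // error: no arithmetic on scoped enums

// The integral value is still reachable, but only on request.
int toInt(Color c) { return static_cast<int>(c); }
```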
Another feature of scoped enums is the introduction of a new scope, called the enumeration scope, which starts and ends with the opening and closing braces of the enum body. Scoped enumerators are therefore no longer injected into the enclosing scope, which saves us from the name conflicts of regular enums. The slight inconvenience is that you always have to refer to a scoped enumerator by its enumeration-qualified name. For example, an enumerator X of a scoped enum E must be written E::X.
Figure 6 demonstrates the scoping rules.
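A minimal sketch of the scoping rule, assuming scoped versions of the two enums used earlier (the enumerator spellings are hypothetical):

```cpp
// Both enums use the same enumerator names without conflict,
// because each name lives in its own enumeration scope.
enum class windowCorner { TopLeft, TopRight, BottomLeft, BottomRight };
enum class direction    { TopLeft, TopRight, BottomLeft, BottomRight };

// windowCorner c = TopLeft;  // error: TopLeft is not visible in the enclosing scope

bool isTop(direction d) {
    return d == direction::TopLeft || d == direction::TopRight;
}
```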
Next is the underlying type. Scoped enums give you the ability to specify the underlying type of the enumeration, and it defaults to int if you choose not to specify it. An unscoped enum with an omitted underlying type simply behaves like a regular C++03 enum, with an implementation-defined underlying type. When the underlying type is specified explicitly, or implicitly (for scoped enums only), it is called fixed; otherwise, it is not fixed. This means that an enum declared with the regular C++03 syntax does not have a fixed underlying type (see Figure 5).
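These rules can be checked at compile time with std::underlying_type from the standard <type_traits> header. The enum names below are placeholders, and the last assertion holds only because both enumerator values of Old fit in an int.

```cpp
#include <cstdint>
#include <type_traits>

enum class Defaulted { a, b };             // fixed implicitly: int
enum class Small : std::uint8_t { a, b };  // fixed explicitly
enum OldFixed : short { x, y };            // unscoped, but fixed explicitly
enum Old { p, q };                         // not fixed: the compiler picks the type

static_assert(std::is_same<std::underlying_type<Defaulted>::type, int>::value,
              "a scoped enum defaults to int");
static_assert(std::is_same<std::underlying_type<Small>::type, std::uint8_t>::value,
              "an explicit underlying type is honored");
static_assert(std::is_same<std::underlying_type<OldFixed>::type, short>::value,
              "unscoped enums can be fixed too");
static_assert(sizeof(Old) <= sizeof(int),
              "not fixed, but no larger than int when the values fit");
```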
Figure 6. Conversion and scoping of scoped enums
Finally, you can address the last issue, which becomes available as soon as the underlying type issue is resolved: forward declaration of enums. Basically, any enum with a fixed underlying type can be forward declared. As mentioned above, forward declaration has lots of benefits, such as decoupling code and hiding implementation of an enum when it's not part of a user's problem space. To forward declare an enum, you just declare it without the body section so that the underlying type is fixed (Figure 7 illustrates the rules). You can re-declare multiple times, but all declarations should be consistent with prior declarations, so they should have the same underlying types and be of the same kind (scoped or unscoped).
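The rules described above can be sketched like this. A, B, C, and D are placeholder names, and the commented-out lines are the declarations expected to be rejected.

```cpp
enum class A;          // OK: scoped, underlying type is implicitly int
enum class B : long;   // OK: scoped, explicit underlying type
enum C : short;        // OK: unscoped with an explicit underlying type
// enum D;             // error: unscoped enum without a fixed underlying type

enum class A;          // OK: consistent redeclaration
enum class A : int;    // OK: same underlying type spelled out
// enum class A : long;   // error: different underlying type
// enum A : int;          // error: scoped enum redeclared as unscoped

// A forward-declared enum is a complete type, so it can be passed by value
// before its enumerators are known.
A passThrough(A a) { return a; }

// The definitions may come later, or in another file.
enum class A { one, two };
enum class B : long { big = 1000000 };
enum C : short { c0, c1 };
```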
Figure 7. Forward declaration rules for unscoped enums
Good usage patterns
Enums are usually used to represent states, types, kinds, conditions, and anything that is a set of members with no particular functionality other than to be a unique collection of elements. Regular enums offer a bit of type safety, specifically during assignment, but it all goes bad when you try to compare or use an enum in an integral context. There are many patterns for enum usage, so this article discusses some that exist with regular enums and some new ones that can be used only with scoped enums.
Class inheritance, the kind enum
Suppose that you have a parent class called Widget and a bunch of child classes, such as Button, Label, and so on. A common way to identify an object that is being pointed to by a Widget* pointer is to have an enumeration (call it enum kind) in the parent class, with one enumerator for each type of child. Then you add an enum kind member variable in the parent so that every child sets that member to its designated enumerator (see Figure 8).
Figure 8. Using enums to identify objects of derived types
Then all you have to do is look at that type member and identify the real type of the object, based on the enum value. This is all fine until you have a similar set of classes that inherit from each other, and they use the same enum mechanism to hold the type of the child, yet you're using both enum types in the same code. Other than assignment, all other operations that involve implicit conversions are not type-safe. With scoped enums, that would be different, because you will be forced to compare and assign from the same set of enumerators, and you are not allowed to use enums in an integer context without explicit conversions. If you try to do otherwise, such as comparing two enums of different enumerations or you use enums where an implicit conversion to another type is needed, you will get compile-time errors. With regular enums, those logical mistakes will pass silently.
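A sketch of the pattern with a scoped kind enum follows. Widget, Button, and Label come from the text; the member names, the second hierarchy Shape, and the helper isButton are assumptions made for illustration.

```cpp
class Widget {
public:
    enum class Kind { Button, Label };   // one enumerator per child type
    Kind kind() const { return kind_; }
protected:
    explicit Widget(Kind k) : kind_(k) {}
private:
    Kind kind_;
};

// Each child sets the member to its designated enumerator.
struct Button : Widget { Button() : Widget(Kind::Button) {} };
struct Label  : Widget { Label()  : Widget(Kind::Label)  {} };

// A second, unrelated hierarchy that uses the same mechanism.
class Shape {
public:
    enum class Kind { Circle, Square };
    explicit Shape(Kind k) : kind_(k) {}
    Kind kind() const { return kind_; }
private:
    Kind kind_;
};

bool isButton(const Widget* w) { return w->kind() == Widget::Kind::Button; }

// With scoped enums, mixing the two hierarchies is a compile-time error:
// bool mixed(const Widget* w) { return w->kind() == Shape::Kind::Circle; }
```

With unscoped enums, a function like the commented-out one would compile and compare raw integer values.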
Type safe bool
The Boolean (bool) type in C++ is not type-safe, because it can be converted to and from any other integral type. (Actually, it's not type-safe because it is an integral type.) Sometimes, you need a type-safe bool to represent, for example, critical conditions that allow only explicit manipulation, so that they cannot be initialized from, assigned, or compared to a value of a different type. You can always achieve that with a class. For instance, in "A Typesafe Boolean Class" the author proposes a bool type whose conversions he can control, meaning what can be converted from and to the bool type. That approach is flexible but cumbersome to maintain, at least for beginners.
With scoped enum, you can create a type safe Boolean condition type the way that Figure 9 shows.
This use case was first suggested by Chris Bowler, XL C++ front-end compiler developer, IBM Canada.
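One way such a condition type might be written is sketched below. Using bool as the underlying type is an assumption of this sketch rather than a requirement, and Visible, Enabled, and canClick are hypothetical names.

```cpp
// Each condition gets its own type. The enumerators can share the names
// False and True because each pair lives in its own enumeration scope.
enum class Visible : bool { False, True };
enum class Enabled : bool { False, True };

bool canClick(Visible v, Enabled e) {
    // if (v) { ... }  // error: no implicit conversion to bool, compare explicitly
    return v == Visible::True && e == Enabled::True;
}

// canClick(true, false);                     // error: bool does not convert to the enums
// canClick(Enabled::True, Visible::True);    // error: swapped arguments are caught
```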
Figure 9. Type-safe bool
Its use is much safer than that of the C++ built-in Boolean type. Observe the two examples in Figures 10 and 11. In Figure 10, we use three Boolean variables for the conditions. In Figure 11, we represent the three conditions by three scoped enums. The initiate() function does some reasoning, maybe altering the values of x, y, and z, and then passes them to handle() twice. The first call misplaces the last two arguments, and the second call omits the int argument and uses the default argument for the fourth parameter. Because bool can be converted to int, and vice versa, the example in Figure 10 passes silently: the compiler finds the necessary implicit conversions and integral promotions to change the function call arguments to the right types (y is converted to int and 3 is converted to bool). However, the example in Figure 11 results in a compile-time error, because a scoped enum cannot be converted to int, and an int cannot be used to initialize a scoped enum.
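The contrast can be sketched as follows. The parameter order, the condition names (Verbose, Force, DryRun), and the function bodies are assumptions made for this illustration and are not taken from the original figures.

```cpp
// bool version: both mistaken calls compile and silently corrupt 'count'.
int handleBool(bool verbose, bool force, int count, bool dryRun = false) {
    (void)verbose; (void)force; (void)dryRun;
    return count;   // report the count that actually arrived
}

int initiateBool() {
    bool verbose = true, force = false, dryRun = true;
    // Intended call: handleBool(verbose, force, 3, dryRun)
    int first  = handleBool(verbose, force, dryRun, 3); // dryRun -> int 1, 3 -> bool true
    int second = handleBool(verbose, force, dryRun);    // count forgotten: dryRun -> int 1
    return first + second;
}

// Scoped enum version: the same mistakes no longer compile.
enum class Verbose : bool { False, True };
enum class Force   : bool { False, True };
enum class DryRun  : bool { False, True };

int handle(Verbose v, Force f, int count, DryRun d = DryRun::False) {
    (void)v; (void)f; (void)d;
    return count;
}

int initiate() {
    Verbose v = Verbose::True;
    Force   f = Force::False;
    DryRun  d = DryRun::True;
    // handle(v, f, d, 3);  // error: DryRun -> int and int -> DryRun are not implicit
    // handle(v, f, d);     // error: DryRun -> int is not implicit
    return handle(v, f, 3, d);
}
```

In the bool version, each mistaken call receives a count of 1 instead of the intended 3, and nothing at compile time points to the problem.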
Figure 10. Bool type as condition type
Figure 11. Scoped enum as condition type
You could argue that you can improve the bool version by using unscoped enums (the old C++03 kind), but you would still fail to detect the implicit conversion to int. Using unscoped enums has another disadvantage, related to their scoping problem. Notice how the true and false enumerators of each condition are simply called True and False, even though all the enumerations are defined in the same scope. You cannot do that with unscoped enums, because their enumerators are injected into the enclosing scope, and you would have name conflicts if two injected enumerators had the same name.
Another thing worth mentioning is the clarity of the functions' interface. In Figure 11, you know what each parameter indicates without relying on expressive variable names. With the bool version, if the functions' declarations reside in a separate .h file, then (a) you might need to go to that file to understand what bool x and bool y stand for, and (b) the declarations might be missing variable names altogether. But with scoped enums, the description of each parameter is embedded in the enum name, which you are forced to mention in both the declaration and the definition. Of course, you can still fail to give a proper name to your scoped enum, but then you're just purposely hurting yourself. It's like having a program with class names such as A1, A2, and A3.
Type-safe state representation
States appear a lot in C++ programs. Any time that you encounter a set of entities that you need to represent or enumerate, you would use enums. A common programming pattern is passing contextual information through a hierarchy of function calls. Let's say you have a set of functions that work on some part of your problem space, and there's dynamic information that you want to maintain throughout the functions' execution. One way is to make a globally accessible object that acts as a database maintaining the dynamic info, but we want to avoid global variables because they add coupling to the code. Another way is to create a Context class that holds the dynamic information and pass a reference or pointer to a Context object in the function calls. Yet another way is to pass each piece of information explicitly in the function calls. This is better than creating a Context object if the information you're passing is small. Quite frequently, the dynamic context being passed is of Boolean or enumerated type. Here is a perfect opportunity to use our type-safe bool and type-safe enum.
Figure 12 demonstrates how scoped enums can be used to create a safer program that has more control over the execution process through a tight grasp (compile time detection) of the conditions that define the execution path. The contextual information in this scenario is the action trigger. For simplicity, we have two actions, but you can imagine that the size of the information can be bigger in real-life scenarios.
Similar to the previous examples, by using enums for the conditions, we're ensuring that both the declaration and the definition of the functions indicate what every parameter stands for. At each function call, the arguments clearly state what the value of each condition is. Misplacing arguments or using incompatible types is an error caught at compile time, whereas it would never be caught if bools were used. The context information can be safely processed and manipulated in a consistent fashion, so there are no unsafe comparisons and no silent, implicit conversions.
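A small sketch of context passing in this style follows. The enum names Trigger and Confirmed, the two actions, and the three-level call chain are all hypothetical, chosen only to illustrate the idea.

```cpp
#include <string>

enum class Trigger { Save, Discard };        // which action was triggered
enum class Confirmed : bool { False, True }; // type-safe condition

// Innermost function: acts on the context it was handed.
std::string perform(Trigger t, Confirmed c) {
    if (c == Confirmed::False) return "cancelled";
    switch (t) {
        case Trigger::Save:    return "saved";
        case Trigger::Discard: return "discarded";
    }
    return "unknown";
}

// Intermediate function: forwards the context unchanged.
std::string validate(Trigger t, Confirmed c) { return perform(t, c); }

// Entry point: each argument states its own meaning at the call site.
std::string onEvent(Trigger t) { return validate(t, Confirmed::True); }

// validate(Confirmed::True, Trigger::Save);  // error: swapped arguments
// validate(Trigger::Save, true);             // error: bool is not Confirmed
```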
Figure 12. Context passing
Finally, this code is more portable, and enums can be forward declared, because the underlying type of each enumeration is fixed. The power of scoped enums is clearly demonstrated by having clearer code, type-safe conditions, type-safe enums, and portable code.