OMG Interface Definition Language
Defining the capabilities of a distributed service

Dave Bartlett
Consultant, author, and lecturer
01 Sep 2000

It all starts with the Interface Definition Language (IDL). This was true when we were writing distributed systems using RPC; it is true when writing with COM; and it is also true for CORBA. In all these cases, the Interface Definition Language provides the mechanics for separating object interfaces from their implementations. IDL provides abstraction, creating something that is considered apart from its concrete existence.

It also gives us a common set of data types that we will use to define more complex data types. We will use all of these types to define the capabilities of our distributed service. Another wonderful aspect of IDL is that it abstracts away programming language dependence and hardware dependence. This article examines the OMG IDL built-in types and keywords.

IDL is a specification language. It allows us to separate the specification of the object (how you interact with it) from the implementation. This is a contract that says "Ms. Client, if you call this method, passing these parameters, then I, Mr. Server, will return to you this array of strings." Client programmers using this interface have no idea what implementation details lie behind the interface.

OMG IDL looks much like C. It is easy to fall into the trap of comparing the languages and their keywords. However, the similarities end at a superficial level. The purposes of each language are entirely different. As we move through the language you should keep in mind that the goal of OMG IDL is to define interfaces and to streamline the process of distributing objects.

IDL basic types
The OMG Interface Definition Language has several fundamental types that should look familiar. Here is a table of these built-in types:

Table 1. IDL basic types

Type | Range | Minimum size in bits
short | -2^15 to 2^15 - 1 | 16
unsigned short | 0 to 2^16 - 1 | 16
long | -2^31 to 2^31 - 1 | 32
unsigned long | 0 to 2^32 - 1 | 32
long long | -2^63 to 2^63 - 1 | 64
unsigned long long | 0 to 2^64 - 1 | 64
float | IEEE single-precision | 32
double | IEEE double-precision | 64
long double | IEEE double-extended floating point | exponent of 15 bits and signed fraction of 64 bits
char | ISO Latin-1 | 8
wchar | Encodes wide characters from any wide character set, such as Unicode | Implementation dependent
string | ISO Latin-1, except ASCII NUL | Variable
boolean | TRUE or FALSE | Unspecified
octet | 0 to 255 | 8
any | Self-describing data type that can represent any IDL type | Variable

Integer types
The OMG IDL is straightforward about integer types. It does not provide a type int, so it doesn't suffer from the ambiguities that have always arisen from the differing ranges of int on different platforms. Instead, IDL provides integer types in 2-byte (short), 4-byte (long) and 8-byte (long long) sizes.

All these integer types also have corresponding unsigned types. This poses a problem for Java programs because the Java programming language does not support unsigned types. This is not a peculiarity of the OMG IDL, but it creates a rather peculiar situation in the Java-to-IDL mapping, which we'll discuss in next month's column. But until then, you should think about how you would map an unsigned short in IDL to a Java type. Would you use a Java short or a Java int? What would be the advantages and disadvantages to each? These are the questions that the authors of the language mappings must wrestle with, and this is a good exercise to prepare you for the next column.

Floating-point types
The OMG IDL floating-point types float, double and long double follow the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic. Presently, long double is meant for values that need extended range and precision, and you may find that your particular language mapping does not yet support this type.

char and wchar
Just so we are all using the same terminology, a character set is a set of alphabetic or other characters used to construct the words and other elementary units of a native language or computer language. A coded character set (or codeset) is a set of unambiguous rules that establishes a character set and the one-to-one relationship between each character of the set and its bit representation.

When dealing with char you must remember that OMG IDL must deal with character sets on two levels. The first level stipulates exactly what character set an IDL definition is written in. The lexical conventions (character tokens that represent the keywords, comments, and literals of an IDL file) stipulate that the ISO 8859-1 character set represents the characters in an IDL file. Yes, even IDL has to have a standard character set that it is built on. ISO 646 defines null and other graphic characters.

Next, the OMG must deal with the transmission of characters from one computer system to another. This means that there may be a translation involved from one character codeset to another, depending upon the language bindings. In last month's column we did an IORDump on our Orbacus Object Reference, and in there we found the following information:

Native char codeset:
  "ISO 8859-1:1987; Latin Alphabet No. 1"
Char conversion codesets:
  "X/Open UTF-8; UCS Transformation Format 8 (UTF-8)"
  "ISO 646:1991 IRV (International Reference Version)"
Native wchar codeset:
  "ISO/IEC 10646-1:1993; UTF-16,
   UCS Transformation Format 16-bit form"
Wchar conversion codesets:
  "ISO/IEC 10646-1:1993; UCS-2, Level 1"
  "ISO 8859-1:1987; Latin Alphabet No. 1"
  "X/Open UTF-8; UCS Transformation Format 8 (UTF-8)"
  "ISO 646:1991 IRV (International Reference Version)"

As you can see, an IOR can contain codeset information to negotiate the preferred and available codesets through conversion.

With all that out of the way, you should understand that an OMG IDL char is an 8-bit quantity that can represent a character in two ways. First, it can encode a single byte character from any byte-oriented code set, and second, when used in an array, it can encode any multi-byte character from a multi-byte character set such as Unicode.

wchar simply allows for codesets that are wider than 8 bits. The specification does not mandate support for a particular codeset. It allows each client and server to use the codeset native to the local machine, and then specifies how characters and strings are converted for transmission between environments using different codesets.

Boolean type
There's not much to say here; a Boolean can have only the values TRUE or FALSE.

octet type
An octet is an 8-bit type. This turns out to be an important type because the octet is guaranteed not to have any representation changes as it is transmitted between address spaces. This means you can transmit binary data and know that it will arrive in the same form as when it was packaged up. Every other IDL type may have its representation altered during transmission. For example, an array of char may undergo codeset conversions as directed by the IOR codeset information. An array of octet would not.
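
As a rough sketch of where octet earns its keep, the IDL below passes a binary payload alongside a text field. The names Blob, ImageStore, and put are hypothetical and belong to no OMG service; the sequence type it relies on is covered later in this article.

```idl
// Hypothetical example: all names are illustrative only.
typedef sequence<octet> Blob;   // raw bytes, never codeset-converted

interface ImageStore {
  // 'caption' may undergo codeset conversion in transit;
  // 'data' arrives byte-for-byte as it was sent.
  void put(in string caption, in Blob data);
};
```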

any Type
An IDL any is a structure that can contain any data type. The type could be a char or long long or string or another any or a type that you have created such as Address. The any container is made of a type code and a value. The type code describes what is in the value part of the any.

If you have C++ experience you can think of an any as a self-describing data type that is similar to, but safer than, a void *. If you have Visual Basic experience, you can think of an any as being similar to a variant. The mechanics of the any type and how all this magic is pulled off for a user-defined type will become evident when we look at the IDL-to-Java mapping.
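
To make the idea concrete, here is a minimal sketch of an interface that accepts and returns an any. The interface PropertyBag and its operations are assumptions made up for illustration, not part of the CORBA specification.

```idl
// Hypothetical example: all names are illustrative only.
interface PropertyBag {
  // 'val' can carry a long, a string, a struct, and so on;
  // the receiver inspects the any's type code to find out which.
  void set_value(in string name, in any val);
  any get_value(in string name);
};
```

One design consequence is that the signature no longer tells the reader what is actually being passed, so an any is best reserved for cases where the type truly cannot be known when the interface is written.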

User-defined types
The fundamental types are essential; they provide the building blocks for our interface definitions. OMG IDL provides you with the capability to define types of your own that will help hide some of the complexity and allow you to compose more sophisticated data types from the fundamental data types. These complex types can be enumerations, structures, and unions, or you can use typedef to create a new name for a type.

Named types
You should use typedef to create a new name for a type that will help clarify the interface or save typing.

For example, you may want to pass the barometric pressure in a method PresentWeather(..., in float Pressure, ...). It will make your method more readable if you typedef a float to be used within this method.

typedef float AtmosPressure;

In C++ the typedef keyword suggests type definition when in fact alias would probably be a more accurate term. This may or may not be true with OMG IDL depending upon the language mapping for your implementation language. The CORBA specification provides no guarantee that two typedefs of a short will be compatible and interchangeable.

Stylistically, you should be careful not to create aliases for existing types. Instead, you should try to create conceptually different types that will add readability and extendibility to your IDL. It would be best to define a logical type exactly once and use that definition throughout your interface.
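
A minimal sketch of that advice follows, assuming a hypothetical WeatherStation interface; only AtmosPressure and the operation name PresentWeather come from the text above, and the remaining names and parameters are invented for illustration.

```idl
// Hypothetical sketch: WeatherStation, Temperature, and the
// parameter lists are assumptions, not taken from a real service.
typedef float AtmosPressure;    // the logical type, defined once
typedef float Temperature;

interface WeatherStation {
  // The signature now says what each float means.
  void PresentWeather(in Temperature temp, in AtmosPressure pressure);
  AtmosPressure lowestPressureToday();
};
```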

Enumerations
An OMG IDL enumeration is a way of attaching names to numbers, thereby giving more meaning to anyone reading the code. The OMG IDL version of enumeration looks like the C++ version.

enum CloudCover{cloudy, sunny};

CloudCover is now a new type that can be used within your IDL. The OMG IDL guarantees that enumerations are mapped to a type with at least 32 bits because you are allowed up to 2^32 identifiers within an enumeration. The specification does not dictate the ordinal value an identifier will have, but it does dictate that the order will be preserved. Therefore, you cannot assume that cloudy will always have the ordinal value of 0; it could be given a value of 1 in some language mappings. But you can be sure that cloudy will compare as less than sunny.

If you think about IDL's goal as being to define interfaces that will span heterogeneous systems, then it makes sense that ordinal values are not specified. You only want to send the value to the server. This would be "cloudy". In the server space cloudy could be represented by 0 or 1 or whatever the implementation language dictates. Some implementation languages will not allow you to control the ordinal values, as C++ will. OMG IDL will not permit empty enumerations.

Structures
The struct keyword provides a way to collect a group of variables together into a structure. Once created, a struct represents a new type that can then be used throughout your interface definition.

struct Date {
  short month;
  short day;
  long year;
};

When defining structs, you want to make sure that you are creating types that are readable. You do not want to create several different structures with the same name in different name spaces as this will only confuse the users of your IDL.

Discriminated unions
The OMG CORBA specification describes the IDL union as a cross between a C union and a switch statement. The IDL discriminated union must have a typed tag field to determine which union member will be used in the current instance. As in C++, only one member of the union is active at a time, and that member can be determined from the discriminator.

enum PressureScale{customary,metric};
union BarometricPressure switch (PressureScale) {
  case customary :
    float Inches;
  case metric :
    short CCs;
};

In the example above, the short CCs is active if the discriminator is metric. If the discriminator is customary, then the float member, Inches, is active. A union may also carry a default label, which selects a member for any discriminator value that no case label names. The union member can be any type, including user-defined complex types. The discriminator type must be an integral type (short, long, long long, and so on, as well as char, boolean, or enumeration).
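
As a sketch of how a default label and shared case labels look in practice, consider the union below. It uses a long discriminator so that there are values no case label names; Reading and its members are hypothetical names chosen for illustration.

```idl
// Hypothetical example: all names are illustrative only.
union Reading switch (long) {
  case 1:
    float celsius;
  case 2:
  case 3:
    long rawCounts;      // cases 2 and 3 share one member
  default:
    string note;         // active for any other discriminator value
};
```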

Constant definitions
The syntax and semantics for defining constants in IDL are identical to C++. Your constants can be integer, character, floating-point, string, Boolean, octet or enumerated but not of type any or a user-defined type. Here are some examples:

const float MeanDensityEarth = 5.522;      // g/cm^3
const float knot = 1.1508;                 // miles per hour
const char NUL = '\0';

Integer constants can be defined in decimal, hex or octal notation:

const long ARRAY_MAX_SIZE = 10000;         // decimal
const long HEX_NUM = 0xff;                 // hex
const long OCTAL_NUM = 0377;               // octal, same value as 0xff

Floating-point literals use the usual C++ conventions for exponent and fraction:

const double SPEED_OF_LIGHT = 2.997925E8;
const double AVOGADRO = 6.0222E26;

Character and string constants support the standard escape sequences:

const char TAB = '\t';
const char NEWLINE = '\n';

You can use arithmetic operators within your constant declarations as long as there are no mixed type expressions.
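
The following sketch shows constant expressions that stay within one type family, with one mixed-type expression commented out. All of the constant names are invented for illustration.

```idl
// Hypothetical constants: names are illustrative only.
const long  SECONDS_PER_MINUTE = 60;
const long  SECONDS_PER_HOUR   = 60 * SECONDS_PER_MINUTE;  // integer * integer
const long  LOW_BYTE_MASK      = (1 << 8) - 1;             // shift and subtract, integers only
const float HALF_KNOT          = 1.1508 / 2.0;             // floating point only

// Mixed integer and floating-point operands are not allowed:
// const float BAD = 1.1508 * 2;
```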

User exceptions
IDL allows you to create exceptions to indicate your error conditions. An IDL user exception is similar to a structure in that the exception can contain any amount of error information of the types you choose. An exception is ultimately used by being raised from an operation. Here is an example:

exception DIVIDE_BY_ZERO {
  string err;
};

interface someIface {
  long div(in long x, in long y) raises(DIVIDE_BY_ZERO);
};

An exception creates a namespace. Therefore, member names must be unique only within the exception. Exceptions cannot be used as data members of user-defined types. There is no exception inheritance in OMG IDL.
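
To illustrate these scoping points, here is a small sketch in which two exceptions reuse the same member name and one operation raises both. The Teller interface and both exceptions are hypothetical.

```idl
// Hypothetical example: all names are illustrative only.
exception InsufficientFunds {
  string account;
  float  shortfall;
};

exception AccountClosed {
  string account;     // same member name as above, but no clash
};

interface Teller {
  // An operation may list more than one exception.
  void withdraw(in string account, in float amount)
    raises(InsufficientFunds, AccountClosed);
};
```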

Arrays, sequences, and string
Passing one element at a time is fine, but we often have lists or vectors or matrices of information that we want to pass back and forth between our client and server. Arrays are a common type in almost any programming language, but the implementation of the array from one language to the next is usually different. The challenge for the OMG IDL developers was to create a set of array types that would easily map to the language implementations. These requirements gave rise to the IDL array and sequence. The string type is a special sequence that will allow languages to leverage much of their string libraries and optimizations.

Arrays
OMG IDL has multidimensional, fixed-size arrays of arbitrary element type. All arrays must be bounded. Arrays are perfect to use with lists that have a fixed number of elements that are always present. For example:

// fixed-size array examples
typedef long shares[1000];
typedef string spreadsheet[100][100];
struct ofArrays {
  long anArray[1000];
};
// unbounded arrays NOT ALLOWED
// typedef long orders[];

The array dimensions must be specified and they must be positive constant integer expressions. IDL does not support open arrays, as in C and C++, because there is no pointer support. The typedef keyword must be present, unless you are specifying the arrays as part of a structure.

In many instances, what the CORBA specification does not say is as important as what it does say. The specification does not specify array indexing in any way, shape, or form. This means that array indexes from one language implementation to another may be different and so you cannot send an array index from a client to a server and expect the server to point to the proper array element. Some languages start array indexes at 0 and others at 1.

Sequences
You will use sequences a great deal in developing your interface definitions. Sequences provide flexibility when dealing with arrays of data that may have many values that are identical.

A sequence is a variable length vector that has two characteristics: a maximum size, which is determined at compile time and which could be unlimited; and a length, which is determined at run time. Sequences may contain elements of all types whether fundamental types or user-defined.

Sequences may be bounded or unbounded. For example:

// bounded and unbounded sequence examples
typedef sequence<long> Unbounded;
typedef sequence<long, 31> Bounded;

An unbounded sequence can hold any number of elements, constrained only by the limits of your platform memory. A bounded sequence is constrained by the bound. Both kinds of sequence may be empty, may hold elements of user-defined types, and may hold other sequences.
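
A short sketch of sequences over a user-defined type, including a sequence of sequences, might look like this. Trade and the typedef names are hypothetical.

```idl
// Hypothetical example: all names are illustrative only.
struct Trade {
  string symbol;
  long   quantity;
};

typedef sequence<Trade>      TradeList;     // unbounded, user-defined elements
typedef sequence<Trade, 100> TradeBatch;    // holds at most 100 elements
typedef sequence<TradeList>  TradeHistory;  // a sequence of sequences
```

Naming the inner sequence with a typedef before nesting it keeps the definition readable and gives the language mapping a named type to work with.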

string and wstring
A string is equivalent to a sequence of char, while a wstring represents a sequence of wchar. As a concession to C and C++, the OMG IDL string and wstring may contain any character except null. The char and wchar conventions determine the size of the elements: a string is made of 8-bit quantities, and a wstring of quantities that are 16 bits or wider.

Strings are special in IDL, but then they are special in most languages. Many languages have libraries and special optimizations to handle string manipulation. By putting string in its own type, the OMG allows the language mapping to use special optimizations that would not be handled with a general-purpose sequence.

Names and scoping
OMG IDL identifiers are case insensitive when it comes to collisions. This means that two identifiers differing only in the case of their characters will be considered to be redefinitions of one another. You should also be aware that, in deference to case-sensitive languages, all references to a definition must use the same case as the defining occurrence.

The IDL scoping rules are straightforward and easy to grasp. The contents of an entire OMG IDL file, along with any files brought in through preprocessor directives, form a naming scope. Any definitions that do not appear inside a scope are part of the global scope and there can be only a single global scope. Under the global scope, the following definitions form scopes: module, interface, struct, union, operation, and exception.

The module keyword is used to create namespaces; that is its sole purpose. The modules you define will create a logical group and liberal use of modules will prevent pollution of the global namespace. The root or global space is considered empty and each time the module keyword is encountered during a file scan the string "::" and the identifiers are appended to the name of the current root. This enables types from other modules to be referenced by including their name scope, as in Pennsylvania::river which you will see in the example below.

An identifier may be defined once in a scope, but can be redefined in nested scopes. The following example will explain some of these points:

module States {
  // error: redefinition
  // typedef sequence<string> states;
  module Pennsylvania {
    typedef string river;
    interface Capital {
      void visitGovernor();
    };
  };
  module NewYork {
    interface Capital {
      void visitGovernor();
    };
    interface Pennsylvania {
      void visit();
    };
  };
  module NewJersey {
    typedef Pennsylvania::river NJRiver;
    // Error
    // typedef string Pennsylvania;
    interface Capital {
      void visitGovernor();
    };
  };
};

Each one of our inner modules (Pennsylvania, NewYork, and NewJersey) has an interface Capital with an operation visitGovernor(). But they don't step on each other because each is inside its own module. We have problems with a redefinition of 'States' when we try to create a sequence named 'states' under the module States, because the two names differ only in case. The redefinition of Pennsylvania occurs after it has already been introduced as a scope resolution identifier for 'NJRiver' in the NewJersey module. Notice that we don't get an error in the NewYork module with interface Pennsylvania because the outer Pennsylvania module has not been introduced via some scope resolution identifier.

Interfaces
Now it is time to define interfaces, which is why we are looking at the OMG Interface Definition Language in the first place. A good way to think about an IDL interface is that it specifies a software contract between a service implementation and the clients that will use it. Let's start with an interface that exercises what we have learned about IDL. Since this is all about communication, let's look at some IDL that defines a Listener and a Speaker. The Listener must connect to the Speaker, then the Speaker will pass messages on to the Listener. This is a callback example.

// Thrown by server when the client passes
// an invalid connection id to the server
exception InvalidConnectionIdException
{
  long invalidId;
};

// This is the callback interface that
// the client has to implement in order
// to listen to a talker.
interface Listener
{
  // Called by the server to dispatch messages on the client
  void listen(in string message);
  // Called by the server when the connection
  // with the client is successfully opened
  void engage(in string person);
  // Called by the server when the connection with the client is closed
  void disengage(in string person);
};

// interface on the server side
interface Speaker
{
  // Called by the client to open a new connection
  // Returned long is the connection ID
  long register(in Listener client, in string listenerName);
  // Makes the server broadcast the message to all clients
  void speak(in long connectionId, in string message)
    raises(InvalidConnectionIdException);
  // Called by the client to sever the communication
  void unregister(in long connectionId)
    raises(InvalidConnectionIdException);
};

With this definition we define two new CORBA interface types: Listener and Speaker. Each interface has several methods that will be used by the other end of the connection. The client will start the connection by getting an initial object reference to a server object that implements the Speaker interface. This object reference could be passed to it or it could be retrieved from a naming service. Most importantly, the client makes the initial contact with the Speaker. Next, the client, which is also a Listener because it implements the Listener interface, must register with the Speaker and pass in a reference to its Listener interface. This allows it to receive messages from the Speaker.

The important item to notice is that the Listener interface is used as a type in the register method. Interface names become types and can be passed as parameters. It looks as if the Listener object is being passed, but in reality it is an object reference. This is another example of the CORBA model providing location transparency.

It is worth noting that each object reference (IOR) points to only one interface. Each interface exposes the details of one or more distributed objects. I say "one or more" because there can be thousands of objects that implement the same interface in a distributed system. In our example here, there could be thousands of Listeners that the Speaker is sending messages to. To a certain degree, then, IDL interfaces correspond to class definitions, and CORBA objects correspond to class instances.

I have only scratched the surface of OMG IDL. What should be obvious is that OMG IDL provides a rich set of built-in types and keywords that can be used to create a fine-grained description for interacting with an object in a distributed system. Since the language looks so much like C, you should realize all the descriptive power that has made C so successful. All of the OMG service definitions are written in IDL, proving the power of the OMG IDL. All the vertical market standardization efforts (Financial, CORBAMed, and so on) are written in IDL, proving its flexibility.

Learning to use OMG IDL correctly and efficiently is a great place to begin your journey toward learning CORBA and writing excellent distributed systems. Everything starts with IDL, and if you can get it correct at the start of a project, your chance for success grows dramatically.


About the author
Dave Bartlett lives in Berwyn, Pennsylvania, consulting, writing and teaching. He is the author of Hands-On CORBA with Java, a 5-day course presented via public sessions or in-house to organizations. Presently, Dave is working to turn the course material into the book Thinking in CORBA with Java. Dave has Masters degrees in Engineering and Business from Penn State. If you have questions or are interested in a specific topic, you can contact Dave at
