An introduction to Service Data Objects for C++

Take a quick tour of the C++ API to SDO

In this article, you are introduced to the API you need to work with Service Data Objects from C++, and you get easy access to the main elements of the API for a rapid startup.


Ed Slattery, Software Engineer, IBM UK

Ed Slattery joined the Java Technology Centre as a software engineer in 1999. He initially worked on shiraz reusable VM technology, then on graphics (AWT/Swing), and is now in the incubator projects team, where he has written the core of the C++ implementation of SDO. This core has now been contributed to the Tuscany incubator project at the Apache Software Foundation. Contact Ed at

Pete Robbins, Software Engineer, IBM UK

Pete Robbins is a software engineer at IBM Hursley, UK. He joined IBM in 1981 and has worked in a variety of development and technical planning roles. Most recently, he has been working on the Tuscany open source SOA project, where he developed the XML serialization of SDO in C++. He then moved on to develop the initial SCA for C++ implementation that is part of the Apache incubator project. He is also involved in the SCA specification collaboration. Contact Pete at

21 March 2006



Read this article to get a quick tour of the basic application programming interface (API) for Service Data Objects in C++. Beginning with a brief overview of the concepts behind Service Data Objects (SDO), you move on to each major area of the API, which should give you enough background information so you can get started on your first application.

Overview of SDO

Service Data Objects provide a means of describing the structure of a graph of data elements and a means of loading a particular instance of the data based on that graph description from any data source. SDO also provides the ability to track changes to the graph of data as it is used by an application and to record those changes in a way that can travel with the data as it moves from machine to machine. SDO provides, therefore, the ability to load data from a database and pass the data to a temporarily connected client. Upon return of that data, SDO can then recognize any changes that occurred while the client was disconnected.

The key constituents of SDO are:

  • The metadata (or description of the data). This is a framework of types and properties that provides a set of rules to which the data must conform and a means of accessing a data element by describing its path from another data element. The metadata may be built dynamically or loaded from an XSD description.
  • An API to populate and manipulate graphs of data. The data may be loaded from a data source (using a Data Access Service) or created dynamically.
  • A change summary API to record all the changes made to a graph of data.
  • The ability to serialize, and thus transport, all three elements -- the metadata, the data, and the change summary.

The Data Access Service follows an API description such that third parties can write their own Data Access Services to any data source. The basic SDO implementation provides services to allow serialization and deserialization from XML.

The metadata is described in terms of types and properties, and a complete description is held by a data factory. This factory is used to create data objects, so validation against the metadata is performed by the factory before creation is allowed.

A Type is a way of describing a data object, similar to a Class in Java. Two distinct varieties of Type exist, DataType and DataObjectType. DataTypes are similar to Java primitives and represent elements such as an integer, a byte, a Boolean, and others. A basic set of DataTypes representing the normal primitives of most programming languages is provided by default with a data factory. Others can be defined by the user, by extending or restricting the basic types.

A Type is defined by its Uniform Resource Identifier (URI) and its name. For example, the predefined DataTypes all have a URI of commonj/sdo and the names are Byte, Boolean, Short, Integer, Long, and so forth.

DataTypes do not have properties.

DataObjectTypes, on the other hand, can have properties, which can be considered as either "containment" or "reference." Containment properties allow the data object to contain actual data values. Reference properties allow the data object to refer to values held in other locations.

DataObjectTypes represent groupings of primitives and other DataObjectTypes, and are the building blocks of data graphs. A DataObjectType can have named properties, and those properties may themselves be DataTypes or DataObjectTypes. To allow the graph to easily represent tabular data, each property may also be defined as many valued or single valued.

A Type may be a subtype of another Type, in which case it inherits all of that Type's properties. A Type can also be abstract, in which case the data factory will not allow you to create an instance of it. An abstract Type is akin to an abstract class (one with pure virtual functions) in C++ or an interface in Java.
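The subtype and abstract rules can be sketched with a toy registry. This is only an illustration of the rules described above: `ToyType`, `ToyFactory`, `allProperties`, and `canCreate` are hypothetical names invented for this sketch and are not part of the SDO library.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the SDO classes.
struct ToyType {
    std::string uri, name;
    bool isAbstract = false;
    const ToyType* base = nullptr;          // optional supertype
    std::vector<std::string> properties;    // properties declared on this type

    // A subtype sees the inherited properties first, then its own.
    std::vector<std::string> allProperties() const {
        std::vector<std::string> all;
        if (base) all = base->allProperties();
        all.insert(all.end(), properties.begin(), properties.end());
        return all;
    }
};

class ToyFactory {
public:
    // A type is identified by its URI and its name.
    ToyType& addType(const std::string& uri, const std::string& name, bool isAbstract) {
        ToyType& t = types_[std::make_pair(uri, name)];
        t.uri = uri;
        t.name = name;
        t.isAbstract = isAbstract;
        return t;
    }

    // Creation is refused for unknown types and for abstract types.
    bool canCreate(const std::string& uri, const std::string& name) const {
        auto it = types_.find(std::make_pair(uri, name));
        return it != types_.end() && !it->second.isAbstract;
    }

private:
    std::map<std::pair<std::string, std::string>, ToyType> types_;
};
```

With a PersonType registered as abstract and an EmployeeType whose `base` points at it, `canCreate` refuses PersonType, accepts EmployeeType, and `allProperties` on EmployeeType lists the inherited Name property before its own.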

The company example

The following example, a basic definition of a typical company, illustrates all the preceding points.

The company has a name, many departments, and a director. Each department has a name and many employees. Each employee has a name, a serial number, and a desk location. The director may have a private account number. The company has a union representative, who is one of the employees.

To illustrate subtypes, let's define DirectorType and EmployeeType as subtypes of PersonType. PersonType has a property called name. DirectorType has a property called account, and EmployeeType has properties called serial number and desk location. CompanyType also needs a reference property called UnionRep, pointing at one of the employees.

We can define the Types and Properties necessary to instantiate a company.

First, let's define CompanyType. CompanyType is a DataObjectType, since it has properties (such as name, departments).

Figure 1 shows the metadata we want to describe:

Figure 1. Types

You can load this metadata from an XSD schema or create it dynamically. Before looking at the XSD, let's use a DataFactory in the C++ API to define the metadata dynamically, as shown in Listing 1.

Listing 1. Using a DataFactory in C++ API
DataFactoryPtr df = DataFactory::getDataFactory();

// df is an empty data factory, containing only definitions of the primitives. We will 
// now define the types.
// Descriptions of 'open' and 'sequenced' appear later in the document - don't worry
// about them for now.

bool isSequenced = false;
bool isOpen = false;
bool isAbstract = false;
bool isDataType = false;

df->addType("mynamespace","CompanyType", isSequenced, isOpen, isAbstract, isDataType);
df->addType("mynamespace","DepartmentType", isSequenced, isOpen, isAbstract, isDataType);
df->addType("mynamespace","DirectorType", isSequenced, isOpen, isAbstract, isDataType);
df->addType("mynamespace","EmployeeType", isSequenced, isOpen, isAbstract, isDataType);

isAbstract = true; // we will not let a person be created, only employees or directors
df->addType("mynamespace","PersonType", isSequenced, isOpen, isAbstract, isDataType);

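// now we need to tell the data factory that employees and directors "are" persons.
df->setBaseType("mynamespace","EmployeeType","mynamespace","PersonType");
df->setBaseType("mynamespace","DirectorType","mynamespace","PersonType");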

// now we need to add properties

bool isMany = false;
bool isReadOnly = false;
bool isContainment = false;

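// add all the single valued strings...
// (the property name "Account" is assumed from the description of DirectorType above)

df->addPropertyToType("mynamespace","CompanyType","Name","commonj/sdo","String",
    isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","DepartmentType","Name","commonj/sdo","String",
    isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","PersonType","Name","commonj/sdo","String",
    isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","DirectorType","Account","commonj/sdo","String",
    isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","EmployeeType","Desk Location","commonj/sdo","String",
    isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","EmployeeType","Serial Number","commonj/sdo","String",
    isMany, isReadOnly, isContainment);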

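// since unionrep is a reference, isContainment remains false...
df->addPropertyToType("mynamespace","CompanyType","UnionRep","mynamespace",
    "EmployeeType", isMany, isReadOnly, isContainment);

// the director is contained, unlike the union rep ...
isContainment = true;
df->addPropertyToType("mynamespace","CompanyType","Director","mynamespace",
    "DirectorType", isMany, isReadOnly, isContainment);

// departments and employees are many valued, and contained...
isMany = true;
df->addPropertyToType("mynamespace","CompanyType","Departments","mynamespace",
    "DepartmentType", isMany, isReadOnly, isContainment);
df->addPropertyToType("mynamespace","DepartmentType","Employees","mynamespace",
    "EmployeeType", isMany, isReadOnly, isContainment);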

Object creation

The metadata is now defined, so we can move on to creating or loading actual data instances, such as the real company, which, for this example, is named ACME.

The data factory allows us to create data objects, and each data object also has an API to create related data objects. In principle, you could use the factory to create just the company, and then use the company data object to create the departments and employees. The code shown in Listing 2 populates our company, ACME.

Listing 2. Populating the company
// create an object of type CompanyType
DataObjectPtr acme = df->create("mynamespace","CompanyType");

// set its Name property
acme->setCString("Name","ACME");

// Use the company to create departments. The property "Departments" tells the
// factory which type of object to create.

DataObjectPtr sales = acme->createDataObject("Departments");
sales->setCString("Name","Sales");

DataObjectPtr marketting = acme->createDataObject("Departments");
marketting->setCString("Name","Marketting");

// Now we can use the departments to create employees...

DataObjectPtr sales_emp1 = sales->createDataObject("Employees");
sales_emp1->setCString("Name","Helena B Carter");

DataObjectPtr sales_emp2 = sales->createDataObject("Employees");
sales_emp2->setCString("Name","Anthony W Thompson");

DataObjectPtr marketting_emp1 = marketting->createDataObject("Employees");
marketting_emp1->setCString("Name","Justin Marples");

DataObjectPtr marketting_emp2 = marketting->createDataObject("Employees");
marketting_emp2->setCString("Name","Stephen McTavish");

You should probably consider the following:

  • Why did createDataObject allow me to create two employees, and how can I reference them now?
  • How can I access or address one data object from another one?
  • What is a DataObjectPtr and who takes care of deleting used DataObjects?

createDataObject uses the property name to distinguish how to behave. In the case of Employees, the property is defined as many-valued, so each call to createDataObject creates another one in the list. If you called createDataObject twice on a property called Director, the second object replaces the first.
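The append-or-replace behavior can be modeled in a few lines. The sketch below is a toy model written for this explanation, not the SDO implementation; `ToyObject` and its members are hypothetical, and `std::shared_ptr` merely stands in for DataObjectPtr.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy model for illustration; this is NOT the SDO DataObject class.
struct ToyObject {
    // property name -> is the property many valued?
    std::map<std::string, bool> isMany;
    // property name -> objects currently held by that property
    std::map<std::string, std::vector<std::shared_ptr<ToyObject>>> values;

    std::shared_ptr<ToyObject> createDataObject(const std::string& prop) {
        std::shared_ptr<ToyObject> child = std::make_shared<ToyObject>();
        std::vector<std::shared_ptr<ToyObject>>& slot = values[prop];
        if (!isMany[prop]) {
            slot.clear();          // single valued: the new object replaces the old one
        }
        slot.push_back(child);     // many valued: the new object joins the list
        return child;
    }
};
```

Calling `createDataObject("Departments")` twice on an object whose Departments property is many valued leaves two entries, while calling `createDataObject("Director")` twice on a single-valued property leaves only the second object.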

As the preceding code shows, you access properties by name: you tell the company object to work with its Departments property, which yields a department object, and then tell that object to work with its Employees property. This style of access extends to a subset of XML Path Language (XPath). That is, you can address properties of related objects directly from a starting object; for example, you can access the first department of the company and get its name, as shown in Listing 3.

Listing 3. Accessing properties of objects
cout << "Department 1=" << acme->getCString("Departments[1]/Name") << endl;

Or, as shown in Listing 4, you can access the second employee of the second department.

Listing 4. Accessing the second employee
cout << "Employee =" << acme->getCString("Departments[2]/Employees[2]/Name") << endl;

The path syntax separates data objects from properties with a slash (/) and references elements of lists with either bracket syntax ([1]) or dot syntax (.0). Listing 5 uses the dot syntax to find the same employee as Listing 4.

Listing 5. Syntax to separate data objects from properties
cout << "Employee =" << acme->getCString("Departments.1/Employees.1/Name") << endl;

Note: In the dot syntax the first element is element zero, and in the bracket syntax, the first element is element 1.
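To make the two index conventions concrete, here is a small sketch that converts one path step into a property name and a zero-based index. `parseStep` is a hypothetical helper written for this illustration; the SDO library performs its own path parsing, and this sketch assumes property names contain neither '[' nor '.'.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Splits one path step, such as "Employees[2]" or "Employees.1", into the
// property name and a zero-based list index. A step with no index, such as
// "Name", yields index 0. Hypothetical helper, not part of SDO.
std::pair<std::string, std::size_t> parseStep(const std::string& step) {
    const std::size_t open = step.find('[');
    if (open != std::string::npos) {
        if (step.back() != ']') {
            throw std::invalid_argument("missing closing bracket");
        }
        // Bracket syntax counts from 1.
        const std::size_t oneBased =
            std::stoul(step.substr(open + 1, step.size() - open - 2));
        if (oneBased == 0) {
            throw std::invalid_argument("bracket index starts at 1");
        }
        return std::make_pair(step.substr(0, open), oneBased - 1);
    }
    const std::size_t dot = step.rfind('.');
    if (dot != std::string::npos) {
        // Dot syntax counts from 0.
        const std::size_t zeroBased = std::stoul(step.substr(dot + 1));
        return std::make_pair(step.substr(0, dot), zeroBased);
    }
    return std::make_pair(step, static_cast<std::size_t>(0));
}
```

Under this sketch, "Employees[2]" and "Employees.1" both resolve to the Employees property at zero-based index 1, which is why Listings 4 and 5 find the same employee.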

You can access lists as a whole, as shown in Listing 6, to iterate over them by first getting the list itself.

Listing 6. Accessing lists
DataObjectList& emps = acme->getList("Departments[1]/Employees");
// Now we can walk the list or add or remove items.

Memory management

Up to this point, you've been working almost exclusively with DataObjectPtrs, and you've just seen a DataObjectList& (reference). To use both lists and DataObjectPtrs correctly, you need to understand memory management in SDO.

SDO handles all its memory allocation and de-allocation internally, and does not require user code to be involved. A DataObjectPtr is really a wrapper class holding a pointer to the DataObject. This wrapper class drops its connection to the DataObject when it goes out of scope, or gets nulled. DataObjectPtrs must not be deleted.

Internally, every DataObject in a graph is referenced by its parent and is kept alive until it is detached from the data graph, so you don't need to delete DataObjects. A DataObject is deleted for you once you have dropped your client reference to it and it is no longer connected to a data graph.
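The lifetime rule can be illustrated with standard reference counting. The sketch below uses `std::shared_ptr` as a stand-in for the DataObjectPtr wrapper; that is an assumption made for illustration, since the article does not say how the wrapper is implemented, and `Node` is a hypothetical type.

```cpp
#include <memory>
#include <vector>

// Toy graph node; NOT the SDO DataObject. std::shared_ptr plays the role
// that DataObjectPtr plays in the text (an assumption for illustration).
struct Node {
    static int live;                               // Nodes currently alive
    std::vector<std::shared_ptr<Node>> children;   // the parent references its children
    Node() { ++live; }
    ~Node() { --live; }
};

int Node::live = 0;
```

If a client creates a child, adds it to a parent's `children`, and then lets its own handle go out of scope, the child stays alive because the graph still refers to it. Once the child is also removed from `children`, nothing refers to it and it is destroyed without any explicit delete.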

DataObjectLists are returned as references. These references indicate that you are looking at the real list of values, not a copy, and it will be updated as you make changes to the property value.
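The difference between a live reference and a copy can be shown with plain standard containers. `ToyDepartment` below is a hypothetical stand-in, not an SDO class; only the reference-versus-copy behavior is the point.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a data object with a many-valued property.
struct ToyDepartment {
    std::vector<std::string> employees;

    // Returns the real list, not a copy, much as getList() is described above.
    std::vector<std::string>& getList() { return employees; }
};
```

Binding the result to a reference (`std::vector<std::string>& live = dept.getList();`) tracks later changes to the property, whereas assigning it to a plain `std::vector<std::string>` takes a snapshot that does not.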

So far, you have learned how to create metadata, and then how to populate and access a data graph corresponding to that metadata. All of this can also be done using the XSDHelper and XMLHelper classes. An XSDHelper loads metadata from an XSD document, and an XMLHelper lets you load data graphs from an XML file.

Listing 7 shows an XSD schema that describes the company metadata.

Listing 7. Using XSD schema

    <xsd:element name="company" type="company:CompanyType"/>
     <xsd:complexType name="CompanyType">
             <xsd:element name="Departments" type="company:DepartmentType" maxOccurs="unbounded"/>
            <xsd:attribute name="Name" type="xsd:string"/>
            <xsd:attribute name="Director" type="company:DirectorType"/>
            <xsd:attribute name="UnionRep" type="xsd:IDREF" sdoxml:propertyType="company:EmployeeType"/>
        <xsd:complexType name="PersonType">
         <xsd:attribute name="Name" type="xsd:string"/>
     <xsd:complexType name="DepartmentType">
         <xsd:restriction base="company:PersonType" />
             <xsd:element name="Employees" type="company:EmployeeType"     maxOccurs="unbounded"/>
     <xsd:complexType name="EmployeeType">
         <xsd:restriction base="company:PersonType" />
         <xsd:attribute name="Serial Number" type="xsd:ID"/>
         <xsd:attribute name="Desk Location" type="xsd:string"/>

The data entered as XML is shown in Listing 8.

Listing 8. Data entered as XML
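<?xml version="1.0" encoding="UTF-8" ?>
<company xmlns="mynamespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         Name="ACME">
    <Departments Name="Sales">
        <Employees>
            <Name>Helena B Carter</Name>
        </Employees>
        <Employees>
            <Name>Anthony W Thompson</Name>
        </Employees>
    </Departments>
    <Departments Name="Marketting">
        <Employees>
            <Name>Justin Marples</Name>
        </Employees>
        <Employees>
            <Name>Stephen McTavish</Name>
        </Employees>
    </Departments>
</company>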

The code shown in Listing 9 replaces all the code in Listings 1 and 2: it loads the metadata from the XSD and the data from the XML document instead of building them by hand.

Listing 9. Replacement code
DataFactoryPtr df = DataFactory::getDataFactory();

// load the metadata (the file name is illustrative)
XSDHelperPtr xsdh = HelperProvider::getXSDHelper(df);
xsdh->defineFile("company.xsd");

// load the data graph (the file name is illustrative)
XMLHelperPtr xmlh = HelperProvider::getXMLHelper(df);
XMLDocumentPtr doc = xmlh->loadFile("company.xml");
DataObjectPtr acme = doc->getRootDataObject();

The methods of XSDHelper and XMLHelper allow you to load and save XSD/XML information, so you can use them to save both metadata and data from a graph to the serialized form.

Listing 10 shows the XSDHelper saving the metadata to a new XSD file.

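Listing 10. XSDHelper saving the metadata to a new XSD file
// the output file name is illustrative
xsdh->generateFile(df->getTypes(), "newcompany.xsd", "mynamespace");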

Listing 11 shows the XMLHelper saving the document to a new file.

Listing 11. XMLHelper saving document to a new file
// the output file name is illustrative
xmlh->save(doc, "newcompany.xml");

Exception handling

The SDO API throws a number of exceptions. For example, accessing a data object by an XPath expression which is invalid will throw an SDOInvalidPathException. Trying to use a property that does not exist will throw an SDOPropertyNotFoundException. All these exceptions inherit from SDORuntimeException, so the minimal error handling required is shown in Listing 12.

Listing 12. Minimal error handling
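try {
    DataFactoryPtr df = DataFactory::getDataFactory();

    // load the metadata (the file name is illustrative)
    XSDHelperPtr xsdh = HelperProvider::getXSDHelper(df);
    xsdh->defineFile("company.xsd");

    // load the data graph (the file name is illustrative)
    XMLHelperPtr xmlh = HelperProvider::getXMLHelper(df);
    XMLDocumentPtr doc = xmlh->loadFile("company.xml");
    DataObjectPtr acme = doc->getRootDataObject();
}
catch (SDORuntimeException& e) {
    // every SDO exception derives from SDORuntimeException
    cout << "Exception caught: " << e << endl;
}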

That covers the basic SDO API. Next, we look at Sequences, Open Types, the Introspection API, and Change Summaries.


Sequences

A DataObjectType may be defined as sequenced. In addition to the normal API for setting properties, a sequenced DataObject has a second API that sets the same properties and also remembers the order in which you set the values. This lets you recover the order of settings, even across many-valued properties; you can tell, for example, that you set the first element of listA, followed by the first element of listB, and then the second element of listA.

Further, the sequence API gives you the ability to add free text settings between the property settings. One could set the first element of listA, then a text item saying "Hello," and then the second element of listA. Sequences are useful for recording series of events, where the order is important.
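The idea of an ordered record of settings, interleaved with free text, can be sketched as follows. `Setting` and `ToySequence` are hypothetical names for this illustration and do not reflect the actual SDO sequence API.

```cpp
#include <string>
#include <vector>

// One entry in a toy sequence: either a property setting or free text.
struct Setting {
    std::string property;   // empty for a free-text entry
    std::string value;
};

// Toy model for illustration; NOT the SDO sequence class.
class ToySequence {
public:
    // Record a property setting at the end of the sequence.
    void addSetting(const std::string& property, const std::string& value) {
        Setting s;
        s.property = property;
        s.value = value;
        entries_.push_back(s);
    }

    // Record free text between property settings.
    void addText(const std::string& text) {
        Setting s;
        s.value = text;
        entries_.push_back(s);
    }

    // Entries come back in exactly the order in which they were added.
    const std::vector<Setting>& entries() const { return entries_; }

private:
    std::vector<Setting> entries_;
};
```

Adding a listA value, a listB value, the text "Hello", and then a second listA value yields four entries in that order, so the interleaving across the two lists is preserved.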

Open Types

An Open Type is a special form of DataObjectType. It has properties, but if you try to set a property that it does not contain, it does not throw an exception as a normal data object would. Instead, it simply creates the property for you, on that particular instance of the data object only. For example, you might have an open type called OpenEmployee, which has a property Name. All of the calls shown in Listing 13 are then valid.

Listing 13. Open Type properties
DataObjectPtr emp = df->create("myspace","OpenEmployee");
emp->setCString("Name","Alphonse Dodet");
emp->setInteger("AlfsNumber", 256);

DataObjectPtr emp2 = df->create("myspace","OpenEmployee");
emp2->setCString("Name","Bill McCawber");
emp2->setBoolean("BillsBool", true);

When you query the properties of the type OpenEmployee, it tells you it has a property called Name. When you query the properties of emp, it tells you it has the properties Name and AlfsNumber, while emp2 tells you it has the properties Name and BillsBool. Note that the choice of API used to set the open property decides what type of property is created. (BillsBool is a Boolean, whereas AlfsNumber is an Integer.)
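The split between type-level and instance-level properties can be modeled like this. `ToyOpenType` and `ToyOpenObject` are hypothetical names for this sketch and are not SDO classes; property values are omitted to keep the focus on which properties exist where.

```cpp
#include <set>
#include <string>

// Properties declared on the type itself.
struct ToyOpenType {
    std::set<std::string> properties;
};

// Toy model of an instance of an open type; NOT the SDO DataObject.
class ToyOpenObject {
public:
    explicit ToyOpenObject(const ToyOpenType& type) : type_(type) {}

    // Setting an unknown property adds it to this instance only.
    void set(const std::string& name) {
        if (type_.properties.count(name) == 0) {
            extra_.insert(name);
        }
    }

    // The type's properties plus any created on this instance.
    std::set<std::string> instanceProperties() const {
        std::set<std::string> all = type_.properties;
        all.insert(extra_.begin(), extra_.end());
        return all;
    }

private:
    const ToyOpenType& type_;
    std::set<std::string> extra_;
};
```

Two instances of the same toy type can therefore report different property sets, while the type itself still reports only Name.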


Introspection

The ability to find out which types are available and to examine their properties is called the introspection API. You can query the properties belonging to a type and the properties belonging to a data object. You could, for example, display the whole metadata of CompanyType, as shown in Listing 14.

Listing 14. Displaying metadata
const Type& t = df->getType("mynamespace","CompanyType");

cout << "==============================================================" << endl;
cout << "Type: " << t.getURI() << "#" << t.getName() << endl;
if (t.isSequenced()) cout << " is Sequenced" << endl;
if (t.isOpenType()) cout << " is Open" << endl;

PropertyList& pl = t.getProperties();

for (int i = 0; i < pl.size(); i++)
{
    cout << "     Property: " << pl[i].getName() << " ";
    if (pl[i].isMany()) cout << "Many Valued ";
    else cout << "Single Valued ";
    // containment versus reference only applies to data object properties
    if (pl[i].getType().isDataType()) cout << " DataType ";
    else
    {
        if (pl[i].isContainment()) cout << " Containment ";
        else cout << " Reference ";
    }
    cout << pl[i].getType().getURI() << "#" << pl[i].getType().getName() << endl;
}

cout << "===============================================================" << endl;

Change summary

A change summary may be attached to any DataObjectType. The only rule when creating instances of DataObjects is that no DataObject with a change summary may contain another DataObject with a change summary. A change summary is attached to the Type like a Property, but does not appear in the list of Properties that are settable. The change summary is initially inactive, so the user code needs to access it and switch logging on. As soon as it is logging, it records every change to a property of its data object and of every data object in the tree below. It records creations, changes, and deletions. For every item that is updated, it records the value the item held before the update (creations have no old value). Multiple changes to the same object are only recorded once, so the recorded old value is always the value the item had before its first change. Deletions record the entire state of the DataObject before it was deleted.

The user code may switch off logging at will, and all changes are saved, up to the point that logging was switched off. All changes from a previous log are lost when logging is switched on again.
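The logging rules described above (inactive at first, only the oldest value kept, log retained when logging stops, log discarded when logging restarts) can be sketched as follows. `ToyChangeSummary` and its members are hypothetical and simplified to string properties; this is not the SDO change summary API.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy model for illustration; NOT the SDO change summary class.
class ToyChangeSummary {
public:
    // Switching logging on discards everything from a previous log.
    void beginLogging() {
        oldValues_.clear();
        logging_ = true;
    }

    // Switching logging off keeps what has been logged so far.
    void endLogging() { logging_ = false; }

    // Call before a property changes. std::map::insert does not overwrite an
    // existing key, so only the value before the first change is kept.
    void recordChange(const std::string& property, const std::string& oldValue) {
        if (logging_) {
            oldValues_.insert(std::make_pair(property, oldValue));
        }
    }

    bool isModified(const std::string& property) const {
        return oldValues_.count(property) != 0;
    }

    const std::map<std::string, std::string>& oldValues() const { return oldValues_; }

private:
    bool logging_ = false;   // the summary starts out inactive
    std::map<std::string, std::string> oldValues_;
};
```

Changing a Name property twice while logging leaves a single entry holding the value from before the first change, which is what lets a later consumer see both that the property changed and what it originally was.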

When a DataObject with a change summary is serialized, the change summary is also saved and can be restored with the data object. The change summary API allows you to query the settings of all the objects in your graph and assess whether they have been changed or not.


You've gotten a very brief overview of the SDO for C++ API. Get the specification document now.


