Spreadsheets provide an intuitive and powerful front end to represent and manipulate business information. Microsoft® Excel is the de facto tool for spreadsheets. The main problem with Excel is its inability to seamlessly transfer information between the spreadsheet and a relational data source like Informix® or DB2®. Often the users end up writing complex macros to do this. This process is buggy, expensive, difficult to maintain, and frequently beyond the skill set of the regular Excel power user.
IBM® Office Connect enables Excel users to overcome this limitation by providing a simple GUI-based patented process that enables information in an Excel spreadsheet to be transferred seamlessly to multiple datasources. Here are some of the things you can accomplish using Office Connect:
- Develop powerful BI applications just using Excel.
By using a drag and drop paradigm, Excel developers can bind their worksheets to multiple data sources. Since Excel also provides a slew of data manipulation and presentation functions, Office Connect enables Excel power users to do even complex e-Business applications just using spreadsheets. - Integrate multiple data sources in the same Excel application.
Although primarily used against relational databases, Office Connect has the ability to connect to other data sources also. Currently it can bind an Excel worksheet to a relational database like Informix or DB2 or to HTML-based Web pages. Information from all these data sources can be brought into the same workbook and integrated. - Distribute to enterprise users to enable secure and controlled access to business data over the web.
These workbooks can be distributed to the enterprise users over the web, and they have the ability to not only to be refreshed with data from the bound datasource but also to modify the data and then save it back to the same datasource even over the Internet. This capability of Office Connect to provide virtual access to potentially sensitive data in a database makes it essential that it have inbuilt features to handle information security:
- To prevent unauthorized access to information to a bound data source, Office Connect uses LDAP to provide a role-based authentication mechanism.
- While transferring the data on the Web, it has the capability to makes use of SSL to encrypt the information on the pipe.
Office Connect has two flavors based on the Excel user's role in the organization:
- Office Connect Integrated Development Environment (IDE) for Excel power users (developers)
- Office Connect lightweight Web client for end user (consumer)
Let's take a look at each of these.
Integrated Development Environment
This is the main development environment for the product and is made of four logical entities:
- Add-in to Excel
- Java datasource server and plug-ins
- Metadata (XML-based information regarding data source, binding, GUI, and so on)
- Office Connect database repository framework
Figure 1 shows how these entities are related.
Figure 1. Integrated Development Environment

Now we'll take a look at each aspect of the development environment.
Add-in to Excel
This module consists of a number of dynamic libraries and an Excel add-in file. When it's installed, Office Connect is integrated with Excel and adds its menus and functionality to the product.
Datasource server and plug-ins
The datasource engine is written in Java and is invoked as a separate process when Excel starts up. Communication between the Excel add-in and the Java server is achieved using sockets. All API calls and data transfer between the two occur via an XML file. The engine itself can be extended for any new data source by plugging in Java classes. This extension is possible even at run time. The datasource engine uses the data source's native connectivity layer (such as JDBC) to communicate with the data source.
Figure 2. Datasource server and plug-ins

Office Connect XML metadata
An Office Connect project consists of an Excel workbook and XML metadata. The Excel workbook contains all static text, formatting information and business logic (formulas), and the XML metadata contains all the datasource and binding information. When an Office Connect project is saved, the metadata is also saved inside the associated Excel workbook.
Note: Office Connect code makes extensive use of XML, from driving the GUI controls and localization, to datasource manipulations and transformation.
Figure 3. Office Connect XML

Office Connect database repository framework
An Office Connect project can be saved into the repository for distribution. The repository consists of a group of tables that can be created in any database that has BLOB data type support. These tables contain user and role information besides storing Office Connect project files for distribution.
A repository manager client is provided as a part of the Office Connect Excel Add-in. The web client can also access information in the repository via the Office Connect application server.
Once a project has been created, it can be distributed over the Web using a lightweight client and a J2EE-compliant application server.
Figure 4. Deployment environment

A lightweight Excel add-in client
This module consists of an ActiveX and an Excel add-in file. On installation, the Office Connect client is integrated with Excel and adds its menus and functionality to the product.
Excel workbooks, data and authentication information are transferred between the client and the application server. Communication between the two occurs via HTTP. Support for SSL is also provided. The format of the data exchanged is XML. SQL calls to refresh or save data from the workbook are dynamically generated in the client. The client caches the result set in memory and later uses it to handle synchronization and concurrency.
Application server
The application server consists of a Java servlet and a datasource engine. The servlet parses out the XML commands from the client and redirects them to the datasource engine for communication with the data source. The servlet also connects to the repository to retrieve the Office Connect projects for the client. These project files are serialized to the client directly over HTTP for security reasons.
The servlet does not maintain any result set in memory. It releases the connection after each transaction. This allows Office Connect to scale to thousands of users easily.
Authentication for data access is controlled using either LDAP or a rudimentary user and role management system that is shipped as a part of the Office Connect integrated development product.
Office Connect is integrated into the Excel environment when it is installed. As you can see in Figure 5, its menus, toolbars and functionality are incorporated into Excel without blocking any of Excel's native operations, unlike some other Excel add-ins.
Figure 5. Office Connect integration into the Excel environment
Discovering the datasource schema
The Project Manager, shown in Figure 6, is the main development window for Office Connect. From here the user can seamlessly connect to DB2, Informix, or any other database and import the database schema. Office Connect uses JDBC to connect to the database. Based on the database server selected, the JDBC Driver is automatically chosen, and a sample template of the connection URL is presented in the grid.
Figure 6. Project Manager
Users can set filters to limit the schema information returned, as shown in Figure 7.
Figure 7. Using filters to limit the schema information
The returned database schema is represented in the form of a tree where the nodes of the tree represent different database objects like tables, UDF's, views, and so on. From this tree the required database objects can be selected to import their complete metadata. This metadata is converted into XML format and stored in a project file associated with the current workbook.
Building a schema model and binding it to the workbook
Office Connect uses the schema extracted from the database for further development, and the connection to the database is closed. This XML schema metadata can also be exported to a file for reuse.
This disconnected, model-driven approach allows a DBA to create XML-based schema models of the database that can be later used by Excel power developers without letting them have any access to the database schema directly. In a larger organization this can prove to be quite desirable.
The imported data source objects and their properties are visualized in the Office Connect Project Manager. Users can modify the database object properties in the model to suit their application requirements. For example the Read Only property of a field (SALES_PERSON) could be set to true. Doing this would prevent end users from being able to save back data in the Excel cell bound to this field. Figure 8 shows such an example.
Figure 8. Modifying database object properties in the model

These database objects can be bound to the spreadsheet by simply dragging and dropping them onto it. This process of drag and drop results in:
- Creating an Excel range and binding the database object to it
- Generating the SELECT, UPDATE, INSERT and DELETE SQL if the object is a table
- Adding the binding information to the project metadata.
Custom SQL and Query Builder
When a simple database object like a table is not sufficient, the users can enter their own custom SQL statements for SELECT, INSERT, UPDATE and DELETE. A Query Builder utility is also provided to help users to graphically construct client side views. These views (SQL queries) can also be bound to a range in the spreadsheet. Users can also associate custom INSERT, UPDATE and DELETE SQL statements with these views.
Figure 9 shows the Query Builder utility.
Figure 9. Query Builder utility

Refreshing and updating the data
Once a database object has been bound to an Excel range, the user can refresh the range from the database by selecting the Refresh from DB option in the pop-up menu.
Modifications to the data in the range can be made and saved back to the database by simply selecting the modified rows and then clicking on Save to DB from the pop-up menu, shown in Figure 10. The selected rows are read from the spreadsheet and UPDATE or INSERT SQL statements are generated for each row based on the information in the cached result set (from the refresh operation). If even a single row fails to save, the whole transaction is rolled back. Office Connect follows an all or nothing approach for all database operations.
Figure 10. Deployment environment

Concurrency is handled by appending the original record from the cached result set as a condition for each row being updated. If the record has been modified in the meantime, the database update operation fails to locate it and a concurrency error is thrown. The users are prompted with a message indicating that someone has modified the record, and they are given the option of selecting to override the changes. If the user agrees, then a new UPDATE SQL is generated without the original record being appended to it and an unconditional update is performed.
In a real world business application, there is need to limit the query by using predicates. Office Connect allows the users to bind predicates (parameters) in the query to cells in the worksheet to create a query by example (QBE), shown in Figure 11.
Figure 11. Deployment environment

In the example shown above, the final query has the criteria appended to it as an added condition. Based on the the Figure 11 example, the SELECT SQL would append SALES_PERSON = ':::C5/>' And SALES_DATE =':::D5/>'. On refresh the query would be:
SELECT "AJAYA"."SALES"."SALES_DATE", "AJAYA"."SALES"."SALES_PERSON",
"AJAYA"."SALES"."REGION", "AJAYA"."SALES"."SALES"
FROM "AJAYA"."SALES"
WHERE "SALES_PERSON" = 'LEE' AND "SALES_DATE" = '12/31/1995'
|
Linking two bound ranges together can create master detailed scenarios. Predicates for the detail range are bound to columns in the master range. The detail range is refreshed from the master range and the predicate values for it are picked from the current row in the master range (that is, from where the refresh is being initiated).
Once the users have created both the master (Sheet1!SALESibmExternalData)and detail ranges (Sheet!STAFFibmExternalData), they can link them by selecting the detail range as the Cascade Range property of the master range.
Figure 12. Master detail

The next step is to create a QBE for the detail and bind at least one predicate (NAME) for the detail range to a cell (D2 i.e SALES_PERSONcolumn) in the master range. Now if the user brings up the Office Connect popup menu from within the master range, another menu item Refresh Detail: STAFF is added to it. On clicking this menu item, the detail range is refreshed using the value (LEE) from the SALES_PERSON column (D5) in the current row of the master range and substituting it in the NAME predicate of the details ranges refresh SQL.
In our example we have created a QBE for the master range by binding the SALES_PERSON and SALES_DATE to cells C5 and D5 respectively. Further the master range drives the refresh in the detail range based on the values in the SALES_PERSON column for the current row.
Excel also allows users to add controls like combo boxes to the worksheets. These controls can also be filled with data from any columns in the worksheet. Office Connect can link the combo box text to the bound cell of the query. Any changes in the combo box selection can then trigger a refresh on the range.
Integrating multiple datasources in the same workbook
HTML
Office Connect enables the binding of a HTML table in a web page to a range in the Excel worksheet. If the HTML table is dynamically generated, then Office Connect can also bind the parameters in the HTTP call to cells in the worksheet in a manner similar to creating a QBE for database objects. Users can then modify the parameter values in the spreadsheet and get fresh data from the website
Figure 13. Binding parameters in HTTP call to cells in the worksheet

When the user types ibm in cell C12 and moves focus from the cell, the bound HTML range is refreshed.
Note: The same page on the spreadsheet can have multiple binding to both HTML and relational data sources. Using the QBE feature, data from one data source can be used to drive information from another data source.
Office Connect provides an infrastructure to distribute its projects over the Web using a central repository, an application server and a lightweight Web client.
Repository Manager
The repository consists of a group of relational tables and can be created in any database that has BLOB data type support. The Repository Manager client also is integrated into the Excel upon installation and enables users to:
- Save their projects in the repository database for redistribution
- Search for them based on criteria such as author, name of project, category and so on
- Download a project into Excel
- Check out and check in an existing project.
Office Connect provides rudimentary versioning capability. Besides project storage and retrieval functionality, the Repository Manager also provides access to an authentication infrastructure based on either an LDAP server or Office Connect's inbuilt user, group and role management system.
Figure 14. Repository Manager

User and role based authentication to control unauthorized access to data
Each bound range and sheet in the Project Manager has read/write privilege associated with it. A developer can link one or more roles with either of them. When the end user opens this workbook, the datasource access menus (Refresh range from DB, Save to DB and so on) will be enabled or disabled for each range, based on the role associated with that user's login username. This in effect controls access to the datasource for each bound range and worksheet.
Figure 15. Authentication

Using the database's own authentication
The user and role management functionality provided is outside the database's own authentication mechanism. The user name and password provided while initially extracting the datasource schema is used for all JDBC connections to the database by the end users. If the workbooks developers want to force the end users to use the database's internal authentication mechanism, they can check the Database Password Authentication instead of selecting any role. When the end users try to either refresh or save the data, they will be prompted to give their database username and password. The JDBC driver will now use this information to make a connection the database.
Note: In this case each end user must have an account in the database.
Office Connect provides a lightweight Excel plug-in Web client for end users. This has been packaged to silently install over the Web. The Office Connect Web Client also integrates into the Excel environment and extends its menus and toolbars. Using this plug-in, the end users can connect to the Office Connect application server running inside any J2EE-enabled Web server.
Before the Web Client can be used, Office Connect application server must be up and running inside any J2EE Web server (including SSL support for data encryption). Also the application server should be configured to point to a repository database that contains Office Connect projects for distribution.
Users log in to the application server by entering its URL along with their usernames and passwords in the login window of the Web Client. This information is transferred as an XML formatted HTTP call to the application server. Authentication is done using LDAP or Office Connect's own user management system. Once the username and password are authenticated, the use's role and group information is returned back to the Web client.
Figure 16. Web login

A user can now search the repository for a list of projects. The projects available in this list are controlled by the users group and role information. The user can select one of the projects for downloading. The Excel Workbook and its associated XML metadata is serialized and transferred to the Web client using HTTP.
Users can now refresh the various ranges in the workbook with data from the data source. They can also modify the data in these ranges and save it back to the bound data source (database only) Concurrency is handled in the same manner as in the development environment. The menus which allow the users to do the data transfer operation are enabled or disabled for a given range or sheet based on the user's role and privileges.
Synchronization with disconnected updates
Office Connect does not cache any data in the application server but it does cache it in the client. End users can save the Office Connect Excel workbook on their local hard disks. (The data cache, metadata, and URL are also saved inside the workbook.) Users can disconnect their laptop and update the data in the workbook. When the laptop is reconnected and that workbook is opened, they are prompted to log in to the application server. There is no need to provide the URL since it was also saved with the workbook. Once logged in, the user can update the data source with the modified spreadsheet data.
Using this feature of Office Connect it is possible to publish Office Connect workbooks on a Website and have links to them from a Web page. The workbook is opened inside a Web browser, giving a more natural flow and eliminating the end user's need to understand the requirement of a separate application server for Office Connect.
Excel power users can create sophisticated BI applications using Office Connect in conjunction with Excel's own powerful functions. This, combined with its distribution, authentication and security infrastructure makes Office Connect an ideal tool for small and medium business users.
Appendix. Office Connect Feature List
| BI tools for SMB | |
|---|---|
| Add-in to Microsoft Excel | No or low learning curve |
| Easy-to-use drag and drop paradigm | Leverages existing user base of 40 million Excel users |
| Model-driven RAD to create Excel-based database applications | |
|---|---|
| Extract database schema to create a reusable XML-based metadata model | |
| Binds Excel ranges to database objects for bi-directional data transfer between Excel and database | |
| Generate SELECT, UPDATE, INSERT and DELETE SQL | Query builder for creating custom queriesa |
| Reads and writes data from Excel to database | Creates QBE and master detail scenarios |
| Integrate multiple data sources including relational and HTML in the same workbook. | |
| Enterprise data integration over the Web using Excel | |
|---|---|
| Lightweight Excel Web Client that silently installs over the Web | |
| Retrieves and updates data over the web | Handles concurrency |
| Authenticated, role-based access to the database using LDAP or inbuilt user management system | |
| Handles disconnected updates by synchronizing data in spreadsheet with the database | |
| SSL support for data encryption on the Internet | Extensible data source server written in Java |
| Facility to remotely trouble shoot application server over the web | |
| Supported platforms and databases | ||
|---|---|---|
| DB2, DB2 for OS/390, Relational Connect, IDS9.x, XPS8.x, IDS7.x, Online5.x, Redbrick, MS Access, Oracle, Sybase | Brazilian, French, German, Italian, Japanese, Korean, Simplified and Traditional Chinese, Danish, Finnish, Norwegian, Polish, Russian, Swedish | |
| Win32 and Excel 97 or above | Application Server: J2EE platform | |
| Websphere, Tomcat, Apache/Tomcat (SSL) | Compliant to accessibility standards | |
Comments (Undergoing maintenance)






