IBM InfoSphere Information Server architecture and concepts

IBM® InfoSphere® Information Server provides a unified architecture that works with all types of information integration. Common services, unified parallel processing, and unified metadata are at the core of the server architecture.

The architecture is service-oriented, enabling IBM InfoSphere Information Server to work within evolving enterprise service-oriented architectures. A service-oriented architecture also connects the individual suite product modules of InfoSphere Information Server.

By eliminating duplication of functions, the architecture uses hardware resources efficiently and reduces the development and administrative effort that is required to deploy an integration solution.

The following diagram shows the InfoSphere Information Server architecture.

Figure 1. InfoSphere Information Server high-level architecture

Unified parallel processing engine

Much of the work that InfoSphere Information Server does takes place within the parallel processing engine. The engine handles data processing needs as diverse as performing analysis of large databases for IBM InfoSphere Information Analyzer, data cleansing for IBM InfoSphere QualityStage®, and complex transformations for IBM InfoSphere DataStage®. This parallel processing engine is designed to deliver the following benefits:

Parallelism and data pipelining to complete increasing volumes of work in decreasing time windows
Scalability by adding hardware (for example, processors or nodes in a grid) with no changes to the data integration design
Optimized database, file, and queue processing to handle large files that cannot fit in memory all at once, or large numbers of small files
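The first two benefits can be illustrated with a minimal sketch. It is not the InfoSphere engine or any IBM API: the stage functions (read_rows, cleanse, transform) and the run_job helper are hypothetical names chosen for illustration. Generators stand in for data pipelining, where each row flows through the stages without intermediate staging, and a thread pool stands in for partition parallelism. The sketch holds each partition in a list for brevity, and Python threads do not provide true CPU parallelism, so treat it as a model of the idea only.

```python
from concurrent.futures import ThreadPoolExecutor


def read_rows(n):
    """Hypothetical source stage: yields rows one at a time."""
    for i in range(n):
        yield {"id": i, "name": f" customer {i} "}


def cleanse(rows):
    """Hypothetical cleansing stage: trims whitespace as rows stream through."""
    for row in rows:
        yield {**row, "name": row["name"].strip()}


def transform(rows):
    """Hypothetical transformation stage: uppercases the name."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def run_partition(rows):
    """Pipelining: each row passes through both stages before the next is read."""
    return list(transform(cleanse(rows)))


def run_job(n_rows, n_partitions):
    """Partition parallelism: the same stages run unchanged on every partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in read_rows(n_rows):
        # Modulus partitioning on the key column.
        partitions[row["id"] % n_partitions].append(row)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = list(pool.map(run_partition, partitions))
    # Collect the partitions and restore key order for a deterministic result.
    return sorted((r for part in results for r in part), key=lambda r: r["id"])
```

Calling run_job(6, 1) and run_job(6, 3) returns the same rows. Only the partition count changes, not the stage definitions, which mirrors the claim that capacity can be added with no changes to the data integration design.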

Common connectivity

InfoSphere Information Server connects to information sources whether they are structured, unstructured, on the mainframe, or in applications. Metadata-driven connectivity is shared across the suite components, and connection objects are reusable across functions.

Connectors provide design-time importing of metadata, data browsing and sampling, runtime dynamic metadata access, error handling, and high-functionality, high-performance runtime data access. Prebuilt interfaces for packaged applications, called packs, provide adapters to SAP, Siebel, Oracle, and others, enabling integration with enterprise applications and associated reporting and analytical systems.

Unified metadata

InfoSphere Information Server is built on a unified metadata infrastructure that enables shared understanding between business and technical domains. This infrastructure reduces development time and provides a persistent record that can improve confidence in information. All functions of InfoSphere Information Server share the same metamodel, making it easier for different roles and functions to collaborate.

A common metadata repository provides persistent storage for all InfoSphere Information Server suite components. All of the products depend on the repository to navigate, query, and update metadata. The repository contains two kinds of metadata:

Dynamic: Dynamic metadata includes design-time information.
Operational: Operational metadata includes performance monitoring, audit and log data, and data profiling sample data.

Because the repository is shared by all suite components, information that one component creates is instantly available to the others. For example, profiling information that is created by InfoSphere Information Analyzer is instantly available to users of InfoSphere DataStage and InfoSphere QualityStage.
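The effect of a shared repository can be modeled in a few lines. The following sketch is an assumption-laden illustration, not the product's repository API: MetadataRepository, Profiler, and JobDesigner are hypothetical classes, an in-memory dictionary stands in for the relational database, and filing profiling results under operational metadata is a simplification made for this example.

```python
class MetadataRepository:
    """Hypothetical stand-in for the common metadata repository."""

    KINDS = ("dynamic", "operational")

    def __init__(self):
        # One store per kind of metadata named in the text.
        self._store = {kind: {} for kind in self.KINDS}

    def update(self, kind, key, value):
        if kind not in self._store:
            raise ValueError(f"unknown metadata kind: {kind}")
        self._store[kind][key] = value

    def query(self, kind, key):
        return self._store[kind].get(key)


class Profiler:
    """Hypothetical profiling component that writes to the shared repository."""

    def __init__(self, repo):
        self.repo = repo

    def profile_column(self, column, values):
        stats = {
            "distinct": len(set(values)),
            "nulls": sum(v is None for v in values),
        }
        self.repo.update("operational", f"profile:{column}", stats)


class JobDesigner:
    """Hypothetical design component that reads from the same repository."""

    def __init__(self, repo):
        self.repo = repo

    def column_profile(self, column):
        return self.repo.query("operational", f"profile:{column}")


repo = MetadataRepository()
Profiler(repo).profile_column("customer.email", ["a@x.com", None, "a@x.com"])
profile = JobDesigner(repo).column_profile("customer.email")
```

Because both components hold a reference to the same repository object, the designer sees the profile as soon as the profiler writes it, with no export or import step in between.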

The repository is a J2EE application that uses a standard relational database such as IBM Db2®, Oracle, or SQL Server for persistence (Db2 is provided with InfoSphere Information Server). These databases provide backup, administration, scalability, parallel access, transactions, and concurrent access.

Common services

InfoSphere Information Server is built entirely on a set of shared services that centralize core tasks across the platform. These include administrative tasks such as security, user administration, logging, and reporting. Shared services allow these tasks to be managed and controlled in one place, regardless of which suite component is being used. The common services also include the metadata services, which provide standard service-oriented access and analysis of metadata across the platform. In addition, the common services tier manages how services are deployed from any of the product functions, allowing cleansing and transformation rules or federated queries to be published as shared services within an SOA, using a consistent and easy-to-use mechanism.

InfoSphere Information Server products can access three general categories of service:

Design: Design services help developers create function-specific services that can also be shared. For example, InfoSphere Information Analyzer calls a column analyzer service that was created for enterprise data analysis but can be integrated with other parts of InfoSphere Information Server because it exhibits common SOA characteristics.
Execution: Execution services include logging, scheduling, monitoring, reporting, security, and web framework.
Metadata: Metadata services enable metadata to be shared across tools so that changes made in one InfoSphere Information Server component are instantly visible across all of the suite components. Metadata services are integrated with the metadata repository. Metadata services also enable you to exchange metadata with external tools.

The common services tier is deployed on J2EE-compliant application servers such as IBM WebSphere® Application Server, which is included with InfoSphere Information Server.

Unified user interface

The face of InfoSphere Information Server is a common graphical interface and tool framework. Shared interfaces such as the IBM InfoSphere Information Server console and the IBM InfoSphere Information Server Web console provide a common interface, visual controls, and user experience across products. Common functions such as catalog browsing, metadata import, query, and data browsing all expose underlying common services in a uniform way. InfoSphere Information Server provides rich client interfaces for highly detailed development work and thin clients that run in web browsers for administration.

Application programming interfaces (APIs) support a variety of interface styles that include standard request-reply, service-oriented, event-driven, and scheduled task invocation.
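As a rough illustration of how one function can be reached through several interface styles, the following sketch exposes a single hypothetical cleansing rule through request-reply, event-driven, and scheduled invocation. None of these names come from the InfoSphere Information Server APIs: standardize_name, EventBus, and run_scheduled are assumptions for illustration, and the service-oriented style is omitted because it would require a network endpoint.

```python
import sched
import time


def standardize_name(name):
    """Hypothetical cleansing rule: collapse whitespace and title-case a name."""
    return " ".join(name.split()).title()


class EventBus:
    """Hypothetical in-process event bus for the event-driven style."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        # Each subscribed handler is invoked with the event payload.
        return [handler(payload) for handler in self._handlers.get(topic, [])]


def run_scheduled(func, arg, delay_seconds=0.01):
    """Scheduled style: run func(arg) once after a delay and collect the result."""
    results = []
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    scheduler.enter(delay_seconds, 1, lambda: results.append(func(arg)))
    scheduler.run()  # Blocks until the scheduled call has completed.
    return results


# Request-reply: the caller invokes the rule and waits for the answer.
reply = standardize_name("  ada   lovelace ")

# Event-driven: the rule runs when a matching event is published.
bus = EventBus()
bus.subscribe("customer.created", standardize_name)
event_results = bus.publish("customer.created", "grace  hopper")

# Scheduled: the rule runs at a time set by the scheduler, not by the caller.
scheduled_results = run_scheduled(standardize_name, "alan turing")
```

In all three cases the rule itself is unchanged. Only the way it is triggered differs, which is the point of offering more than one interface style over the same underlying service.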