WebSphere® Information
Analyzer is an integrated tool for providing comprehensive enterprise-level
data analysis. It features data profiling, analysis, and design and supports
ongoing data quality monitoring.
The WebSphere Information Analyzer user interface lets you perform a variety
of data analysis tasks, as Figure 1 shows.
Figure 1. Dashboard view of a project provides high-level trends and metrics
WebSphere Information
Analyzer can be used by data analysts, subject matter experts, business analysts,
integration analysts, and business end users. It has the following characteristics:
- Business-driven
- Provides end-to-end data lifecycle management (from data access and analysis
through data monitoring) to reduce the time and cost to discover, evaluate,
correct, and validate data across the enterprise.
- Dynamic
- Draws on a single active repository for metadata to give you a common
platform view.
- Scalable
- Leverages a high-volume, scalable, parallel processing design to provide
high-performance analysis of large data sources.
- Extensible
- Enables you to review and accept data formats and data values as business
needs change.
- Service oriented
- Leverages IBM® Information
Server’s service-oriented architecture to access connectivity, logging, and
security services, allowing access to a wide range of data sources (relational,
mainframe, and sequential files) and the sharing of analytical results with
other IBM Information
Server components.
- Robust analytics
- Helps you understand embedded or hidden information about content, quality,
and structure.
- Design integration
- Improves the exchange of information from business and data analysts to
developers by generating validation reference data and mapping data, which
reduces errors.
- Robust reporting
- Provides a customizable interface for common reporting services, which
enables better decision making by visually representing analysis, trends,
and metrics.
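The parallel processing design mentioned in the list above can be pictured
as a partition-and-merge pattern: split the source data into partitions,
analyze each partition independently, and combine the partial results. The
following Python sketch illustrates that general pattern only. It is not
WebSphere Information Analyzer code, and the function names
(partition, count_values, parallel_frequency) are hypothetical. It uses
threads for brevity; a real engine would spread partitions across processes
or nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into n_parts round-robin partitions (hypothetical helper)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def count_values(rows):
    """Compute the value frequency distribution of a single partition."""
    return Counter(rows)


def parallel_frequency(values, n_parts=4):
    """Count each partition concurrently, then merge the partial counts."""
    parts = partition(values, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_values, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Illustrative column values, including a null and an inconsistent spelling.
values = ["NY", "CA", "NY", None, "TX", "CA", "NY", "ny"]
freq = parallel_frequency(values, n_parts=3)
```

Because each partition is counted independently, the merged result does not
depend on how the rows are split, which is what allows this kind of analysis
to scale out.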
IBM WebSphere AuditStage
is a suite component that augments WebSphere Information Analyzer by
helping you manage the definition and analysis of business rules. WebSphere AuditStage
examines source and target data, analyzing across columns for valid value
combinations, appropriate data ranges, accurate computations, and correct
if-then-else evaluations. WebSphere AuditStage establishes metrics to weight
these business rules and stores a history of these analyses and metrics that
show trends in data quality.
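As a rough illustration of the rule types described above, the following
Python sketch defines one rule of each kind (valid value combination, data
range, computation, and if-then-else), weights them, and records a score per
run. This is a minimal sketch under stated assumptions, not the WebSphere
AuditStage rule language or API: the Rule class, the field names, the
weights, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    weight: float                   # relative importance of the rule
    check: Callable[[dict], bool]   # returns True when the record passes


def valid_combination(r):
    # Valid value combination across two columns (assumed reference pairs).
    return (r["country"], r["currency"]) in {("US", "USD"), ("DE", "EUR")}


def quantity_in_range(r):
    # Appropriate data range.
    return 1 <= r["quantity"] <= 1000


def total_computed(r):
    # Accurate computation: total must equal quantity * unit_price.
    return abs(r["quantity"] * r["unit_price"] - r["total"]) < 0.005


def discount_rule(r):
    # If-then-else: large orders must carry a discount, small ones must not.
    if r["quantity"] >= 100:
        return r["discount"] > 0
    return r["discount"] == 0


RULES = [
    Rule("valid_combination", 2.0, valid_combination),
    Rule("quantity_in_range", 1.0, quantity_in_range),
    Rule("total_computed", 3.0, total_computed),
    Rule("discount_if_then_else", 1.0, discount_rule),
]


def score(records, rules=RULES):
    """Return the weighted pass rate of all rules over all records (0 to 1)."""
    earned = possible = 0.0
    for record in records:
        for rule in rules:
            possible += rule.weight
            if rule.check(record):
                earned += rule.weight
    return earned / possible if possible else 1.0


records = [
    {"country": "US", "currency": "USD", "quantity": 10,
     "unit_price": 2.5, "total": 25.0, "discount": 0},
    {"country": "DE", "currency": "USD", "quantity": 200,
     "unit_price": 1.0, "total": 200.0, "discount": 0},
]

# Keeping one (run label, score) entry per run gives a simple trend history.
history = [("run-1", score(records))]
```

Comparing the scores stored in history across runs is one simple way to
show whether data quality is improving or degrading over time.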
Where WebSphere Information Analyzer fits in the IBM Information Server architecture
WebSphere Information Analyzer uses
a service-oriented architecture to structure data analysis tasks that are
used by many new enterprise system architectures. WebSphere Information Analyzer is
supported by a range of shared services and reuses several IBM Information
Server components.
Figure 2. IBM Information Server architecture
Because WebSphere Information Analyzer consists of multiple discrete
services, it can be configured to match varied customer environments and
tiered architectures. Figure 2 shows how WebSphere Information Analyzer
interacts with the following elements of IBM Information Server:
- IBM Information
Server console
- Provides a graphical user interface to access WebSphere Information Analyzer functions
and organize data analysis results.
- Common services
- Provide general services that WebSphere Information Analyzer uses,
such as logging and security. Metadata services provide access, query, and
analysis functions for users. Many services that WebSphere Information
Analyzer offers are specific to its domain of enterprise data analysis, such
as column analysis, primary key analysis and review, and cross-table analysis.
- Common repository
- Holds metadata that is shared by multiple projects. WebSphere Information
Analyzer organizes data from databases, files, and other sources into a hierarchy
of objects. Results that are generated by WebSphere Information Analyzer can
be shared with other client programs such as the WebSphere DataStage™ and WebSphere QualityStage Designer by
using their respective service layers.
- Common parallel processing engine
- Addresses high throughput requirements that are inherent in analyzing
large quantities of source data by taking advantage of parallelism and pipelining.
- Common connectors
- Provide connectivity to external resources and access to the common
repository from the processing engine. WebSphere Information Analyzer uses
these connection services in three fundamental ways:
- Importing metadata
- Performing base analysis on source data
- Providing drill-down and query capabilities
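To make the domain-specific services named above more concrete, the
following Python sketch shows what column analysis and primary key analysis
compute in principle: per-column counts, nulls, distinct values, and observed
types, followed by a check for columns that are fully populated and unique.
It is an illustrative sketch only and does not use or represent the
WebSphere Information Analyzer API; the function names and sample rows are
hypothetical.

```python
def profile_column(rows, column):
    """Summarize one column: row count, nulls, distinct values, and types."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "column": column,
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "inferred_types": sorted({type(v).__name__ for v in non_null}),
    }


def primary_key_candidates(rows, columns):
    """Return the columns that contain no nulls and no duplicate values."""
    candidates = []
    for column in columns:
        p = profile_column(rows, column)
        if p["count"] > 0 and p["nulls"] == 0 and p["distinct"] == p["count"]:
            candidates.append(column)
    return candidates


# Hypothetical source rows, as they might look after metadata import.
rows = [
    {"id": 1, "email": "a@example.com", "state": "NY"},
    {"id": 2, "email": None, "state": "CA"},
    {"id": 3, "email": "c@example.com", "state": "NY"},
]

profiles = [profile_column(rows, c) for c in ("id", "email", "state")]
candidates = primary_key_candidates(rows, ["id", "email", "state"])
```

A single-column check like this covers only the simplest case; reviewing a
candidate key still requires a person to confirm that uniqueness in the
sample reflects a real business constraint.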