The mining editor: a visual tour

Familiarize yourself with the views of the Data Warehousing perspective in the Design Studio and the mining editor.

In default mode, the mining editor, part of the Data Warehousing perspective in the Eclipse-based InfoSphere® Warehouse Design Studio, includes four main views and a drag-and-drop palette of mining operators. You can reposition the views, and you can maximize any view by double-clicking its title.

Figure 1. The mining editor and its parts

Data Project Explorer: A tree view that shows the files and metadata associated with your data warehousing projects. You can create and manage data warehousing projects. In addition, you can create mining flows, data flows, control flows, and warehouse applications.

Mining Editor canvas: A graphical workspace where you place and connect operators that represent data sources, mining operations, and target tables to model a mining flow.

Menu bar: A combination of Eclipse menus, Data Warehousing perspective menus, and two menus specific to the mining editor: Mining flow and Diagram.

Mining Editor Toolbar

Generate a Java™ Bean from this diagram icon: Use this icon to create a Java Bean from an open mining flow and specify its settings, connection method, and data source to use.
Disconnect/Connect the flow database and work offline/online icon: Click this icon to switch the working mode of the mining flow. If the flow is disconnected, you can connect to the flow database to work in online mode. If the flow is connected, you can disconnect the flow database to work in offline mode.

Execute this mining flow in the database icon: Use this icon to run a complete mining flow or run the mining flow step by step. You can also show the SQL statements, specify operator breakpoints, view the SQL details, and access the variables manager.

Validate this mining flow icon: Use this icon to verify the accuracy of each operator in the mining flow.

Generate the SQL/DDL script for this mining flow icon: Use this icon to generate the SQL script for the mining flow.

Refresh Database icon: Use this icon to synchronize the database metadata for the mining flow with the connected database.

Generated SQL code: Shows the SQL script generated for the mining flow (not visible in the figure). To view the SQL script, select the tab.

Operator palette: A list of source, target, and preprocessing operators. To place an operator, select it in the palette and then click inside the canvas.

Data Source Explorer

A tree view that allows you to:

Create new database connections and connect to existing databases.
Explore database schemas.
Invoke data exploration functions such as sample contents and value distributions (univariate, bivariate, multivariate).
Work with data mining models, including importing, exporting, and visualizing them.

Property views: Tabbed pages that allow you to specify the detailed behavior of each operator in a data flow. Within this view, you can define which tables or files your source and target operators represent, and how each operator will change the data set.

SQL Results view: When you explore data (Sample contents), you can view sample rows of tables.
Execution Status view: When you run a mining flow, you can view the SQL statements and SQL return codes.

Selected operator: Operators can be expanded and selected to access property views below the canvas.

The following figure gives an overview of the operators available from the Mining editor palette:

Figure 2. Mining editor palette overview

Note: The Easy Mining operators (Mining - View Creation) differ from the Model Builders (Mining - Model Creation). The Easy Mining operators (Find Deviations, Find Rules, Cluster Table, and Predict Column) analyze an input table and create an output table containing the results. The Model Builders (Associations, Sequences, Clusterer, and Predictor) analyze an input table and create a data mining model.
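The distinction between the two operator groups can be illustrated with a toy sketch in plain Python. This is not the product's API: the function names, the sample rows, and the simple z-score rule are all assumptions chosen for illustration, and they do not reflect the algorithms that the mining operators actually use. The first function behaves like a view-creation operator (table in, result table out); the second and third behave like a model builder followed by a separate scoring step (table in, reusable model out).

```python
from statistics import mean, pstdev

# Hypothetical input table: one dict per row.
ROWS = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.0},
    {"id": 3, "amount": 11.0},
    {"id": 4, "amount": 9.0},
    {"id": 5, "amount": 50.0},
]


def find_deviations_table(table, column, threshold=1.5):
    """View-creation style: return a new table with a result column added."""
    values = [row[column] for row in table]
    mu, sigma = mean(values), pstdev(values)
    result = []
    for row in table:
        # Guard against a constant column, where sigma is zero.
        z = 0.0 if sigma == 0 else (row[column] - mu) / sigma
        result.append({**row, "is_deviation": abs(z) > threshold})
    return result


def build_deviation_model(table, column):
    """Model-creation style: return a model object, not a result table."""
    values = [row[column] for row in table]
    return {"column": column, "mean": mean(values), "std": pstdev(values)}


def apply_model(model, row, threshold=1.5):
    """Score a single row, possibly from a different table, with a stored model."""
    if model["std"] == 0:
        return False
    z = (row[model["column"]] - model["mean"]) / model["std"]
    return abs(z) > threshold


if __name__ == "__main__":
    # Table in, table out.
    for out_row in find_deviations_table(ROWS, "amount"):
        print(out_row)

    # Table in, model out; the model is applied later to new data.
    model = build_deviation_model(ROWS, "amount")
    print(apply_model(model, {"id": 99, "amount": 100.0}))
```

In this sketch the output table is complete as soon as the first function returns, whereas the model is an intermediate artifact that can be stored and applied to rows that were not part of the input table.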
