Directory Integrator, Version 7.1.1

Creating a Project

Create a new EasyETL Project by pressing the New Project button at the top of the Project Navigator.

Figure 1. New Project button

Name this Project ‘CSVtoXML' and press Finish. This opens your new Project in the Simple AL editor.

Figure 2. Simple AssemblyLine editor

The editor provides two drop-downs: one for selecting the Input source and one for the target. The area below is empty (apart from the assistance text) until you have chosen your source.

Setting up input for your ETL AL

Configure input for your EasyETL AssemblyLine by clicking on the drop-down for Source Information and selecting the File Connector.

Figure 3. Selecting Source information

You will then be presented with the configuration dialog for this Connector.

Point the File Path parameter at the ‘People.csv' file found here:

TDI_HOME/examples/Tutorial

where TDI_HOME is replaced with the TDI installation directory on your machine¹.

Figure 4. Setting the File Path parameter

Now click on the tab labeled Parser and select the CSV Parser, keeping the default configuration parameters as-is. Finally, press the Connect button at the bottom of the dialog box to test the connection and discover available Attributes.

Figure 5. Testing the connection and discovering schema

The schema for the connected system is displayed and from here you can choose which of these to use in your data flow. For this example, use the Select All button and then press OK to close the configuration dialog.

Back in the EasyETL workbench you will see that the bottom half of the Simple AL editor has changed to reflect that you now have an Input Source configured. TDI now presents you with a button for stepping through this information one record at a time, as well as buttons to run the ETL task to completion and to stop it.

Figure 6. Input Source configured

Below these buttons are two grid boxes called Data Viewers that list the Attributes handled by your data flow. The Data Viewer on the left shows your input Attributes. Those that you selected for reading in the previous step are displayed in bold at the top and are called the Input Map. Below will be any unselected Attributes displayed in gray, and you can include these for input mapping by double-clicking on them. Similarly, you remove Attributes from the map by either double-clicking or deleting them².

The box to the right is the Output Viewer and it shows the set of Attributes to be written. In TDI terms, this is your Output Map and by default it is identical to the list you selected for your Input Map. Note that you can change the name of any Output Attribute by clicking in the right-hand column and then editing the value. Use this technique to rename the Attributed called ‘Title' to ‘JobTitle'.

Figure 7. Renaming an Output Attribute

Your EasyETL project is now ready for its first test.

Running your EasyETL AssemblyLine

Select the Data Collector view at the bottom of the screen and then press the Read and Write next record button. This causes the following to happen:

There is a delay as your EasyETL AssemblyLine is transferred to the running Server and started;
The first record is read and parsed from your CSV input source and the data is displayed in both the input and output grid displays;
The Attributes you selected for output are written and collected in the Data Collector view³.

So even though you have not selected an Output target yet, you can still run and test your ETL project, viewing the data as it flows down the AssemblyLine.

Figure 8. One record read and collected

Each time your press the Read and Write… button, another record is read, displayed and collected. If you now press the Run button then your ETL job runs to completion and you will see this AL report.

Figure 9. EasyETL AssemblyLine completed

As shown in the dialog above, no records were actually written anywhere. However, the Data Collector still provides handy visual feedback on the information being extracted and transferred.

Furthermore, you can select rows in the collected data and copy/paste this information to a file or other target⁴.

Transformations

Up to now your Output values have been identical to your input. However, there will be situations where you want to manipulate or even compute these based on the data read. You do this in TDI EasyETL by writing Transformations in JavaScript.

In order to work with Transformations you first have to enable them by selecting the Show data transformations check box.

Figure 10. Enabling Transformation

This causes a new grid box to appear between the Data Viewers: the Transformation Viewer. Here you see arrows indicating that all three Output Attributes get their values directly from Input Attributes – in other words, no transformations. You are going to define a new Output Attribute and then add the Transformation script to compute its value. Do this now by right-clicking in the Output Map, choosing Add Attribute and naming it ‘FullName'. Now double-click on the Transformation to the left of this new Output Attribute and then enter this script snippet:⁵ return First + " " + Last

Figure 11. Show Transformation script

There is an Evaluate button in the Transformation Script editor dialog for testing the script, along with one for bringing up some JavaScript tips and examples.

Press Evaluate now to get an idea how your Transformation works.

Figure 12. Evaluate Expression

The Output value shown was computed using data that you collected when you ran your AssemblyLine. Close the evaluation results dialog by pressing OK.

Now press OK now to accept this script and close the Transformation Script editor and re-run your EasyETL AssemblyLine by pressing Run and then viewing collected entries. Notice how the Data Collector now gives you two Component Collections to choose from: Output and Input. Choose Output and see how your Transformation script generated a ‘FullName' value for each entry.

Figure 13. Output collection with computed FullName Attribute

So now you know how to set up an Input Source and select the Attributes to extract, as well as how to transform this data to fit your output needs. The next step is selecting an Output target and driving the data there.

Selecting an output target

Select the File Connector for your Output target using the drop-down above the Output View and have it write to a file called ‘Output.xml'⁶. Choose the XML Parser and press OK. Note that you could use the Connect button to ensure that the file path you entered is valid. However, there will be no data to discover – unless of course you point your Connector at an existing XML file.

Once our Output is configured, run your ETL AssemblyLine again. Once completed you can open the output file and verify the results.

Figure 14. XML Output

Your Output target could as easily have been a database table, just as your input could be coming from a Lotus® Notes® application or LDAP directory. The steps you took to create this example ETL job are the same, regardless of the systems or data stores you are working with.

¹ Note that TDI supports both forward slash and backslash as the path separator when running on Windows. Your TDI solutions will be more portable between Windows and all the other platforms TDI runs on if you use forward slash.

² You can also right-click and choose Add Attribute or Remove Attribute.

³ The leftmost button at the top of the Data Collector view opens a configuration dialog where you can increase the Data Collector Buffer size. Note that if you plan to collect large amounts of data then you may need to increase the memory available to TDI. Do this by locating the ‘ibmditk' batch-file/script in the TDI installation folder and opening it in an editor. Near the bottom is the line that launches ‘miadmin' and you could insert the following text after the –vmargs option: –Xmx512M This example will allow TDI data memory to grow to 512 Mbytes.

⁴ Data is copied in CSV format to simplify importing values into spreadsheets and report tables. ‘CSV' stands for Character Separated Value, and the separator character used by TDI is the semi-colon (;).

⁵ Note that you can press Ctrl + Space to get up a list of suggestions to what you can type. This list includes some special objects like 'task' and 'main' and lists the Attributes being read in from your Input Source at the bottom.

Note also that the notation shown here for accessing Input Attributes only works in EasyETL. In the full TDI Workbench you must prefix an Attribute name with the Entry object carrying it - for example the Work Entry (work). As a result, the above example would look like this:

   return work.First + " " + work.Last

The Work Entry and other TDI concepts are discussed in the first chapter of this document.

⁶ If you do not enter a full path, or a relative one, then TDI will base this from the Solution Directory that you specified during installation.

[ Terms of use | Feedback ]

IBM Tivoli Directory Integrator 7.1.1