Create a new EasyETL Project by pressing the New Project button
at the top of the Project Navigator.
Figure 1. New Project button
Name this Project ‘CSVtoXML' and press Finish.
This opens your new Project in the Simple AL editor.
Figure 2. Simple AssemblyLine editor
The editor provides two drop-downs: one for selecting the Input
source and one for the target. The area below is empty (apart from
the assistance text) until you have chosen your source.
Setting up input for your ETL AL
Configure
input for your EasyETL AssemblyLine by clicking on the drop-down for
Source Information and selecting the
File Connector.
Figure 3. Selecting Source information
You will then be presented with the configuration dialog for
this Connector.
Point the
File Path parameter
at the ‘
People.csv' file found here:
TDI_HOME/examples/Tutorial
where
TDI_HOME is
replaced with the TDI installation directory on your machine
1.
Figure 4. Setting the File Path parameter
Now click on the tab labeled
Parser and
select the
CSV Parser, keeping the default
configuration parameters as-is. Finally, press the
Connect button
at the bottom of the dialog box to test the connection and discover
available Attributes.
Figure 5. Testing the connection and discovering
schema
The schema for the connected system is displayed and from here
you can choose which of these to use in your data flow. For this example,
use the
Select All button and then press
OK to
close the configuration dialog.
Back in the EasyETL workbench you
will see that the bottom half of the Simple AL editor has changed
to reflect that you now have an Input Source configured. TDI now presents
you with a button for stepping through this information one record
at a time, as well as buttons to run the ETL task to completion and
to stop it.
Figure 6. Input Source configured
Below these buttons are two grid boxes called
Data Viewers that
list the Attributes handled by your data flow. The Data Viewer on
the left shows your input Attributes. Those that you selected for
reading in the previous step are displayed in
bold at the top
and are called the
Input Map. Below will be any unselected
Attributes displayed in gray, and you can include these for input
mapping by double-clicking on them. Similarly, you remove Attributes
from the map by either double-clicking or deleting them
2.
The box to the right is the Output
Viewer and it shows the set of Attributes to be written. In TDI terms,
this is your Output Map and by default it is identical
to the list you selected for your Input Map. Note that you can change
the name of any Output Attribute by clicking in the right-hand column
and then editing the value. Use this technique to rename the Attributed
called ‘Title' to ‘JobTitle'.
Figure 7. Renaming an
Output Attribute
Your EasyETL project is now ready for its first test.
Running your EasyETL AssemblyLine
Select the
Data
Collector view at the bottom of the screen and then press
the
Read and Write next record button. This
causes the following to happen:
- There is a delay as your EasyETL AssemblyLine is transferred to
the running Server and started;
- The first record is read and parsed from your CSV input source
and the data is displayed in both the input and output grid displays;
- The Attributes you selected for output are written and collected
in the Data Collector view3.
So even though you have not selected an Output target yet, you
can still run and test your ETL project, viewing the data as it flows
down the AssemblyLine.
Figure 8. One record read and collected
Each time your press the
Read and Write… button,
another record is read, displayed and collected. If you now press
the Run button then your ETL job runs to completion and you will see
this AL report.
Figure 9. EasyETL AssemblyLine completed
As shown in the dialog above, no records were actually written
anywhere. However, the
Data Collector still
provides handy visual feedback on the information being extracted
and transferred.
Furthermore, you can select rows in the collected
data and copy/paste this information to a file or other target4.
Transformations
Up to now your Output values
have been identical to your input. However, there will be situations
where you want to manipulate or even compute these based on the data
read. You do this in TDI EasyETL by writing Transformations in JavaScript.
In
order to work with Transformations you first have to enable them by
selecting the Show data transformations check
box.
Figure 10. Enabling Transformation
This causes a new grid box to appear between the Data Viewers:
the Transformation Viewer. Here you see arrows indicating that all
three Output Attributes get their values directly from Input Attributes
– in other words, no transformations. You are going to define a new
Output Attribute and then add the Transformation script to compute
its value. Do this now by right-clicking in the Output Map, choosing
Add Attribute and naming it ‘FullName'. Now double-click on the
Transformation to the left of this new Output Attribute and then enter
this script snippet:
5 return First + " " + LastFigure 11. Show Transformation script
There is an
Evaluate button in the Transformation
Script editor dialog for testing the script, along with one for bringing
up some JavaScript tips and examples.
Press Evaluate now
to get an idea how your Transformation works.
Figure 12. Evaluate
Expression
The Output value shown was computed using data that you collected
when you ran your AssemblyLine. Close the evaluation results dialog
by pressing
OK.
Now press OK now
to accept this script and close the Transformation Script editor and
re-run your EasyETL AssemblyLine by pressing Run and
then viewing collected entries. Notice how the Data Collector now
gives you two Component Collections to choose from: Output and Input.
Choose Output and see how your Transformation
script generated a ‘FullName' value for each entry.
Figure 13. Output
collection with computed FullName Attribute
So now you know how to set up an Input Source and select the
Attributes to extract, as well as how to transform this data to fit
your output needs. The next step is selecting an Output target and
driving the data there.
Selecting an output target
Select the
File
Connector for your Output target using the drop-down above
the Output View and have it write to a file called ‘Output.xml'
6.
Choose the
XML Parser and press
OK.
Note that you could use the
Connect button to ensure
that the file path you entered is valid. However, there will be no
data to discover – unless of course you point your Connector at an
existing XML file.
Once our Output is configured, run your ETL
AssemblyLine again. Once completed you can open the output file and
verify the results.
Your Output target could as easily have been a database table,
just as your input could be coming from a Lotus® Notes® application
or LDAP directory. The steps you took to create this example ETL job
are the same, regardless of the systems or data stores you are working
with.
3 The leftmost button at the top of the
Data
Collector view opens a configuration dialog where you
can increase the
Data Collector Buffer size.
Note that if you plan to collect large amounts of data then you may
need to increase the memory available to TDI. Do this by locating
the ‘
ibmditk' batch-file/script in the TDI
installation folder and opening it in an editor. Near the bottom is
the line that launches ‘
miadmin' and you
could insert the following text after the
–vmargs option:
–Xmx512M This example will allow TDI data memory
to grow to 512 Mbytes.