An AssemblyLine (AL) is a set of components strung together to move and transform data; it describes the "route" along which the data will pass. The data being handled on that journey is represented as an Entry object. The AssemblyLine works with a single Entry at a time on each cycle. It is the unit of work in IBM Tivoli Directory Integrator (TDI) and typically represents a flow of information from one or more data sources to one or more targets.
Some of the components that make up the AssemblyLine retrieve data from one or more connected systems; data obtained this way is said to "feed" the AL. Data to be processed is fed into the AL one Entry at a time. Each Entry carries Attributes that hold the data values read from fields or columns in the source system: directory entries, database rows, e-mails, Lotus® Notes® documents, records or similar data objects. These Attributes are renamed, reformatted or computed as processing flows from one component to the next in the AL. New information can be "joined" from other sources, and all or parts of the transformed data can be written to target stores or sent to target systems as desired. This can be illustrated thus:
In this diagram, picture the collection of large jigsaw puzzle pieces as the AssemblyLine, the leftmost blue dots and squares in the grey stream entering from below as raw data from an input stream, and the purple bits on the top right as data output on an output stream. The darker orange element intersecting a jigsaw piece with the bucket in it denotes a Parser, turning raw data into structured data, which can then start travelling down the AssemblyLine (as lighter-colored elements in a bucket). The middle jigsaw piece pictures a Connector reading already-structured data from, for example, a database.
Data enters the AssemblyLine from connected systems using Connectors in some sort of input Mode, and is output later to one or more connected systems using Connectors in some output Mode.
Data can be read from record-oriented systems like a database or a message queue; in this case the various columns in the input are readily mapped into Attributes in the resulting work Entry, which is depicted as a "bucket" in the puzzle piece on the left. Alternatively, data can be read from a data stream, like a text file in a filesystem, a network connection, and so forth. In this case, a Parser can be prefixed to the Connector to make sense of the input stream and cut it up into pieces, which can then be assigned to Attributes in the work Entry.
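To make the Parser's role concrete, here is a minimal sketch in plain JavaScript. It does not use the TDI Parser API: parseLine, the semicolon separator and the field names (uid, fullName, department) are hypothetical, and a plain object stands in for the work Entry.

```javascript
// Hypothetical stand-in for a Parser (not a TDI API): turns one line of a
// delimited text stream into an Entry-like object (attribute name -> value).
function parseLine(line, fieldNames, separator) {
  const values = line.split(separator);
  const entry = {};
  fieldNames.forEach(function (name, i) {
    // Fields missing from the line become null instead of undefined.
    entry[name] = i < values.length ? values[i].trim() : null;
  });
  return entry;
}

// Raw character stream, as a Connector might read it from a text file.
const rawStream = "jdoe;John Doe;Sales\nasmith;Anna Smith";

// One Entry-like object per line of the stream.
const entries = rawStream.split("\n").map(function (line) {
  return parseLine(line, ["uid", "fullName", "department"], ";");
});
```

The point of the sketch is the division of labour: the stream itself has no structure, and it is the parsing step that decides where one Attribute ends and the next begins.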
Once the first Connector has done its work, the bucket of information (the "work Entry", called, appropriately, work) is passed along the AssemblyLine to the next component; in the illustration, this is another Connector. Since the data from the first Connector is available, it can now be used as key information to retrieve, or look up, data in the second connected system. Once the relevant data is found, it can be merged into work, complementing the data that is still around from the first Connector.
Finally, the merged data is passed along the AssemblyLine to the third puzzle piece, a Connector that this time is in some output Mode, which takes care of outputting the data to the connected system. If the connected system is record-oriented, the various Attributes in work are simply mapped to columns in the record; if the connected system is stream-oriented, a Parser can do the necessary formatting.
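The read, lookup and output steps described above can be simulated in a few lines of plain JavaScript. This is only an illustrative sketch: hrRecords, phoneDirectory, target, lookupStep and outputStep are hypothetical names, plain objects and arrays stand in for connected systems and for the work Entry, and no TDI Connector API is used.

```javascript
// Plain-JavaScript simulation of the three-Connector flow (not TDI APIs).

// Source system: feeds the AssemblyLine one record per cycle.
const hrRecords = [
  { uid: "jdoe", fullName: "John Doe" },
  { uid: "asmith", fullName: "Anna Smith" }
];

// Second connected system, keyed by uid, used by the lookup step.
const phoneDirectory = {
  jdoe: { phone: "555-0100" }
};

// Target system: collects whatever the output step writes.
const target = [];

// Lookup step: uses data already in work as the key, then merges the result.
function lookupStep(work) {
  if (Object.prototype.hasOwnProperty.call(phoneDirectory, work.uid)) {
    const found = phoneDirectory[work.uid];
    Object.keys(found).forEach(function (name) {
      work[name] = found[name];
    });
  }
  return work;
}

// Output step: writes a copy of work to the target system.
function outputStep(work) {
  target.push(Object.assign({}, work));
}

// One cycle per source record, each with its own fresh work object.
hrRecords.forEach(function (record) {
  const work = Object.assign({}, record);
  lookupStep(work);
  outputStep(work);
});
```

In this simulation a record with no match in the second system simply passes through without the extra Attribute; how a real lookup should react to a missing match is a design decision the sketch does not address.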
Other components, like Script Components and Functions, can be inserted at will in the AssemblyLine to perform operations on the data in work.
It is important to keep in mind that the AssemblyLine is designed and optimized for working with one item at a time. If you want to process more than a single item at a time (for example, multiple updates or multiple deletes), you must write AssemblyLine scripts to do this. If necessary, this kind of processing can be implemented using JavaScript, Java libraries and standard IBM® Tivoli® Directory Integrator functionality, such as pooling the data to a sorted data store (for example, with the JDBC Connector) and then reading it back and processing it with a second AssemblyLine.
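The pooling approach can be sketched as two passes in plain JavaScript. This is an assumption-laden illustration, not TDI code: an in-memory array named store stands in for the sorted data store, and firstPassCycle, groupSorted and the group and uid fields are hypothetical.

```javascript
// Intermediate store; in a real solution this might be a database table.
const store = [];

// First pass: each cycle handles one item and pools it in the store.
function firstPassCycle(work) {
  store.push({ group: work.group, uid: work.uid });
}

[
  { uid: "a", group: "sales" },
  { uid: "b", group: "hr" },
  { uid: "c", group: "sales" }
].forEach(firstPassCycle);

// Sort by the grouping key so related items become adjacent.
store.sort(function (x, y) {
  return x.group < y.group ? -1 : x.group > y.group ? 1 : 0;
});

// Second pass: read the sorted data back and collect adjacent items with
// the same key into one batch, so they can be processed together.
function groupSorted(sorted) {
  const result = [];
  sorted.forEach(function (item) {
    const last = result[result.length - 1];
    if (last && last.group === item.group) {
      last.uids.push(item.uid);
    } else {
      result.push({ group: item.group, uids: [item.uid] });
    }
  });
  return result;
}

const batches = groupSorted(store);
```

Sorting first is what lets the second pass stay simple: it only ever compares an item with the batch immediately before it.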
AssemblyLines are built, configured and tested using the IBM Tivoli Directory Integrator Config Editor (CE); see The Configuration Editor for more information. The AssemblyLine has a Data Flow tab in the Config Editor, which is where the list of components that make up the AL is kept.
All components in an AL are automatically registered as script variables. So if you have a Connector called ReadHRdump then you can access it and its methods directly from script using the ReadHRdump variable. As a result, you will want to name your AL components as you would script variables: Use alphanumeric characters only, do not start the name with a number, and do not use special national characters (for example, å, ä), separators (apart from underscore '_'), white space, and so forth.
There is always an alternative method for accessing an AL component (for example, the task.getConnector() function) but a conscious naming convention is always advisable.
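The naming rules above are easy to check mechanically. The following helper is a hypothetical sketch, not part of TDI; it encodes the rules as a regular expression and additionally assumes that a leading underscore is acceptable, which the text does not state either way.

```javascript
// Hypothetical helper (not a TDI function): returns true if a component name
// follows the rules above: only unaccented letters, digits and underscore,
// and no leading digit.
function isScriptSafeName(name) {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(name);
}

const checks = {
  ReadHRdump: isScriptSafeName("ReadHRdump"),         // plain alphanumeric
  Read_HR_dump: isScriptSafeName("Read_HR_dump"),     // underscore allowed
  startsWithDigit: isScriptSafeName("2ndLookup"),     // leading digit
  hasWhiteSpace: isScriptSafeName("Read HR dump"),    // white space
  hasNationalChar: isScriptSafeName("Läs_HR"),        // national character
  hasSeparator: isScriptSafeName("Read-HR-dump")      // separator other than _
};
```

With these inputs, the first two names pass the check and the remaining four fail it.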
Starting an AssemblyLine in TDI is a fairly costly operation, as it involves the creation of a new Java thread and usually sets up connections to one or more data sources. Consider carefully if your solution design could be made to work with fewer, rather than more, distinct AssemblyLines, where each AssemblyLine does more work; for example, by using Branches or Switches to define multiple operations handled by a single AL. Note that each operation can still be implemented as a separate AssemblyLine, but these can be embedded "hot-and-ready" into a single AL that dispatches work to them by using the AL Connector or AL Function. This also allows you to leverage features like Global Connector Pools to manage resource usage and boost performance and scalability.
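The "hot-and-ready" dispatch idea can be illustrated with a plain JavaScript simulation. Nothing here is TDI API: startHandler, handlers, dispatch and the operation names are hypothetical, and the startCount counter merely stands in for the cost of starting an AssemblyLine.

```javascript
// Counts how many times the (simulated) costly start-up happens.
let startCount = 0;

// Hypothetical stand-in for starting an embedded AssemblyLine once.
function startHandler(name, processFn) {
  startCount += 1; // imagine thread creation and connection set-up here
  return { name: name, process: processFn };
}

// Handlers are started once, up front, and then kept ready for reuse.
const handlers = {
  "add": startHandler("add", function (work) {
    work.status = "added";
    return work;
  }),
  "remove": startHandler("remove", function (work) {
    work.status = "removed";
    return work;
  })
};

// The dispatching AL: picks a ready handler based on data in work.
function dispatch(work) {
  if (!Object.prototype.hasOwnProperty.call(handlers, work.operation)) {
    work.status = "unsupported";
    return work;
  }
  return handlers[work.operation].process(work);
}

const results = [
  { operation: "add", uid: "jdoe" },
  { operation: "remove", uid: "asmith" },
  { operation: "add", uid: "bkim" },
  { operation: "rename", uid: "clee" }
].map(dispatch);
```

Four items are dispatched, but the simulated start-up cost is paid only twice, once per handler, rather than once per item.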
AssemblyLines can include the following components:
Additionally, Connectors can have Parsers configured; also, at System, Config, AssemblyLine, Attribute map and Attribute level there are options to configure Null Behavior.
Each AL component is available as a pre-registered script variable with the name you chose for the component.
Note that you can dynamically load components with scripted calls to functions like system.getConnector(), although this is not for inexperienced users.