Shared containers

Shared containers help you to simplify your design and, unlike local containers, they can be reused by other jobs.

You can use shared containers to make common job components available throughout the project. You can create a shared container from a stage and its associated metadata, and add the shared container to the palette to make this pre-configured stage available to other jobs.

You can also insert a server shared container into a parallel job as a way of making server job functionality available. For example, you could use it to give the parallel job access to the functionality of a server transform function. (Note that you can only use server shared containers on SMP systems, not MPP or cluster systems.)

Shared containers comprise groups of stages and links and are stored in the Repository like IBM® InfoSphere® DataStage® jobs. When you insert a shared container into a job, InfoSphere DataStage places an instance of that container into the design. When you compile the job containing an instance of a shared container, the code for the container is included in the compiled job. You can use the InfoSphere DataStage debugger on instances of shared containers used within server jobs.
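
The relationship between a shared container and its instances is loosely analogous to a function that several programs compile in: one definition, many call sites, and each compiled job carries its own copy of the generated code. The Python sketch below is an analogy only, with invented names; it is not DataStage code.

    # Loose analogy, not DataStage code: one shared definition, reused by
    # several "jobs", each of which bundles the container when it is built.

    def shared_container(rows):
        """The single definition stored in the repository."""
        return [row.strip().lower() for row in rows]

    def job_one(rows):
        # An instance of the shared container inside this job's flow.
        return shared_container(rows)

    def job_two(rows):
        # A second job reuses the same definition; after the definition
        # changes, each job must be rebuilt to pick up the new behavior.
        return [row + "!" for row in shared_container(rows)]

    print(job_one([" Alpha ", " Beta "]))
    print(job_two([" Gamma "]))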

When you add an instance of a shared container to a job, you need to map metadata for the links into and out of the container, because this metadata can vary in each job in which you use the shared container. If you change the contents of a shared container, you must recompile the jobs that use the container for the changes to take effect. For parallel shared containers, you can take advantage of runtime column propagation to avoid the need to map the metadata: if you enable runtime column propagation, then, when the job runs, metadata is automatically propagated across the boundary between the shared container and the stages to which it connects in the job.
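
Runtime column propagation is an engine feature, but the idea behind it can be modeled simply: the container names only the columns it actually transforms, and every other column flows across the container boundary unchanged. The following Python sketch is a conceptual illustration with invented column names, not DataStage code.

    # Conceptual sketch of runtime column propagation, not DataStage code.
    def shared_container(rows):
        """Uppercase the 'name' column; propagate all other columns."""
        for row in rows:                       # row: dict of column -> value
            out = dict(row)                    # carry every incoming column
            out["name"] = out["name"].upper()  # the one column mapped here
            yield out

    # Two jobs feed the container different schemas; neither remaps
    # metadata, because unrecognized columns propagate at run time.
    job_a = [{"name": "ada", "dept": "eng"}]
    job_b = [{"name": "bob", "dept": "ops", "hired": 2021}]

    print(list(shared_container(job_a)))
    print(list(shared_container(job_b)))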

Note that there is nothing inherently parallel about a parallel shared container, although the stages within it have parallel capability. The stages themselves determine how the shared container code runs. Conversely, when you include a server shared container in a parallel job, the server stages have no parallel capability, but the entire container can operate in parallel because the parallel job can execute multiple instances of it.
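
That last point can be modeled outside DataStage: each container instance is sequential, but the surrounding engine partitions the data and runs one instance per partition. A minimal sketch, using Python's standard multiprocessing module to stand in for the parallel engine:

    # Conceptual sketch, not DataStage code: a sequential "server container"
    # made parallel by running one instance per data partition.
    from multiprocessing import Pool

    def server_container(partition):
        # Sequential logic, like a server shared container instance.
        return [value * 2 for value in partition]

    if __name__ == "__main__":
        data = list(range(12))
        # The "parallel engine": split the data into four partitions and
        # run an independent container instance on each one.
        partitions = [data[i::4] for i in range(4)]
        with Pool(processes=4) as pool:
            results = pool.map(server_container, partitions)
        print(results)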

You can create a shared container from scratch, or place a set of existing stages and links within a shared container.

Note: If you encounter a problem when running a parallel job that uses a server shared container, try increasing the value of the DSIPC_OPEN_TIMEOUT environment variable in the Parallel ► Operator specific category of the environment variable dialog box in the InfoSphere DataStage Administrator.
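
If you prefer to script the change rather than use the Administrator client, project-level environment variables can also be set with the dsadmin command on the engine tier. The invocation below is a sketch: the project name and timeout value are placeholders, connection options such as -domain, -user, -password, and -server may also be required, and you should confirm the exact syntax against the dsadmin reference for your release.

    dsadmin -envset DSIPC_OPEN_TIMEOUT -value 120 myproject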