Schema management

Use the Schema Library Manager to import schemas into the metadata repository and to organize them into libraries that can be shared across all DataStage projects.

Before you can use the Hierarchical Data stage to produce or consume data, you must import the XML schemas that describe the data into the metadata repository. To import the schemas, you use the Schema Library Manager, which is available from the IBM® InfoSphere® DataStage® and QualityStage® Designer via the menu choice Import > Schema Library Manager.

After you import a schema, you can browse the type structure that the schema defines. You use these types to define the processing in the Hierarchical Data stage. The schemas that you import are available for use with any data. To use a schema with specific data, you bind the physical location of the data to the schema. For example, to read an XML file, you provide a path to the XML file and select the schema for the data that the file contains. When you work with XML data, you perform the binding when you configure the XML Parser or XML Composer step in an assembly.

You can add related schemas to the same library. Schemas that are in the same library can refer to each other. For example, one XML schema can use an XML include element or an import element to refer to another schema that is in the same library. Schemas that are in one library cannot refer to external schemas or to schemas that are in another library.
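The rule that references must stay inside one library can be illustrated with a short sketch. This is not how the Schema Library Manager is implemented. It assumes that a library is modeled as a Python dictionary from a hypothetical file name to the schema text, and that references are matched by the schemaLocation attribute. The file names order.xsd and address.xsd and the function unresolved_references are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema language itself.
XS = "{http://www.w3.org/2001/XMLSchema}"


def unresolved_references(library):
    """Return sorted (file name, schemaLocation) pairs that point outside the library.

    `library` is a dict that maps a schema file name to its XSD text.
    Matching by schemaLocation is an assumption made for this sketch.
    """
    missing = []
    for name, text in library.items():
        root = ET.fromstring(text)
        for tag in ("include", "import"):
            for ref in root.iter(XS + tag):
                location = ref.get("schemaLocation")
                if location is not None and location not in library:
                    missing.append((name, location))
    return sorted(missing)


# Hypothetical schemas: order.xsd includes address.xsd.
ORDER_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="address.xsd"/>
  <xs:element name="order" type="xs:string"/>
</xs:schema>"""

ADDRESS_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Address">
    <xs:sequence><xs:element name="city" type="xs:string"/></xs:sequence>
  </xs:complexType>
</xs:schema>"""

complete_library = {"order.xsd": ORDER_XSD, "address.xsd": ADDRESS_XSD}
incomplete_library = {"order.xsd": ORDER_XSD}

print(unresolved_references(complete_library))    # []
print(unresolved_references(incomplete_library))  # [('order.xsd', 'address.xsd')]
```

In the second call, order.xsd refers to a schema that is not in the same library, which corresponds to the situation that the library rules do not allow.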

When you create a library, you specify a unique name for the library and an optional category name. Library names must be unique across all categories. Organizing libraries into categories makes it easier to locate a specific library later.
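The naming rule can be pictured with a small sketch: a library name is the key, and the category is only an optional label attached to it. The class LibraryRegistry and the library and category names are hypothetical, and the sketch assumes that names are compared exactly as typed, which the product might not do.

```python
class LibraryRegistry:
    """Toy model of library naming: names are unique across all categories."""

    def __init__(self):
        # Maps library name -> category (None when no category was given).
        self._category_by_name = {}

    def create(self, name, category=None):
        # The same name is rejected even if the category differs.
        if name in self._category_by_name:
            raise ValueError(f"library name already in use: {name!r}")
        self._category_by_name[name] = category

    def in_category(self, category):
        # Categories are only used to find libraries again.
        return sorted(
            name for name, cat in self._category_by_name.items() if cat == category
        )


registry = LibraryRegistry()
registry.create("Orders", category="Sales")
registry.create("Invoices", category="Sales")
registry.create("Scratch")  # the category is optional

try:
    registry.create("Orders", category="Finance")
except ValueError as err:
    print(err)  # library name already in use: 'Orders'

print(registry.in_category("Sales"))  # ['Invoices', 'Orders']
```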

After you add schemas to a library, the library is automatically validated to determine if the schemas contain any errors. If the validation is successful, the library contains all of the element and type definitions from the schemas. If the library is invalid, you are notified that there are errors. To view the list of errors, click Validate. Whenever you modify a schema, delete a schema from the library, or add a new schema to the library, the library is automatically re-validated. Schemas are used only at design time, not at runtime. Therefore, modifying a schema or deleting a schema from a library has no effect on existing jobs that use the schema.

If you modify a schema that is already being used by a job, the schema modifications are not automatically passed on to the job. If you want to apply a modified schema to a Hierarchical Data stage, you must edit the Hierarchical Data stage to retrieve the modified schema.

Within a library, you cannot repeat the same type definition; however, two different libraries can have the same type definition. Having the same type definition in two different libraries is useful if you want to have two versions of the same type definition.
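The following sketch illustrates the duplicate-type rule. It is not the validation that the Schema Library Manager performs. It assumes that a type is identified by its target namespace and name, and that only top-level complexType and simpleType definitions count. The function duplicate_types, the namespace urn:example:orders, and the file names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the XML Schema language itself.
XS = "{http://www.w3.org/2001/XMLSchema}"


def duplicate_types(library):
    """Return sorted (namespace, type name) pairs defined more than once in one library.

    `library` is a dict that maps a schema file name to its XSD text.
    """
    counts = Counter()
    for text in library.values():
        root = ET.fromstring(text)
        namespace = root.get("targetNamespace", "")
        for child in root:  # top-level (global) definitions only
            if child.tag in (XS + "complexType", XS + "simpleType"):
                counts[(namespace, child.get("name"))] += 1
    return sorted(key for key, count in counts.items() if count > 1)


# Two hypothetical versions of the same type definition.
ORDERS_V1 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:example:orders">
  <xs:simpleType name="OrderId">
    <xs:restriction base="xs:string"><xs:maxLength value="8"/></xs:restriction>
  </xs:simpleType>
</xs:schema>"""

ORDERS_V2 = ORDERS_V1.replace('value="8"', 'value="12"')

# One version per library: each library is checked on its own.
library_v1 = {"orders.xsd": ORDERS_V1}
library_v2 = {"orders.xsd": ORDERS_V2}

# Both versions in one library: the type definition is repeated.
mixed_library = {"orders_v1.xsd": ORDERS_V1, "orders_v2.xsd": ORDERS_V2}

print(duplicate_types(library_v1))     # []
print(duplicate_types(library_v2))     # []
print(duplicate_types(mixed_library))  # [('urn:example:orders', 'OrderId')]
```

Keeping each version in its own library avoids the conflict, which is the versioning use case described above.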