DB2 Version 10.1 for Linux, UNIX, and Windows

Default document models

When you use one of the default document models, all fields are indexed, no special information is extracted, and no numeric attributes are indexed.

For HTML, XML, and Outside In filtered documents, Net Search Extender provides default document models that are used if you do not specify a document model during index creation. For structured plain text documents, you must provide and specify a document model.

If you use one of the default document models:

- All fields are indexed, and no special information, such as meta information, is extracted.
  - For HTML and INSO formats, each field is assigned the name of the corresponding tag (HTML) or document property (INSO).
  - For XML, all nodes of an XML document are mapped to overlapping fields, which are identified by the fully qualified element paths of the corresponding nodes, for example /play/role/name.
- No numeric attribute is indexed, because no numeric attribute is defined in the default document model.
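
The XML mapping described above can be illustrated with a short Python sketch. It is not Net Search Extender code and does not reproduce its internals; the function name `fields_by_path` and the sample document are hypothetical. It only shows how every element ends up as a field named by its fully qualified path, and why such fields overlap: the text of a child element also belongs to each of its ancestors.

```python
import xml.etree.ElementTree as ET


def fields_by_path(xml_text):
    """Map each fully qualified element path to the text of the nodes at that path.

    Illustration only: a hypothetical helper, not part of Net Search Extender.
    """
    root = ET.fromstring(xml_text)
    fields = {}

    def walk(node, parent_path):
        path = f"{parent_path}/{node.tag}"
        # itertext() yields the text of the node and of all its descendants,
        # which is what makes parent and child fields overlap.
        text = " ".join(t.strip() for t in node.itertext() if t.strip())
        fields.setdefault(path, []).append(text)
        for child in node:
            walk(child, path)

    walk(root, "")
    return fields


sample = "<play><title>Hamlet</title><role><name>Ophelia</name></role></play>"
for path, texts in fields_by_path(sample).items():
    print(path, texts)
```

In this sketch the field `/play` contains the text of both `/play/title` and `/play/role/name`, so a search restricted to `/play` would also match text that belongs to the nested fields.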

Table 1. Behavior of the default document models for the supported document formats
Document type	Behavior of the default document model
HTML	Accepts these tags as text fields: <a> <address> <au> <author> <h1> <h2> <h3> <h4> <h5> <h6> <title>. The field name is the tag name, for example "address".
XML	Accepts all tags as text fields. The field name is the fully qualified element path name, for example "/play/title".
Structured plain text (GPP)	No default document model.
Outside In (INSO)	Accepts as text fields the document properties returned by the Outside In filters, as listed in "Definition of a document model for Outside In filtered documents". The field name is the name of the document property used by Outside In, for example "SCCCA_TITLE".
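
As a rough illustration of the HTML row in Table 1, the following Python sketch collects text into fields named after the accepted tags and, separately, keeps all document text, including text outside any field. It is a minimal sketch under stated assumptions, not the Net Search Extender parser: the names `FieldCollector` and `collect_fields` and the sample page are hypothetical, and the handling of nested tags is an assumption made for the example.

```python
from html.parser import HTMLParser

# Tags that the default HTML document model accepts as text fields (Table 1).
DEFAULT_HTML_FIELDS = {
    "a", "address", "au", "author",
    "h1", "h2", "h3", "h4", "h5", "h6", "title",
}


class FieldCollector(HTMLParser):
    """Hypothetical helper that groups text by the field tags enclosing it."""

    def __init__(self):
        super().__init__()
        self.open_fields = []  # field tags that are currently open
        self.fields = {}       # field name -> list of text fragments
        self.all_text = []     # every text fragment, inside a field or not

    def handle_starttag(self, tag, attrs):
        if tag in DEFAULT_HTML_FIELDS:
            self.open_fields.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open field with this name, if any.
        for i in range(len(self.open_fields) - 1, -1, -1):
            if self.open_fields[i] == tag:
                del self.open_fields[i]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.all_text.append(text)
        # Assumption: text inside nested field tags belongs to each open field.
        for tag in self.open_fields:
            self.fields.setdefault(tag, []).append(text)


def collect_fields(html_text):
    collector = FieldCollector()
    collector.feed(html_text)
    collector.close()
    return collector


sample_html = (
    "<html><head><title>Hamlet</title></head><body>"
    "<h1>Act I</h1><p>Enter <a href='#g'>Ghost</a>.</p>"
    "<address>Elsinore</address></body></html>"
)
result = collect_fields(sample_html)
print(result.fields)
print(result.all_text)
```

The text inside `<p>` that is not wrapped in an accepted tag appears only in `all_text`, which mirrors the distinction between field-restricted and unrestricted searches.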

A default document model is defined for each document type except structured plain text. Because each model is different, an example and an explanation are provided for each one in the following sections.

Note:

Although the default document models process documents correctly, you should define your own document models for better indexing and search.

With the default document model, the text of a document is fully indexed regardless of whether it is part of a text field. As a result, unrestricted text searches also cover text that lies outside any text field.