DB2 Version 10.1 for Linux, UNIX, and Windows

Default document models

When you use one of the default document models, remember that all fields are indexed, no special information is extracted, and no numeric attributes are indexed.

For HTML, XML, and Outside In filtered documents, Net Search Extender provides default document models that are used if you do not specify a document model during index creation. For structured plain text documents, you must provide and specify a document model.

If you use one of the default document models, the behavior depends on the document format, as shown in Table 1.

Table 1. Behavior of the default document models for the supported document formats

Document type | Behavior of the default document model
HTML | Accepts these tags as text fields: <a> <address> <au> <author> <h1> <h2> <h3> <h4> <h5> <h6> <title>. The field name is the tag name, for example "address".
XML | Accepts all tags as text fields. The field name is the fully qualified element path name, for example "/play/title".
Structured plain text (GPP) | No default document model.
Outside In (INSO) | Accepts as text fields the document properties returned by the Outside In filters, as listed in "Definition of a document model for Outside In filtered documents". The field name is the name of the document property used by Outside In, for example "SCCCA_TITLE".
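
The XML row of Table 1 says that the field name is the fully qualified element path name. The following Python sketch only illustrates that naming rule; it is not Net Search Extender code. The function name `text_fields` and the sample document are assumptions made for this example, and the sketch ignores namespaces, attributes, and text that follows a child element.

```python
import xml.etree.ElementTree as ET


def text_fields(xml_source):
    """Return (field_name, text) pairs, one per element that contains text.

    The field name is the fully qualified element path, such as "/play/title".
    This is a hypothetical helper, not part of Net Search Extender.
    """
    root = ET.fromstring(xml_source)
    fields = []

    def walk(element, parent_path):
        # Build the path by appending this element's tag to its parent's path.
        path = f"{parent_path}/{element.tag}"
        text = (element.text or "").strip()
        if text:
            fields.append((path, text))
        for child in element:
            walk(child, path)

    walk(root, "")
    return fields


# Hypothetical sample document.
sample = "<play><title>Hamlet</title><act><scene>Elsinore</scene></act></play>"
for name, text in text_fields(sample):
    print(name, "->", text)
```

For this sample, the sketch derives the field names "/play/title" and "/play/act/scene". Elements with the same tag under different parents therefore end up in different fields.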

A default document model is defined for each type of document. Because each model is different, an example and an explanation are provided for each one in the following sections.

Note:

Although the default document models process documents correctly, you should define your own document models for better indexing and search results.

With the default document model, the text of a document is fully indexed regardless of whether it is part of a text field. This means that a text search that is not restricted to a field also searches that text.
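
To make the difference between field text and fully indexed text concrete, here is a minimal Python sketch based on the HTML tag list in Table 1. It is a toy model, not how Net Search Extender is implemented. The class name `DefaultModelSketch`, its attributes, and the sample page are assumptions made for this example.

```python
from html.parser import HTMLParser

# Tags that Table 1 lists as text fields for the default HTML document model.
FIELD_TAGS = {"a", "address", "au", "author",
              "h1", "h2", "h3", "h4", "h5", "h6", "title"}


class DefaultModelSketch(HTMLParser):
    """Toy model (hypothetical): collects field text and all document text."""

    def __init__(self):
        super().__init__()
        self.open_fields = []  # stack of field tags that are currently open
        self.fields = []       # (field_name, text) pairs; field name = tag name
        self.all_text = []     # every text run, whether inside a field or not

    def handle_starttag(self, tag, attrs):
        if tag in FIELD_TAGS:
            self.open_fields.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_fields:
            # Pop back to, and including, the matching open field tag.
            while self.open_fields and self.open_fields.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.all_text.append(text)
        if self.open_fields:
            self.fields.append((self.open_fields[-1], text))


# Hypothetical sample page.
page = ("<html><head><title>Hamlet</title></head>"
        "<body><h1>Act I</h1><p>Enter two sentinels</p></body></html>")
parser = DefaultModelSketch()
parser.feed(page)
parser.close()
print(parser.fields)
print(parser.all_text)
```

In this sketch, the paragraph text does not belong to any field, but it is still collected in `all_text`. This mirrors the statement above: a search restricted to the "title" or "h1" field would not see that text, while an unrestricted search would.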