Searching SharePoint documents (FileNet P8)
When you use IBM® eDiscovery Manager to search Microsoft SharePoint items that are archived by IBM Content Collector, Version 3.0 or later, refer to these search configuration options for IBM FileNet® P8.
IBM Content Collector creates a default document class named ICCSharepointInstance2 with the following properties:
- DocumentTitle
- ICCCreatedBy
- ICCCreatedDate
- ICCExpirationDate
- ICCFileName
- ICCFilePath
- ICCFolderPath
- ICCLastModifiedDate
- ICCLibrary
- ICCModifiedBy
- ICCSharePointGUID
- ICCSharePointVersion
- ICCSite
To use multi-part content that IBM Content Collector, Version 3.0 or later, archived from SharePoint into an IBM FileNet P8 repository, create a new IBM eDiscovery Manager collection with the collection type "Microsoft SharePoint - Content Collector". This collection type provides a starting point that you can modify and extend to create richer field definitions. Initially, the collection has the following field definitions:
| Collection field | Content server property | Type | Text index |
|---|---|---|---|
| EXTERNAL_ID | Id | String | |
| CREATED_DATE | ICCCreatedDate | Date | |
| MODIFIED_DATE | ICCLastModifiedDate | Date | |
| | ICCExpirationDate | String | |
| LIBRARY | ICCLibrary | String | |
| SITE | ICCSite | String | |
| | ICCSharePointVersion | String | |
| FOLDER_PATH | ICCFolderPath | String | |
| FILE_NAME | ICCFileName | String | |
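The initial mapping can be pictured as a lookup table from collection fields to content server properties. The following Python sketch is only an illustration of that idea and is not part of IBM eDiscovery Manager: the names `FIELD_MAP` and `to_collection_fields` are hypothetical, the rows whose collection field name is missing from the table are left out, and date values are assumed to arrive as ISO 8601 strings.

```python
from datetime import datetime

# Rows copied from the table above:
# collection field -> (content server property, type).
# Rows without a collection field name in the table are omitted.
FIELD_MAP = {
    "EXTERNAL_ID": ("Id", "String"),
    "CREATED_DATE": ("ICCCreatedDate", "Date"),
    "MODIFIED_DATE": ("ICCLastModifiedDate", "Date"),
    "LIBRARY": ("ICCLibrary", "String"),
    "SITE": ("ICCSite", "String"),
    "FOLDER_PATH": ("ICCFolderPath", "String"),
    "FILE_NAME": ("ICCFileName", "String"),
}


def to_collection_fields(properties):
    """Translate a dict of content server properties into collection fields.

    Hypothetical helper: properties that are absent are skipped, and
    Date values given as strings are parsed as ISO 8601 (an assumption).
    """
    result = {}
    for field, (prop, kind) in FIELD_MAP.items():
        if prop not in properties:
            continue
        value = properties[prop]
        if kind == "Date" and isinstance(value, str):
            value = datetime.fromisoformat(value)
        result[field] = value
    return result
```

For example, calling `to_collection_fields({"ICCSite": "hr"})` yields a dictionary with the single key `SITE`.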
Delete the definition for CONTENT and then add the following definitions to get to the full, new field definitions:
| Collection field | Type | Text index | Description |
|---|---|---|---|
| CONTENT | String | //icc_content | Matches all of the content, including attachments of a SharePoint item. |
| RAW_CONTENT | String | $FULL_TEXT$ | Matches all of the content and XML tags. |
| DOCUMENT | String | //icc_main | Matches primary file content only, whether file or HTML rendering. |
| | String | | Matches primary file, file name only. |
| ATTACHMENT | String | | Matches attachment content only. |
| | String | | Matches attachment file name only. |
XPath syntax supported in field mappings
- It does not support iteration and ranges in path expressions.
- It eliminates filter expressions: that is, it allows filtering only in the predicate expression, not in the path expression.
- It does not allow absolute path names in predicate expressions.
- It implements only one axis (tag) and allows propagation only in the forward direction.
- /*
- //*
- /@*
- //@*
Disregarding of XML namespaces
Namespace prefixes are not retained in the indexing of XML tag and attribute names. You can search XML documents by using namespaces, but namespace prefixes are discarded during indexing and removed from XML search queries.
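The effect of discarding namespace prefixes can be sketched in a few lines of Python. This is an illustration only and not the FileNet P8 implementation: the function name `strip_namespace_prefixes` is hypothetical, prefixes are assumed to be simple names followed by a single colon, and quoted string literals are deliberately left untouched.

```python
import re

# A namespace prefix such as "ns:" directly before an element or
# attribute name. A double colon is not treated as a prefix.
_PREFIX = re.compile(r"\b[A-Za-z_][\w.-]*:(?!:)(?=[A-Za-z_])")

# Single- or double-quoted string literals, captured so that
# re.split keeps them in the result.
_QUOTED = re.compile(r"""("[^"]*"|'[^']*')""")


def strip_namespace_prefixes(path):
    """Return the path expression with namespace prefixes removed."""
    parts = _QUOTED.split(path)
    # Even-indexed parts lie outside quotes; odd-indexed parts are
    # the quoted literals and are copied unchanged.
    return "".join(
        part if i % 2 else _PREFIX.sub("", part)
        for i, part in enumerate(parts)
    )
```

Under these assumptions, a query path written as `//dc:title` and one written as `//title` reduce to the same expression, which is why both find the same indexed elements.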
Numeric values
Predicates that compare attribute values to numbers are supported.
Complete match
The operator = (equal sign) with a string argument in a predicate means that a complete match of all tokens in the string with all tokens in the identified text span is required. The order of the tokens is important.
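The complete-match rule can be modeled as a comparison of token sequences. The following Python sketch is a simplified model and not the actual index tokenizer: the functions `tokenize` and `complete_match` are hypothetical, tokens are assumed to be runs of word characters, and case handling is left out.

```python
import re


def tokenize(text):
    # Assumption: a token is a run of word characters. The real
    # tokenizer of the text index may split text differently.
    return re.findall(r"\w+", text)


def complete_match(argument, span_text):
    """Return True only if both texts have the same tokens in the same order."""
    return tokenize(argument) == tokenize(span_text)
```

In this model, `complete_match("quarterly report", "quarterly report")` is true, whereas a reordered argument (`"report quarterly"`) or a partial argument (`"quarterly"`) does not match the span `"quarterly report"`.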
For more details on the XML search syntax, see the "XML Search" section of the FileNet P8 topic "SQL Syntax Reference".