Identity management component

The management of multiple user credentials is a common problem for an enterprise. Watson Explorer Content Analytics solves the problem by providing an optional identity management component.

The information found in an enterprise can exist in many shapes and forms. It can be distributed throughout the enterprise and managed by the most appropriate software for the task at hand. For example, enterprise users might use an SQL application to access relational databases or a document management system to access documents relevant to their work.

Controlling access to sensitive information in these repositories is typically enforced by the managing software. Users identify themselves to the host system through a user ID and password combination. After being authenticated by the system, the managing software controls which documents the user is allowed to see and act upon based on the user's defined access rights.

It is common for users to have a different user ID and password for each repository. Just as users must identify themselves to the original enterprise repositories, they must provide credentials before they can view documents in a collection that requires current credentials to be validated. Users who have multiple identities must present the corresponding credentials for each identity.

If you enable the identity management component in the administration console, the search servers can use the following approaches to validate a user's current credentials during query processing:
  • The enterprise search application can prompt the user to register the credentials that they need to access various domains in a user profile. The profile, which is encrypted and stored in a secure data store, enables the user to search the secure domains. If credentials are not specified for a domain that requires current credentials to be validated, documents from that domain are excluded from the search results.
  • If documents in a collection were crawled by a crawler that provides support for single sign-on (SSO) security, and you specify that you want to use SSO security to control access to documents, the system will use SSO security methods to authenticate users for the duration of a search session. The user does not need to create a profile on the My Profile page that specifies credentials or provide a user ID and password when searching secure domains.
  • If you use the Search portlet in WebSphere® Portal, secure search of sources through the portlet is supported only from the Search portlet. With portlet-related secure search, the user does not need to create a profile on the My Profile page that specifies credentials.

When users query collections that require current credentials to be validated when a query is submitted, the system can use the profile or SSO security methods to deny or permit access to documents.
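
The exclusion rule for profile-based validation can be pictured with a minimal sketch. It is not the product's implementation: the class `DomainFilterSketch`, its `Result` type, and the two domain sets are hypothetical names introduced only for illustration. It assumes that each result carries the name of the domain it was crawled from, and that a result is dropped only when its domain requires validation and the user profile holds no credentials for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; not part of the Watson Explorer Content Analytics API. */
final class DomainFilterSketch {

    /** Minimal stand-in for a search result: a title and the domain it came from. */
    static final class Result {
        final String title;
        final String domain;

        Result(String title, String domain) {
            this.title = title;
            this.domain = domain;
        }
    }

    /**
     * Keeps a result if its domain does not require credential validation,
     * or if the user's profile contains credentials for that domain.
     */
    static List<Result> visibleResults(List<Result> results,
                                       Set<String> securedDomains,
                                       Set<String> profileDomains) {
        List<Result> visible = new ArrayList<>();
        for (Result r : results) {
            boolean secured = securedDomains.contains(r.domain);
            if (!secured || profileDomains.contains(r.domain)) {
                visible.add(r);
            }
        }
        return visible;
    }
}
```

In this sketch, a user whose profile covers only one of two secured domains still sees results from that domain and from any unsecured domain, while results from the other secured domain are omitted.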

Obtaining the user's group information

To validate a user's credentials, the identity management component must obtain the user's group information for each of the user's identities and add this information to a user security context (USC) string. This group information is used to filter results in accordance with access control data that is stored in the index or in accordance with SSO authentication data. The identity management component does this by using SSO tokens or by using the user's credentials to connect to the back-end system and request the groups that the user is a member of.
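
As a rough illustration of group-based filtering, the following sketch treats a document as visible when at least one of the user's groups appears in the document's access control list. The class `GroupAclSketch` and the matching rule are assumptions made for the example; the actual comparison that the search servers perform against the index is not described here.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical illustration of ACL matching; not the product's algorithm. */
final class GroupAclSketch {

    /**
     * Returns true when the user belongs to at least one group listed in the
     * document's ACL. A document with an empty ACL is treated as not visible
     * in this sketch.
     */
    static boolean isVisible(Set<String> documentAcl, Set<String> userGroups) {
        return !Collections.disjoint(documentAcl, userGroups);
    }
}
```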

When you configure identity management options in the administration console, you can specify how often this group information is to be refreshed. You can extract new group data each time that the user logs in to the enterprise search application, or you can extract the group data on a regular basis, such as every three days.
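
The two refresh options can be modeled as a single interval check. The sketch below is only an illustration under stated assumptions: `GroupRefreshSketch` and `needsRefresh` are hypothetical names, and a zero interval is used here to stand for "extract new group data at every login".

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of a group-refresh decision; names are illustrative only. */
final class GroupRefreshSketch {

    /**
     * Returns true when cached group data should be extracted again.
     * A null timestamp means no data was extracted yet; a zero interval
     * models a refresh at every login.
     */
    static boolean needsRefresh(Instant lastExtracted, Instant now, Duration interval) {
        if (lastExtracted == null || interval.isZero()) {
            return true;
        }
        // Refresh once the configured interval has fully elapsed.
        return !now.isBefore(lastExtracted.plus(interval));
    }
}
```

With a three-day interval, group data extracted two days ago is reused, and data extracted three or more days ago is extracted again.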

Security without the identity management component

Not all enterprises want to manage the multiple identities of their user communities with the identity management component. If you disable the identity management component in the administration console, then it is the responsibility of your search application to generate the user security context string. After it is generated, the USC string is used to set the ACL constraints value on each query. For example:
// Create the query, then set the user security context (USC) XML string as its ACL constraints.
Query q = factory.createQuery("IBM");
q.setACLConstraints("User's Security Context in XML");
Tip: To help you write your own identity management functionality, Watson Explorer Content Analytics provides an API that gives you programmatic control over the identity management database. This API allows you to generate the USC with Java objects, and the XML string is then automatically built.
The XML query string must be of the following form, where the text between the single quotation marks is the fully formed XML string:
@SecurityContext::'...'
The format of the XML string is as follows:
<identities id="login_UserName">
  <ssoToken>token_value</ssoToken>
  <identity id="security_domain">
    <type>Notes</type>
    <username>domain_userName</username>
    <password encrypt="no">domain_userPW</password>
    <groups>
      <group id="g1" />
      <group id="g2" />
    </groups>
    <properties>
      <property name="property_name">property_value</property>
      ...
    </properties>
  </identity>
  ...
</identities>
identities
The value of the id attribute is the user ID that the user provides when logging in to the system.
ssoToken
Optional: Specifies the Lightweight Third-Party Authentication (LTPA) token that is created for the user for the duration of the browser session. This parameter is used only if the target domain is enabled for SSO and the crawler is configured to use SSO security.
identity
Contains the user's credentials for a particular data source. The value of the id attribute is the domain that stores the user's credential information (in the case of Domino®, this is the Domino domain name).
type
Identifies the type of data and corresponds to the crawler type (Notes, DB2, Exchange Server, and so on).
username
Specifies the user name that is to be used to search the domain.
password
Specifies the password for the specified user name. The encrypt attribute must be set to no (the system does not provide an encryption method outside of the identity management component).
groups
Specifies the group names that the user belongs to. A separate group element is used for each group name.
properties
Specifies a list of connection-specific properties, such as the administrator ID and encrypted password that were used to create the crawler, or whether SSO is enabled for the source.
property_name
The name of the property.
property_value
The value of the property.
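
If your search application generates the USC string itself, one possible approach is to assemble the XML with plain string handling. The following is a minimal sketch and not the identity management API that the product provides: the class `UscXmlSketch` and its methods are hypothetical, the optional ssoToken element and the properties element are omitted, and it is an assumption that compact XML without line breaks is accepted. Values are escaped so that reserved XML characters and apostrophes cannot break the markup or the surrounding single quotation marks.

```java
import java.util.List;

/** Hypothetical helper; not part of the Watson Explorer Content Analytics API. */
final class UscXmlSketch {

    /** Escapes the five reserved XML characters in text and attribute values. */
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a USC string that contains a single identity element. */
    static String buildUsc(String loginUser, String domain, String type,
                           String domainUser, String domainPassword,
                           List<String> groups) {
        StringBuilder xml = new StringBuilder();
        xml.append("<identities id=\"").append(escapeXml(loginUser)).append("\">");
        xml.append("<identity id=\"").append(escapeXml(domain)).append("\">");
        xml.append("<type>").append(escapeXml(type)).append("</type>");
        xml.append("<username>").append(escapeXml(domainUser)).append("</username>");
        // The encrypt attribute is set to "no", as the format description requires.
        xml.append("<password encrypt=\"no\">").append(escapeXml(domainPassword)).append("</password>");
        xml.append("<groups>");
        for (String g : groups) {
            xml.append("<group id=\"").append(escapeXml(g)).append("\" />");
        }
        xml.append("</groups>");
        xml.append("</identity>");
        xml.append("</identities>");
        return xml.toString();
    }

    /** Wraps a USC string in the @SecurityContext::'...' query form. */
    static String toQueryTerm(String usc) {
        return "@SecurityContext::'" + usc + "'";
    }
}
```

For example, calling `UscXmlSketch.buildUsc` with a login name, a domain, the type `Notes`, the domain credentials, and a list of group names returns a string that can be passed to `setACLConstraints` or wrapped with `toQueryTerm`.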