Understanding the footprint of IBM Content Collector for Microsoft SharePoint

Prepare for a successful Content Collector deployment

Brent J. Benton (bbenton@ca.ibm.com), Software Engineer, IBM
Brent Benton is an engineer in the Software Group at IBM. He has worked in software development and ECM for eleven years for Yaletown Technology Group, FileNet, and IBM. His contributions to ECM and IBM include numerous ECM integration and content migration tools and products, including the design and development of a previous generation of what today is known as Content Collector for File Systems. For the past four years Brent has worked on what today is Content Collector for Microsoft SharePoint. Currently Brent is the development lead for social media connectors for IBM Content Collector.

Summary:  Understanding what components IBM® Content Collector for Microsoft SharePoint installs, what SharePoint integration options it uses, and how it processes data is invaluable in helping you to assess its impact on your enterprise system and complete a successful installation and implementation. In this article, get an introduction to the components of Content Collector, the installation process, and the tasks that take place during processing.

Date:  03 Mar 2011
Level:  Intermediate

Overview

The IBM Content Collector for Microsoft SharePoint product provides collection and archiving of SharePoint content and extends capabilities of SharePoint to leverage IBM Enterprise Content Management (ECM) products. It installs software on Microsoft SharePoint Web Front-end (WFE) servers as well as Content Collector servers.

This article describes the product components that are installed within SharePoint, details the software installation, and discusses what happens during data processing. The intended audience is SharePoint administrators, Content Collector and ECM administrators, and anyone involved in assessing the environment impact of IBM Content Collector for Microsoft SharePoint.

Components of a Content Collector for Microsoft SharePoint implementation

Figure 1 provides an overview of the major components involved in a Content Collector for Microsoft SharePoint implementation, showing their locations and communication. All cross-server communication shown is over HTTP.


Figure 1. Major components of IBM Content Collector for Microsoft SharePoint 2.2
The figure shows the SharePoint server hosting SharePoint, which contains the Content Collector Solution. The solution consists of the Content Collector Feature, the Content Collector SharePoint Web service, and the Content Collector Link Handler. On the Content Collector server, the SharePoint Connector contains SharePoint Discovery and the SharePoint Connector Service.

This article covers only the Content Collector components that exist on, or communicate with, a SharePoint server.

Components on SharePoint WFE Servers

A single SharePoint installation deploys the following components to SharePoint WFE servers:

  • ICCSPFeature

    This SharePoint Feature adds one content type and three site columns to a SharePoint site collection. A feature is used instead of programmatic object creation in order to support localization. By design, the feature is not activated during installation, but is activated automatically during configuration or processing. This approach ensures that the additional content type and columns are created only where needed.

  • ICCSPWebService

    This is a web service deployed globally in SharePoint. The development team chose to deploy its own web service on the SharePoint server, built on the SharePoint Object Model API, for two reasons. The first is performance. The second is that the SharePoint Web Service API exposes only a subset of the SharePoint Object Model API, so building on the object model avoids cases where an operation cannot be performed because of the API choice.

    The web service offers these methods:

    • DeleteDocument
    • GetBlogPostComments
    • GetDocumentContent
    • GetDocumentMetadata
    • GetFilesForFolder
    • GetListRelativeUrl
    • GetUrlsToWalk
    • GetWikiPageContent
    • HasDocumentBeenModifiedSince
    • LockdownAndMarkDocument
    • MarkDocument
    • ReplaceDocumentWithLink
    • TestWebService
    • UpdateDocumentUrl
  • ICCSPLinkHandler

    This web page redirects shortened URLs to their full-length targets. Shortening is necessary because the URL site column in SharePoint is limited to 260 characters.
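
To make the purpose of the Link Handler concrete, here is a minimal Python sketch of the shortening idea. The token scheme, the in-memory dictionary, the `?id=` query format, and the example host names are all assumptions made for illustration; the article does not describe how the product encodes or stores its links.

```python
import hashlib

URL_COLUMN_LIMIT = 260  # limit of the SharePoint URL site column, per the article


class LinkStore:
    """Hypothetical in-memory store mapping short tokens to full URLs."""

    def __init__(self, handler_url):
        self.handler_url = handler_url
        self._targets = {}

    def shorten(self, full_url):
        # Derive a stable token from the full URL (illustrative scheme only).
        token = hashlib.sha256(full_url.encode("utf-8")).hexdigest()[:16]
        self._targets[token] = full_url
        short_url = f"{self.handler_url}?id={token}"
        if len(short_url) > URL_COLUMN_LIMIT:
            raise ValueError("handler URL itself exceeds the column limit")
        return short_url

    def resolve(self, short_url):
        # What a redirect page would do: look up the token, return the target.
        token = short_url.rsplit("?id=", 1)[1]
        return self._targets[token]


# Assumed host names; only the page name ICCSPLinkHandler.aspx comes from the article.
store = LinkStore("http://sharepoint.example.com/_layouts/ICCSPLinkHandler.aspx")
long_url = "http://archive.example.com/retrieve?" + "x" * 400
short = store.shorten(long_url)
print(len(long_url), len(short))
```

The point of the sketch is only that the value stored in the SharePoint column stays under the limit regardless of how long the real retrieval URL is.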

Content Collector server

  • SharePoint Connector

    This is the actual connector, which communicates with the Content Collector SharePoint web service to perform its activities. The SharePoint Discovery component within the SharePoint Connector is responsible for communication with SharePoint. The SharePoint Connector Service is responsible for communication with the task route engine of Content Collector.

  • Configuration Manager

    The administration application for Content Collector enables the configuration of all connectors, metadata, and task routes. To connect to the SharePoint server, the Configuration Manager uses the SharePoint web service API where possible and, where that is not possible, the Content Collector SharePoint web service (which in turn uses the SharePoint Object Model API). Specifically, it:

    • validates credentials for site and web service
    • retrieves a list of Libraries and supported Lists in a site
    • retrieves a list of Content Types in a site
    • retrieves a list of Columns in a site or library
  • Content Collector Web Services

    The Content Collector Web Services provide a variety of functions for the different Content Collector connectors. The primary purpose in a SharePoint connector implementation is to provide transparent content retrieval. In other words, when a user clicks a link document in SharePoint, after the Link Handler has verified the user's access, the content is retrieved from the appropriate ECM repository through the Content Collector Web Services.


Installation

The following sections provide detailed information about running the IBM Content Collector for Microsoft SharePoint installer on a SharePoint WFE server.

Prerequisites

The installer inspects the Windows registry for a key for either SharePoint 2007 or SharePoint 2010. If neither key is found, the installer informs the user and aborts the installation.

There are no other prerequisites.
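
The prerequisite check can be sketched as follows. This is an illustration only: the article does not name the registry keys the installer reads, so the two key paths below are assumptions, and the `key_exists` callback is a hypothetical stand-in for a real registry lookup.

```python
# ASSUMED registry locations; the article does not say which keys are inspected.
ASSUMED_KEYS = {
    "2007": r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0",
    "2010": r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0",
}


def detect_sharepoint(key_exists):
    """Return the detected SharePoint version, or None if neither key is present.

    key_exists is any callable that takes a key path and returns True or False.
    """
    for version, key in ASSUMED_KEYS.items():
        if key_exists(key):
            return version
    return None


# Simulate a server where only the SharePoint 2010 key is present.
present = {ASSUMED_KEYS["2010"]}
version = detect_sharepoint(lambda key: key in present)
if version is None:
    print("SharePoint not found; aborting installation")
else:
    print(f"SharePoint {version} detected")
```

On a real Windows server, the callback could wrap `winreg.OpenKey` from the Python standard library and treat an `OSError` as "key not present".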

Actions

The installer performs the following actions:

  1. Copies files to install destination folder*:
    1. ICCSPWebService.wsp - solution file
    2. stsadm.cmd – console command script for solution add/deploy/retract/remove. For SharePoint 2007 it runs the console commands itself; for SharePoint 2010 it invokes ICCSPWebService.ps1.
    3. ICCSPWebService.ps1 – PowerShell script for solution add/deploy/retract/remove for SharePoint 2010
    4. Installer files – the following folders and files are created by the InstallAnywhere installer:
      1. jre – folder for Java runtime
      2. license – folder for translated license files
      3. Uninstall_IBM Content Collector for Microsoft SharePoint – uninstaller folder
      4. IBM_Content_Collector_for_Microsoft_SharePoint_InstallLog – install log file

    *The default destination folder is:
    C:\Program Files\IBM\Content Collector for Microsoft SharePoint
    This can be changed during installation.

  2. Runs stsadm.cmd to deploy ICCSPWebService.wsp solution file.
    The solution contains:
    1. Manifest.xml – solution manifest
    2. Web service files:
      1. ICCSPWebService.asmx
      2. ICCSPWebServicedisco.aspx
      3. ICCSPWebServicewsdl.aspx
      4. ICCSPWebService.dll
    3. Feature files:
      1. ContentType.xml
      2. Feature.xml
      3. SiteColumns.xml
      4. 22 language resource files, covering 21 languages
    4. Link Handler files:
      1. ICCSPLinkHandler.aspx
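
The article does not show the contents of stsadm.cmd. As a rough guide to what a solution add/deploy/retract/remove cycle typically looks like on SharePoint 2007, the following Python sketch builds the corresponding stsadm command lines. The chosen switches and the `solution_commands` helper are assumptions for illustration, not the product's actual script.

```python
def solution_commands(action, wsp="ICCSPWebService.wsp", stsadm="stsadm"):
    """Build stsadm argument lists for installing or uninstalling a solution.

    Assumed sequence; the real stsadm.cmd may use different switches.
    """
    if action == "install":
        return [
            # Add the solution package to the farm's solution store.
            [stsadm, "-o", "addsolution", "-filename", wsp],
            # Deploy it; the package contains a GAC assembly, so allow that.
            [stsadm, "-o", "deploysolution", "-name", wsp,
             "-immediate", "-allowgacdeployment"],
            # Run pending administrative timer jobs right away.
            [stsadm, "-o", "execadmsvcjobs"],
        ]
    if action == "uninstall":
        return [
            [stsadm, "-o", "retractsolution", "-name", wsp, "-immediate"],
            [stsadm, "-o", "execadmsvcjobs"],
            [stsadm, "-o", "deletesolution", "-name", wsp],
        ]
    raise ValueError(f"unknown action: {action}")


for command in solution_commands("install"):
    print(" ".join(command))
```

On SharePoint 2010, the equivalent PowerShell cmdlets are Add-SPSolution, Install-SPSolution, Uninstall-SPSolution, and Remove-SPSolution; whether ICCSPWebService.ps1 uses exactly these is not stated in the article.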

Farm deployment

The ICCSPWebService.wsp solution is farm-deployment friendly. In other words, you only need to install it on a single SharePoint WFE server, and SharePoint automatically deploys it to the rest of the farm.

File locations

"HIVE" is used below as a short-form of the SharePoint hive location, as follows:

  • For SharePoint 2007, the location is:
    C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12
  • For SharePoint 2010, the location is:
    C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14

After deployment, SharePoint will have placed the solution files as follows:

  • Web Service
    ICCSPWebService.dll is deployed to the Global Assembly Cache. All other files are deployed to:
    HIVE\ISAPI
  • Feature
    All feature files are deployed to:
    HIVE\TEMPLATE\FEATURES\ICCSPFeature
  • Link Handler
    Link Handler page is deployed to:
    HIVE\TEMPLATE\LAYOUTS
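
When verifying a deployment, it can help to expand HIVE into concrete folders. The following Python sketch does that with the paths given above; the `deployed_locations` helper and its dictionary keys are names invented for this example.

```python
from pathlib import PureWindowsPath

# Hive roots as listed in the article, keyed by SharePoint version.
HIVE_ROOTS = {
    "2007": PureWindowsPath(
        r"C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12"
    ),
    "2010": PureWindowsPath(
        r"C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14"
    ),
}


def deployed_locations(version):
    """Return the folders that hold the solution files after deployment.

    ICCSPWebService.dll is not included: it goes to the Global Assembly Cache.
    """
    hive = HIVE_ROOTS[version]
    return {
        "web_service": hive / "ISAPI",
        "feature": hive / "TEMPLATE" / "FEATURES" / "ICCSPFeature",
        "link_handler": hive / "TEMPLATE" / "LAYOUTS",
    }


for name, folder in deployed_locations("2010").items():
    print(f"{name}: {folder}")
```

`PureWindowsPath` is used so the sketch formats Windows paths correctly even when run on another operating system.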

Permissions

The user logged on to the SharePoint WFE server to perform the installation must have appropriate permissions (for example, copy files to disk, install and run applications, deploy solutions) to perform all the above actions.

When you configure SharePoint connections on the IBM Content Collector server, the credentials you provide must belong to a user who belongs to the Site Collection Administrators group.

Load balancing

The configuration of the load balancer does not occur during installation on the WFE server, but during installation on the IBM Content Collector server.


Processing

Web service breakdown

During processing, the SharePoint connector on the IBM Content Collector server will make web service calls while performing tasks. Here is a breakdown of which web service method calls each task may invoke.


Table 1. Web service method usage

| Task or activity | Web service methods |
|---|---|
| SP Collector | GetBlogPostComments, GetDocumentContent, GetDocumentMetadata, GetFilesForFolder, GetUrlsToWalk, GetWikiPageContent |
| SP Get Versions | GetBlogPostComments, GetDocumentContent, GetDocumentMetadata, GetWikiPageContent |
| SP Create File | No web service methods are called. Files are downloaded directly from SharePoint. |
| SP Post-processing | DeleteDocument, HasDocumentBeenModifiedSince, LockdownAndMarkDocument, MarkDocument, ReplaceDocumentWithLink |
| SP Manage Link | UpdateDocumentUrl |
| Validate button of either the Initial Configuration or the Connection configuration dialog | TestWebService |
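
For impact analysis it is sometimes more convenient to query Table 1 in the other direction, from a web service method to the tasks that may call it. The following Python sketch transcribes the table into a dictionary; the `tasks_calling` helper and the shortened "Validate button" label are names chosen for this example.

```python
# Task-to-method mapping transcribed from Table 1.
TASK_METHODS = {
    "SP Collector": [
        "GetBlogPostComments", "GetDocumentContent", "GetDocumentMetadata",
        "GetFilesForFolder", "GetUrlsToWalk", "GetWikiPageContent",
    ],
    "SP Get Versions": [
        "GetBlogPostComments", "GetDocumentContent", "GetDocumentMetadata",
        "GetWikiPageContent",
    ],
    "SP Create File": [],  # downloads files directly, no web service calls
    "SP Post-processing": [
        "DeleteDocument", "HasDocumentBeenModifiedSince",
        "LockdownAndMarkDocument", "MarkDocument", "ReplaceDocumentWithLink",
    ],
    "SP Manage Link": ["UpdateDocumentUrl"],
    "Validate button": ["TestWebService"],
}


def tasks_calling(method):
    """Return the tasks that may invoke the given web service method."""
    return sorted(
        task for task, methods in TASK_METHODS.items() if method in methods
    )


print(tasks_calling("GetDocumentContent"))
print(tasks_calling("DeleteDocument"))
```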

Post-processing

The SP Post-processing task has four options. Each performs different operations, so the options differ noticeably in processing cost. Table 2 lists them in order of typical processing time, fastest first.


Table 2. Post-processing options and their actions

| Option | Actions |
|---|---|
| Mark as processed | Tags an item as processed. |
| Delete | Removes an item. |
| Lock down | Tags an item as processed. Makes permission changes to an item. |
| Replace with link | Creates a link document. Mirrors metadata and permission grantees from original item to link document. Removes original item. |
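
One way to use Table 2 when planning a task route is to pick the fastest option that still has the effects you need. The Python sketch below encodes the table in its published order; the effect labels (`tag`, `remove`, and so on) and the `fastest_option` helper are shorthand invented for this example, not product terminology.

```python
# Options in Table 2 order (typically fastest first), with their effects.
OPTIONS = [
    ("Mark as processed", {"tag"}),
    ("Delete", {"remove"}),
    ("Lock down", {"tag", "permissions"}),
    ("Replace with link", {"link", "mirror", "remove"}),
]


def fastest_option(required):
    """Return the first (typically fastest) option covering all required effects."""
    for name, effects in OPTIONS:
        if required <= effects:  # subset test: every required effect is provided
            return name
    return None  # no single option provides this combination


print(fastest_option({"tag"}))
print(fastest_option({"remove"}))
print(fastest_option({"link"}))
```

The ordering reflects typical processing time only; actual cost depends on the environment and the content being processed.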

Conclusion

In this article you learned about the architecture and components of IBM Content Collector for Microsoft SharePoint, and gained insight into how installation and processing affect your SharePoint environment. With this information in hand, you are better prepared to assess the product's impact on your environment and to perform a successful implementation.

