IBM Datacap handles production-level digitization, data extraction, verification, indexing, and exporting of documents to back-end systems. Datacap components fall into three categories: base, supporting, and optional as shown in the following figure.
Base components are integral to Datacap. Supporting components are external components that Datacap must interface with, such as databases and file systems. Some supporting components are required for Datacap to operate. Optional components are not required to operate Datacap but are available to provide services such as external content repositories or authentication. The following are some of the components in each category:
- Base components:
- Datacap Server: Manages and serves documents and executes the Datacap workflow.
- Datacap Desktop: Provides a thick client interface for running user-attended tasks.
- Datacap FastDoc: Combines an entry-level user interface with business-user-oriented tools for quickly setting up and testing Datacap applications. In addition, this thick client can run in stand-alone mode.
- Datacap Studio: Provides advanced functionality to develop and assemble rulesets, and to configure and test applications.
- Datacap Rulerunner: Executes processing rules on documents.
- Datacap Maintenance Manager: Monitors operations and automates recurring system maintenance tasks.
- Datacap Fingerprint Service: Caches and serves fingerprints to Datacap applications.
- Datacap Web: Provides a Web interface for user-attended tasks and administration.
- Supporting components:
- Datacap Database Server: Hosts Datacap databases for administering and controlling the processes of Datacap applications.
- Microsoft IIS: Datacap Web, Datacap Web Services, and Report Viewer are installed on Microsoft IIS.
- Optional components:
- Datacap Navigator: Provides a Web interface for user-attended tasks and administration. This interface is based on IBM Content Navigator, HTML5, and Dojo technology running in a J2EE application server.
- Datacap Navigator Mobile: Provides imaging and capture capabilities embedded in IBM Content Navigator Mobile on iOS and Android devices.
- IBM Content Classification: Categorizes and organizes content by combining multiple methods of context-sensitive analysis.
- LDAP: An LDAP or Active Directory service is often part of the configuration, so that Datacap users can authenticate against it as an alternative to native Datacap authentication.
- IBM WebSphere: IBM WebSphere® is used to host the optional IBM Content Navigator web client.
- Content repository: Used to store the scanned images and the associated metadata. Datacap can export to IBM Content Manager V8 (CM8), FileNet P8, IBM Content Manager OnDemand (CMOD), and non-IBM repositories such as Microsoft SharePoint and Content Management Interoperability Services (CMIS) compliant repositories.
You can deploy IBM Datacap in a hybrid configuration to suit organizational requirements. The deployment patterns include:
- Centralized deployment
- Regional deployment
- Web deployment
- Local machine
- High availability
In many scenarios, organizations opt to collocate some of the services. Additionally, Datacap customers commonly use virtualization technologies to reduce the physical footprint of the deployment while maintaining the separation of services.
Here are some sample representations of what an environment could look like.
Centralized deployments are used when operations must be concentrated in one place, such as in a traditional mailroom scenario. This approach is best suited when incoming image volumes are high and when economies of scale can be derived from pooling resources and specializing operators to specific tasks, similar to an assembly line.
In this scenario, Datacap servers and users are located in a single location. Both scan and verify functions are available through either a thick client or a web client. For web clients, organizations can deploy Datacap Web and IBM Content Navigator. For Datacap Web, Report Viewer, or Datacap Web Services running as an IIS application, organizations must install Microsoft IIS.
A distributed deployment configuration is ideal for organizations with geographically dispersed user populations and resources, where key system resources can be located closest to users. A distributed deployment can be thought of as a variant of the centralized deployment model. In a distributed deployment, a large population of users and sizable capture operations justify installing system resources in regional offices.
For example, in a regional office, you might want to install an instance of Datacap and a departmental scanner.
Distributed deployments are typical in organizations that:
- need to scan documents from multiple locations.
- scan centrally but need to have remote users verify the documents.
- use outside vendors to scan or verify images.
- need to use mobile capture and indexing capabilities.
- have remote multi-function devices and printers that must participate in the capture process.
The following figure shows an example of a multi-region deployment configuration. In this example, there are three deployments of Datacap, each configured differently, all exporting their documents to a central ECM repository. In one region, we show Datacap installed on a single server. Remember to size your environment adequately before making a decision to collocate installation components.
Datacap Web deployment
Datacap Web deployment patterns include a central server with remote web clients, browser clients, and IBM Content Navigator.
Central server - web clients remote
In this scenario, the Datacap servers are all located centrally. Clients use one of the web client options for scan, fixup and verify operations.
The following figure shows a typical web deployment with multiple remote sites. In one remote site, users scan using multi-function devices (MFDs), which upload the scanned images to the MFD server. The second site uses Datacap Web or IBM Content Navigator to perform operations such as scan, index, and verify. This architecture allows scanning and verification tasks to be performed remotely while using a centralized server farm.
Here are two of the Datacap browser clients:
- Verifine web client
The Verifine client is a configurable client. With Verifine, it is possible to modify the layout of the panels to suit your organization's preferences.
- aVerify and aScan web clients
The aVerify and aScan clients are ideal for situations where bandwidth is at a premium. Both clients (one offering scanning services, the other verification and indexing services) use AJAX technology to limit the amount of information that needs to be transmitted between documents, which makes them faster.
IBM Content Navigator provides both user functionality, such as scanning and verification interfaces, and administration functionality. IBM Content Navigator also provides drag-and-drop interface design capabilities.
In some specific circumstances, you can install Datacap services locally on a workstation, provided that it meets the minimum supported configuration. Although this type of deployment is rare, it can be ideal when volumes are low enough that a server configuration is not needed.
An example might be a small regional office that scans a small number of documents and wants to perform the validations immediately. Having Datacap installed on a single workstation could help simplify deployment and reduce hardware costs. In this scenario, you could have the same deployment on multiple workstations, providing some level of redundancy should one workstation become unavailable.
One option for this type of deployment is to use Datacap FastDoc (FastDoc). FastDoc applications are easy to configure and deploy and can be run either in local or Rulerunner mode.
High availability and load balancing
While it is possible to run all Datacap components on a single server, this is rarely done, for reasons that include the need for redundancy and scalability. In this section, we describe high availability and load balancing options in Datacap Version 9. For the purposes of this discussion, we refer to high availability and load balancing together as simply "load balancing".
Load balancing is a method for scaling a system horizontally by distributing the work across many compute nodes in a "farm." It also provides high availability by redirecting clients to a working node in case of failure. A load balancer presents a single address for communication with multiple servers, for one or more Datacap applications. Configure the load balancer to send requests directed to each pooled or balanced address to one of the servers in the farm. You can select round robin scheduling or another method.
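To make the round robin and failover behavior concrete, the following is a minimal Python sketch, not Datacap code. In a real deployment this logic lives in the load balancer itself. The class name, the node names, and the health-check callback are all hypothetical and chosen only for illustration.

```python
from itertools import cycle
from typing import Callable, Iterable


class RoundRobinBalancer:
    """Illustrative only: hands out servers in rotation and skips
    any server that fails the supplied health check."""

    def __init__(self, servers: Iterable[str], is_healthy: Callable[[str], bool]):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._is_healthy = is_healthy
        self._rotation = cycle(self._servers)

    def next_server(self) -> str:
        # Try each server at most once per request so that a fully
        # failed farm raises an error instead of looping forever.
        for _ in range(len(self._servers)):
            candidate = next(self._rotation)
            if self._is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy server available in the farm")


# Hypothetical farm of three nodes, one of which is currently down.
failed_nodes = {"dc-node-2"}
balancer = RoundRobinBalancer(
    ["dc-node-1", "dc-node-2", "dc-node-3"],
    is_healthy=lambda name: name not in failed_nodes,
)
print([balancer.next_server() for _ in range(4)])
# prints ['dc-node-1', 'dc-node-3', 'dc-node-1', 'dc-node-3']
```

Requests rotate across the healthy nodes, and the failed node is skipped until its health check passes again. This is the redirect-on-failure behavior described above.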
Clients access Datacap Server using a TCP/IP socket-based protocol. You configure the server's name or IP address and port in Datacap Application Manager.
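Because clients reach Datacap Server over a TCP socket, a simple reachability check can help confirm that the configured name (or the load-balanced address) and port accept connections from a client machine. The sketch below uses only the Python standard library. The function name is hypothetical, and the host and port in the usage comment are placeholders rather than Datacap defaults. It verifies TCP connectivity only and does not speak the Datacap protocol.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks network reachability only; it does not exchange
    any application-level data with the server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (placeholder values; substitute the name and port configured
# in Datacap Application Manager):
#   can_reach("datacap.example.com", 12345)
```

Running the check from a client workstation against the balanced address, and then against each server in the farm, can help separate load balancer configuration problems from problems on an individual node.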
The following figure illustrates a sample load balanced architecture.
- IBM Redbooks publication: Implementing Document Imaging and Capture Solutions with IBM Datacap, SG24-7969-1
- IBM Redpaper: 5 Things to Know about IBM Datacap