Modified by jos.olminkhof
In case you were not able to attend the IBM Case Manager in a Mortgage Processing Scenario product demo
event on March 12, 2015, or (like many who attended) had trouble with the video quality during the webcast,
you can now view the recorded session of this demo at your convenience.
The demo covers some of the features introduced in ICM 5.2.1, such as: external documents, external data, push in-baskets and To-Do lists, and shows the products through a desktop as well as a mobile user interface.
Datacap (9.0), IBM Enterprise Records, Cognos Real-time Monitoring, and Watson Content Analytics are also shown briefly in this demo.
Click here to watch the recorded demo
Click here to download the Demo Presentation:
Click here for more information about IBM Case Manager.
We continuously try to improve our portfolio of ECM demos, and any feedback or request you may have is much appreciated. You can reach me at email@example.com.
This information is provided “as is” without warranty of any kind, express or implied, and is based on current IBM product plans
and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the
use of, or otherwise related to, this document. Nothing contained in this document is intended to, nor shall have the effect of,
creating any warranties or representations from IBM (or its suppliers or licensors), or altering the terms and conditions of the
applicable license agreement governing the use of IBM software.
IBM, the IBM logo, and ibm.com are trademarks of IBM, registered in many jurisdictions worldwide. A current list of IBM
trademarks is available at ibm.com/legal/copytrade.shtml. Other company, product, or service names may be trademarks
or service marks of others. © Copyright IBM Corporation 2015. All Rights Reserved.
Modified by Jackie Zhu
Tomas Barina is an ECM Consultant with IBM Software Group in the Czech Republic. He has more than 10 years of experience in the content management field. For the last eight years, Tomas has focused primarily on the design and delivery of FileNet-based solutions. His areas of expertise include solution design, ECM, and mobile development. Tomas holds a Master's degree in Computer Science and Artificial Intelligence from Czech Technical University.
IBM Content Navigator (ICN) has become the main UI of the IBM ECM product and solution portfolio, and a growing number of products and custom applications are adopting its unified framework for UI development. ECM solutions are becoming unified in their look and behavior. But can your solution handle mobile platforms? Are people asking you about mobile support for your ECM solutions? Do you want to extend your ECM solution to mobile users, yet you are not sure about the options you have? You have come to the right place. I will try to answer some of your questions.
Mobile development options
You have probably heard about the native client for iOS that you can download from the App Store and use with ICN. This client allows you to quickly provide core ECM functionality to your users. However, you obviously cannot customize the UI appearance with it. But don’t forget, it still uses the ICN backend, so you can modify the data that the client consumes or produces by using request/response filters, or you can add your own mobile features that open your custom applications directly from within the client. And why not use a UI developed in the ICN framework?
Have you noticed how many websites are multi-channel these days, meaning that they adapt to the client and resolution their visitors use? When developing your ICN plugins, keep multi-channel in mind and decide whether to choose the data representation directly for the device or to optimize your layout on the fly.
IBM Worklight sample
What exactly is this IBM Worklight? IBM Worklight helps you extend your business to mobile devices. It’s an IDE, and it’s also a runtime framework you can use for your mobile application development.
ICN ships with an IBM Worklight Consumer Edition license, which makes it possible to use IBM Worklight server components, for example various adapters and push notifications, to extend your applications.
For the sample that ships with ICN, however, you do not need any additional runtime infrastructure. A running ICN server is enough.
Figure 1 - Sample UI
Another option is to pack the whole sample as an ICN plugin. Then you don’t have to install anything on the device. Of course, you will not be able to access native device APIs.
Figure 2 - Sample architecture
In the IBM Content Navigator Redbooks publication, we provide a guided tour showing how to add a new work feature to this sample and how to display a list of work items and their parameters. There’s a chapter there that helps you understand how the sample works and how to customize it for your specific needs.
Take a look at our video (coming soon) that demonstrates the customized sample and check out the IBM Redbooks publication for more details.
For IBM Content Navigator related blog posts, see:
For more information on IBM Content Navigator, see IBM Redbooks publication:
Modified by Jackie Zhu
Paula Muir is a Software Developer with IBM Content Manager OnDemand for Multiplatforms in Boulder, Colorado. She has 20 years of experience with Content Manager OnDemand and 15 years of experience in the data indexing field. Her areas of expertise include indexing and loading data, and AFP and PDF architecture.
In one of the blog posts I wrote earlier while writing IBM Content Manager OnDemand Guide (an IBM Redbooks publication), I talked about how to successfully index your data for IBM Content Manager OnDemand (CMOD). Today, let’s talk about a more interesting topic: How indexing is misused.
Let's say we have some kind of financial statement that contains lists of names, account numbers, and balances. Some customers decide that they want to collect every name, account number, and balance from the statement as an index value. Then, when the results of their document search appear in the Search Results screen, they can see the information that they are looking for.
The thing is, they don't want to look at the actual document at all. They only want to look at the Search Results screen.
Here's a document – let's collect every value as an index!
It sounds great but CMOD is not really designed to do this. It is meant to be a document archive system, not a system to generate a subset of document information.
Since CMOD is not designed to do this, inevitably, customers who try this run into problems. The first problem is that the performance of the indexing and loading of the data is really bad. It's bad because the indexer is collecting gazillions of index values from their documents.
The next problem is that the index values usually don't display on the search results screen in the way that customers think they should. They may not be in the exact order that they appear in the document. This occurs because of how the indexer collects the values, or because of how the values are returned to the client by the database.
In the end, after much trouble, customers find that it is better not to do this.
I will spend the rest of my life exploring Italian wine. They use hundreds of different grapes, some going back to Roman times. So with many different kinds of wine, we are way beyond Cabernet Sauvignon and Chardonnay here. If you like seafood, try a Soave, made from the famous Garganega grape. Everyone's heard of that, right? If you like pasta with red sauce, try a Valpolicella Ripasso, made from three grapes you've never heard of. If you'd like to try a real stunner, shell out the money for an Amarone, Valpolicella's big brother. Made from the same grapes, they are left out to dry and shrivel and become incredibly concentrated. Although it is a dry wine, it has a searing richness of flavor. I've tried serving it with pasta, meat, even pizza, but at this point I think it works best with cheese. Just cheese and crackers on a board. It's just too big and rich for food. By the way, all these wines are from the Veneto region of Italy, near Venice.
Three wines down, 200 more Italian wines to go.
For Content Manager OnDemand related blog posts, see:
For more information on Content Manager OnDemand, see IBM Redbooks publications:
Modified by Jackie Zhu
Paula Muir is a Software Developer with IBM Content Manager OnDemand for Multiplatforms in Boulder, Colorado. She has 20 years of experience with Content Manager OnDemand and 15 years of experience in the data indexing field. Her areas of expertise include indexing and loading data, and AFP and PDF architecture.
Last time I mentioned that I was writing an IBM Redbooks publication, Content Manager OnDemand Guide, and that I could write a whole book about document indexing for IBM Content Manager OnDemand. Why?
I could write a book about indexing in order to explain:
- The different data formats
- Indexing concepts and what the point is of the whole thing
- How indexing is misused
- Common user errors
- How to fix badly formed data
- How to use the CMOD graphical indexer
- How to use the ACIF exits
I'll think of more later.
Today, in addition to red wine (we'll get to that), this is the super condensed explanation of PDF floating triggers, which is a new feature of the PDF Indexer in CMOD V9. Hang on!
Warning: If you know nothing about indexing, the following will make no sense to you. Just skip to the bottom and read about wine.
In order to understand floating triggers one must understand group triggers.
Group triggers occur once in a group.
All the group triggers must be found before any fields are collected.
When any field based on a group trigger changes, a new group is started.
Therefore the group triggers and fields determine group boundaries.
A floating trigger may occur multiple times or may not occur at all in a group.
Floating triggers operate independently of each other.
Floating triggers do not define group boundaries.
Since the floating trigger may not occur, a field based on a floating trigger must have a default value defined.
Also, because the floating trigger may not occur, or may occur more than once, a field based on a floating trigger cannot be combined with other fields. Otherwise, the results would be chaotic.
The index values collected from the floating triggers will appear in the Search Results in the same row that contains the index values for the group to which they belong.
Example of group triggers and floating triggers
Here is an example of a document where a floating trigger is needed. In the following statement, the text “Checking Account Balance” or “Savings Account Balance” might or might not occur, depending on whether the accounts exist. If they do exist, you would like to collect the balance amounts to use as fields.
The group trigger will be “Name”; the floating triggers will be “Checking” and “Savings”.
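To make the rules above a little more concrete, here is a minimal Python sketch of how a group trigger and two floating triggers could interact. It is a toy model and not the CMOD PDF Indexer: the line-based statement layout, the function names, the "0.00" default value, and the "last occurrence wins" behavior for a repeated floating trigger are all assumptions made up for illustration.

```python
# Toy model of group vs. floating triggers (NOT the CMOD PDF Indexer).
# The statement layout and every name below are hypothetical.

GROUP_TRIGGER = "Name"
FLOATING_TRIGGERS = {
    "Checking Account Balance": "checking",
    "Savings Account Balance": "savings",
}
DEFAULT_VALUE = "0.00"  # a field based on a floating trigger needs a default


def value_after(line, trigger):
    """Return the text that follows the trigger on the same line."""
    return line.split(trigger, 1)[1].strip(" :")


def index_statements(lines):
    """Return one row of index values per group."""
    rows = []
    current = None
    for line in lines:
        if line.startswith(GROUP_TRIGGER):
            name = value_after(line, GROUP_TRIGGER)
            # A change in the group-trigger field starts a new group.
            if current is None or current["name"] != name:
                current = {"name": name}
                for field in FLOATING_TRIGGERS.values():
                    current[field] = DEFAULT_VALUE
                rows.append(current)
            continue
        for trigger, field in FLOATING_TRIGGERS.items():
            # Floating triggers never start a group; they only fill
            # fields in the row of the group they occur in.
            # Simplification: a repeated trigger overwrites the value.
            if current is not None and trigger in line:
                current[field] = value_after(line, trigger)
    return rows


statement = [
    "Name: Alice Smith",
    "Checking Account Balance: 1,200.00",
    "Name: Bob Jones",
    "Savings Account Balance: 50.25",
    "Checking Account Balance: 10.00",
]
for row in index_statements(statement):
    print(row)
```

In this sketch Alice has no savings account, so her row keeps the default value for that field, while both rows still line up one per group in the result.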
For a verbose explanation of what I just said, with examples of why one would ever want to use a floating trigger, see the article in the Content Manager OnDemand Newsletter for 4th Quarter 2012, at:
I was recently in Germany with my Dad, and tasted a wine called Spatburgunder. I had no idea what it was until I got home and looked it up. I only knew that I really, really, liked it. I found out that Spatburgunder was the German name for Pinot Noir. Really! Of course I can't find any Spatburgunder here in Colorado. But maybe if you're on the east or west coast, along the trade routes and all that, you might find some. It's great!
I was going to write about Italian wine. Next time.
For Content Manager OnDemand related blog posts, see:
For more information on Content Manager OnDemand, see IBM Redbooks publications:
Integrating Service Level Baselining, Performance Reporting, and Application Monitoring into your Software Development Methodology
This is the second installment of a multi-part blog series that examines how to leverage application and user experience monitoring when developing applications, especially customer facing applications. It examines integration with different methodologies and varied infrastructure deployment. The series is not intended to be comprehensive, but is a reflection on my personal experiences and time spent with hundreds of ECM customers since starting with the ECM industry in 1996.
Integration of application and user experience monitoring into your SDLC with a traditional Waterfall methodology can quickly pay dividends. For a software developer, timelines are tight and the demand for low-defect code is high. A developer can’t take a casual attitude towards the early stages of a project, because little issues at the beginning can become “show stoppers” later on.
Some terms and acronyms used in this installment are outlined in the first installment.
The Software Development Life Cycle typically contains several phases: Requirements, Design, Development/Implementation, Testing, and Operation/Maintenance.
During the Requirements phase, the functional requirements are specified, outlining what the application is “supposed to do”. An important part of the Requirements phase is the non-functional requirements – requirements that outline how the system is supposed to “operate”. These can include: SLAs, HA/DR operation, auditability, performance, usability, capacity, supportability, and response times.
The Design phase creates the system architecture that meets the requirements outlined in the previous phase. The design should not ignore “Run the Engine” (RTE) components and must take into consideration the non-functional requirements. Design should eliminate any “magic happens here” black boxes, especially those driven by vague business requirements or cursory last-minute management reviews.
The Development/Implementation phase is where the application development team “cranks out the code”. In my experience, development teams produce a quality product that meets the outlined functional requirements. Issues with the overall application are typically found in the interaction with other applications or systems. Often the non-functional requirements aren’t uniform between applications or systems, leading to problems during integration. Business requirements can also suffer from “That’s what we specified, but not what we meant”.
The Testing phase is where an independent testing team compares the application developed against the requirements (functional AND non-functional) specified during the Requirements phase. Integration testing, where the new application interacts with existing or newly built applications, can lead to considerable remediation efforts. Testing should not be underestimated; it is the application’s first contact with real world users and issues.
The Operation/Maintenance phase begins when the development team completes development on the release and hands off the application to the operations/support staff. The application is made available to the target internal or external customers and starts to perform the work that the requirements outlined. The support teams need to keep the “engine running” and provide feedback to management and the development teams about what is working well and where remediation may be needed.
The Waterfall methodology is a sequential process that closely follows the SDLC outlined above. Each step in the SDLC “flows” to the next, with some overlap between steps. Requirements lead to a design, which leads to development, then onto testing, production, and finally maintenance.
The Requirements phase should produce a number of non-functional requirements. The Business User community imposes some of these: response times, SLAs, system availability. Other non-functional requirements come from operations staff and management: HA/DR, capacity, auditability. Business users typically tend to gloss over many non-functional but critical “plumbing” requirements, as they are focused on what the application is “supposed to do”. However, during application testing and deployment, the lack of defined non-functional requirements can become a large point of contention with the development team.
The Design phase needs to include detailed design elements on how to meet non-functional requirements. For example, is logging/reporting to meet audit requirements going to be written directly into the application or accomplished externally? Application Monitoring (AM) and Experience and Performance Monitoring (EPM) should be integrated into the design, allowing full use of tools and reporting during subsequent phases. The application development team needs the design to be able to demonstrate that they are meeting the functional and non-functional requirements.
During the Development/Implementation phase, monitoring starts taking a more pronounced role. Leveraging application, system, and experience monitoring in the development phase gives the development team insight into whether it is meeting the non-functional requirements. How is the system performing, what are the user response times, and are any errors being generated? AM can provide early information on performance metrics, queue depths, component status, and component interaction, based on actual application operating behavior, instead of leaving design issues to be found close to production rollout. Automated monitoring, reporting, and correction can reduce the valuable development time wasted troubleshooting basic operational issues during development. Including AM early in development helps provide a more complete and supportable product to support/operations, freeing the development team to work on new projects.
Both AM and EPM pay huge dividends during the Testing phase. The testing team can certainly follow test scripts to validate functional requirement use cases, but how do they objectively measure non-functional requirements? The testing team can’t submit a report to management stating that “we think the system responds fast enough”. EPM allows the team to provide exact response times for users during testing and over a variety of situations (load, location, error conditions). Properly implemented monitoring can provide objective reporting on capacity, storage use, response times, errors generated (and remediated) and a host of other “non-functional” items. Properly designed monitoring also allows troubleshooting of issues encountered during integration testing, providing information about data flowing in and out of an application and sending alerts if actual operation differs from expectations.
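As an illustration of replacing “we think the system responds fast enough” with a number, here is a minimal Python sketch that turns a set of measured response times into a percentile and compares it with an SLA target. The function names, the nearest-rank percentile method, and the sample values are assumptions for illustration only; they do not come from any particular monitoring product.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def response_time_report(samples_ms, sla_ms, pct=95):
    """Summarize measured response times against an SLA target."""
    observed = percentile(samples_ms, pct)
    return {
        "percentile": pct,
        "observed_ms": observed,
        "sla_ms": sla_ms,
        "meets_sla": observed <= sla_ms,
    }


# Hypothetical response times (milliseconds) captured during a test run.
measured = [120, 150, 180, 200, 220, 250, 300, 400, 900, 2500]
print(response_time_report(measured, sla_ms=2000))
```

A percentile is used here instead of an average because a few very slow requests are exactly what users complain about, and an average tends to hide them.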
Baselining a system during the final user acceptance test is critical; a baseline gives management and support an overall “picture” of the application. Once the application goes into production, comparison to the baseline will help identify bottlenecks, volume related performance issues, capacity, and growth. As application or environment fixes and enhancements are put in place after “go live”, comparison to the baseline measurements provides verification that the changes remediated the issue, improved performance/operation, or most importantly “did no harm”. Service Level Baselining allows management to have a measurement of user interactions in an ideal situation and again offers a comparison point when user load or environmental issues occur.
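The comparison described above can be sketched in a few lines of Python. This is only an illustrative model: the metric names, the 10% tolerance, and the assumption that lower values are better for every metric are all made up for the example.

```python
def compare_to_baseline(baseline, current, tolerance=0.10):
    """Classify each metric as 'improved', 'unchanged', or 'regressed'.

    Assumes lower is better for every metric (for example, response
    time in milliseconds) and that no baseline value is zero.
    """
    verdicts = {}
    for metric, base in baseline.items():
        change = (current[metric] - base) / base
        if change > tolerance:
            verdict = "regressed"
        elif change < -tolerance:
            verdict = "improved"
        else:
            verdict = "unchanged"  # the change "did no harm"
        verdicts[metric] = (round(change * 100, 1), verdict)
    return verdicts


# Hypothetical baseline from user acceptance testing vs. production today.
uat_baseline = {"search_ms": 400, "checkin_ms": 1000, "login_ms": 200}
production = {"search_ms": 600, "checkin_ms": 1050, "login_ms": 150}

for metric, (pct_change, verdict) in compare_to_baseline(
        uat_baseline, production).items():
    print(metric, pct_change, verdict)
```

The tolerance band is a design choice: without it, normal measurement noise would be reported as a regression or an improvement after every change.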
As the project follows the Waterfall methodology and the application is placed into production, AM allows the support and management teams to know that the application is “really working”. Typically, the support team has a whole host of applications they are responsible for and can’t have the level of understanding and involvement with the application that previous teams (architects, developers, testers) do. Properly designed and configured monitoring allows the support staff to have in-depth and immediate visibility into an application. By using Application Service Level Monitoring the system can be administered by less experienced administrators, freeing up senior resources for critical issues elsewhere. In the case where the application is running out of JVM memory or storage, the support team can be alerted before it becomes an issue and a “fire drill” eliminated. If during production users report “slow” performance, EPM objectively reports response times and SLAs as accurate input for constructive business operation reviews. It’s important that the application owners have full visibility of the application operating “stack” when an issue is occurring.
Maintaining the application takes a couple of tracks. First, the application development team may need an effort to remediate any application defects found upon contact with the users (never underestimate the ability of the user base to quickly exercise defects!). These defects also include non-functional issues. Often how the Business Analysts think a user will use the system is very different from how a user actually uses the system. Performance issues and bottlenecks may be identified with Application Monitoring and the gathered metrics and comparisons will help the application development team track these down.
Secondly, maintaining the operating environment, which usually falls to the application support team, needs to be considered. After common support tasks such as network changes or additional storage/CPU/memory are complete, is there an automated set of monitors that lets management and support know that the system is back in a fully operational state? Have the changes impacted the business users, either positively or negatively? The network changes may have been made for performance reasons, but do the users see a 1 second improvement or only a 1 ms improvement? Getting ahead of potential issues helps maintain the system. Giving the storage team a couple of extra weeks to purchase, provision, and configure storage makes a huge difference for deployment and harmony. Having properly configured monitoring can help achieve this harmony between groups.
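One simple way to give the storage team that extra lead time is to project when storage will fill up from recent usage samples. The Python sketch below assumes one usage sample per day and straight-line growth between the first and last sample; the function names, the 14-day lead time, and the sample numbers are illustrative assumptions, not part of any monitoring tool.

```python
def days_until_full(daily_used_gb, capacity_gb):
    """Estimate the days left until capacity is reached.

    Uses the average daily growth between the first and last sample.
    Returns None when usage is flat or shrinking.
    """
    if len(daily_used_gb) < 2:
        raise ValueError("need at least two daily samples")
    growth_per_day = (daily_used_gb[-1] - daily_used_gb[0]) / (
        len(daily_used_gb) - 1)
    if growth_per_day <= 0:
        return None
    remaining_gb = max(capacity_gb - daily_used_gb[-1], 0)
    return remaining_gb / growth_per_day


def needs_early_warning(daily_used_gb, capacity_gb, lead_time_days=14):
    """True when storage is projected to fill within the lead time."""
    days = days_until_full(daily_used_gb, capacity_gb)
    return days is not None and days <= lead_time_days


# Hypothetical usage for the last five days on an 800 GB volume.
usage = [700, 710, 720, 730, 740]
print(days_until_full(usage, capacity_gb=800))      # 6.0
print(needs_early_warning(usage, capacity_gb=800))  # True
```

A real monitor would likely use a longer history and a more robust trend estimate, but even this simple projection turns a last-minute “fire drill” into a planned request.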
Planning for and integrating Application and User Experience Monitoring early in the SDLC, and with each phase, provides immediate benefits and helps to produce a better, more complete, and sustainable application. Application and Service Level Monitoring shouldn’t be an afterthought, only for the application support team to worry about. Every phase of a Waterfall-based project can make use of some aspect of monitoring, ultimately making your life as an application developer, support engineer, project manager, or application manager better when working with a new application. Implementing AM/EPM throughout the SDLC sends a positive signal to the business participants in the project: you understand that service levels and end-user response are key project success criteria.
The next blog entry will talk about integration of monitoring with Rapid Application Development methodologies. In the world of short timeframes and high expectations, it’s imperative to know what’s really going on.
Integrating Service Level Baselining, Performance Reporting, and Monitoring into your Software Development Methodology
This is a multi-part blog series that will examine how to leverage application and user experience monitoring when developing applications, especially customer facing applications, to achieve world class service levels. It will examine integration with different methodologies, using various infrastructure deployment approaches. The series is not intended to be comprehensive, but is a reflection on my personal experiences and time spent with hundreds of ECM customers since starting with the ECM industry in 1996.
Traditionally monitoring has been considered a “Run the Engine” (RTE) type of activity, much like the dashboard lights and gauges on your automobile. Deploy the application and start monitoring to make sure the application is running. In reality, monitoring must be integrated early in the development process to provide data, get user feedback, and to prepare for deployment and RTE activities. Monitoring helps to improve the development process and be better prepared for production, especially when the integration starts early. Done correctly, application monitoring contributes to the ‘DevOps’ shift occurring within IT organizations.
When developing a new business application or product there are many items to consider including:
- Does the application meet the business requirements?
- Is the end product relatively defect free?
- Does it integrate cleanly into the existing environment?
- Does it follow established coding practices?
- Is delivery going to meet the designated timeline?
- And of course, can it be delivered within budget?
Timelines can be a real problem. Given a set of business requirements, the team must develop a product that meets those requirements and is complete in time to meet a business, regulatory, or other deadline. In my experience, because requirements must be met and are the most visible and tangible deliverables, it’s the operational items, performance testing, and comprehensive in-depth understanding of the application that suffer the most.
In many cases, as developers strive to meet the previous concerns, other important topics get pushed out until after deployment or end up being dropped altogether.
- What is the true performance of the application?
- What are the potential bottlenecks and how are issues identified?
- In the production environment, what is the user experience?
- Does the application meet the user’s performance expectations?
- What are the baseline performance and service levels?
- Is the application “instrumented” to provide good metrics?
- Can the application support team properly support this application?
- Is the application support team trained and prepared to support this application?
As a software developer, you have a couple of types of customers that your application answers to.
- The first is the traditional end-user, the employee who uses your system to complete their day-to-day activities and/or the general public who uses the system from outside the organization.
- The second is the business manager/owner who requested the application be developed. They need to make sure the end-user customers are happy with the application and that it is providing real value to the business. The business owner customer needs to understand how the application is operating/performing – how it is “getting the job done”. Having a happy and well informed business owner customer is very important, because they most likely have just financed your project and they (or their peers) will be paying for your next project.
As a starting point for some of the topics that will be discussed in future entries, it’s important to outline some terms. There may be more exhaustive definitions or slightly different definitions for these terms, but I’m using the terms as described below. I’ve introduced a couple already.
RTE – “Run the Engine”. After the application has been deployed and put into production, RTE is the effort and adjustments to keep the application performing its designated task(s).
SDM - Software Development Methodology. The plan, processes, and controls that an application development group uses to deliver an application that meets specified business requirements. It can also specify a linear or iterative approach to development.
SDLC - Software Development Lifecycle. Closely related to the SDM, this outlines the processes, phases, and deliverables needed. The SDLC encompasses much more than the development phase.
Waterfall Development – A linear and sequential development approach. Traditionally “big project” type development with long timelines.
RAD – Rapid Application Development. Iterative development approach. Agile and Scrum are popular development approaches.
Baselining - The process of measuring, analyzing, and documenting performance at a given point in time. These metrics are used as a reference to compare and relate to future metrics. A “snapshot” of system performance.
Performance Reporting - The process of gathering, storing, consolidating, and distributing operational metrics for an application or process. This applies not only to “physical” metrics (CPU, memory, I/O), but also process metrics (time from ingestion to completion).
SLA - Service Level Agreement. An agreement between a service provider and the consumer of that service. Typically outlines items such as: system availability, response time, processing volumes, and other metrics.
System Monitoring – “Ping, power, and pipes” monitoring. Provides information that the “hardware” and operating system are operational. Often provides some system performance information like CPU, storage, and memory usage.
AM - Application Monitoring. Monitoring at the “application” level. Provides information on how the application is performing, processing information, any errors or potential issues. End-to-end status of data flow is possible, with metrics and reporting throughout the process. Extends system monitoring to a more granular level on items related to the application.
ASLM - Application Service Level Monitoring. Externally and objectively looking at system AND application performance. Alerting, reporting, and automatically responding to the metrics gathered. Through analysis of metrics gathered over time, a better understanding of application operation is achieved. Using alerting and automated response, a more stable system and process that meets agreed upon SLAs is provided to the customer.
EPM - Experience and Performance Monitoring. Monitoring actual user experience while using an application (not synthetic transaction monitoring). Helps support staff bridge the gap between how the application is running and what the business user is experiencing.
The next blog entry will examine integration of these monitoring topics into a “traditional” SDLC and with Waterfall methodologies. Future topics will include: integration with RAD methodologies, working with infrastructure, communicating the appropriate information to keep the customer happy, and monitoring technology guidelines.
In the previous entry, we reviewed the purpose of metadata in a Content Manager context, two different design approaches and the purpose and uses for object stores. With these basics covered, today's entry will review property templates, classes, folders and choice lists.
Properties are the mechanism by which IBM FileNet Content Manager defines the metadata to be collected about an object. In general, a property defines the data type, cardinality (single- or multi-valued), whether a value is required, and more. In the first part of this series, we described how metadata is generally used to help find a document; properties are therefore what define how a document can be found in the system.
Before I get into the details on properties, there is a small bit of my background I would like to share as a preface to this discussion. Back in 2000-2002, I worked for a web content management firm in Bethesda, Maryland called eGrail (which was ultimately acquired by FileNet). One thing I learned from my time there was to, as much as possible, design interfaces with the user in mind. If you make it simple for your end users to understand then they are more likely to use the system. The following recommendations tend to flow out of that principle.
Keep it Simple
One common problem I have seen when working with customers is a general lack of simplicity in the design of their taxonomy model. Or more specifically, the defined model includes a number of properties that make sense from a developer’s perspective, but not from an end user’s. When looking at the data model, it is worth asking whether a user will understand what the property is supposed to contain. If it is not obvious what a property is for or why it is necessary, the quality of data provided for that property may be suboptimal.
Descriptive Property Names
A second issue to keep in mind when building a data model in Content Manager is that often the first help end users will see is the names given to the properties. This is especially true of Workplace and Workplace XT deployments. While confusing property names can be overcome with proper training or documentation, it is likely better to simply select a more descriptive name for the properties to begin with.
Properties Describe a Document
Another issue that I have seen when working with customers is a tendency to try and copy all possible fields from the document itself into the metadata for the object. While there may be very good reasons for doing so in a specific application, it is as likely that rekeying large amounts of document information into metadata may not be useful. For example, in the case of a customer document, the metadata model could very well include not just the customer identifying number, but may also include first name, last name, address, etc. In this case, there are two questions worth asking: will a document really be searched for by customer name and how will any customer name changes be handled.
Much of the time, the search will be based on a customer number or account number. I would also suspect that the mapping of customer name to customer ID number is stored in a separate system that can be treated as the system of record. If that is the case, then why not rely on that system to maintain the current (and maybe historical) information for cross-reference purposes? The front end application could then query this system or service for the customer number, then turn to Content Manager to retrieve the required documents. There is an additional layer of capability available in Content Manager to help in the above situation, but that will be covered later in this series.
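The two-step lookup described above can be sketched as follows. Everything here is hypothetical: the dictionary stands in for the system of record, the list stands in for the repository, and the function names are made up for the example.

```python
# Hypothetical system of record: maps customer names to customer numbers.
CUSTOMER_DIRECTORY = {
    "Ada Example": "C-1001",
    "Bo Sample": "C-1002",
}

# Hypothetical document metadata: only the customer number is stored.
DOCUMENTS = [
    {"title": "Loan application", "customer_number": "C-1001"},
    {"title": "Income statement", "customer_number": "C-1001"},
    {"title": "Account opening form", "customer_number": "C-1002"},
]


def find_customer_number(name):
    """Ask the system of record for the current customer number, or None."""
    return CUSTOMER_DIRECTORY.get(name)


def find_documents(customer_number):
    """Stand-in for a repository search on the customer number property."""
    return [d["title"] for d in DOCUMENTS if d["customer_number"] == customer_number]


def documents_for_customer(name):
    """Resolve the name first, then search the repository by number only."""
    number = find_customer_number(name)
    return find_documents(number) if number is not None else []
```

Because the name lives only in the system of record, a customer name change requires no update to document metadata in this arrangement.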
More Properties == More “Whitespace”
One additional problem with too many properties or an unclear taxonomy scheme is what a coworker of mine calls the “whitespace” problem. Consider two theoretical classes. One has two required properties (plus Document Title) and the other has nine non-required properties. With the nine-property class, instead of entering metadata for the nine properties, a user could check in the document with no metadata at all. With the class that has two well-defined required properties, however, an end user must enter values for those properties before committing the document.
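The difference between the two classes can be illustrated with a small sketch. The check_in function and the property names are hypothetical; the point is only that required properties give the system something to enforce.

```python
def missing_required(required_properties, supplied_values):
    """Return the required property names that have no usable value."""
    return [
        name for name in required_properties
        if supplied_values.get(name) in (None, "")
    ]


def check_in(required_properties, supplied_values):
    """Reject the check-in when a required property is empty (hypothetical rule)."""
    missing = missing_required(required_properties, supplied_values)
    if missing:
        raise ValueError("Missing required properties: " + ", ".join(missing))
    return dict(supplied_values)


# Class with only optional properties: an empty check-in is accepted.
loose_result = check_in([], {})

# Class with two required properties: an empty check-in is rejected.
try:
    check_in(["AccountNumber", "DocumentDate"], {})
    strict_rejected = False
except ValueError:
    strict_rejected = True
```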
I have personally worked with a system that had this kind of structure. The default class had about fourteen properties, none of which made any sense. Since this system was also using foldering to help find the documents, when I checked in a document, I never entered any properties. In the end, the system was useable because we built a folder structure that worked, but the properties were wasted because none of us ever used them.
Once the properties have been defined, the type of documents or content to be stored needs to be defined. In Content Manager, that structure is defined through classes. In the same way that a Java class defines the properties and methods, a Content Manager class defines the properties collected, how a document’s content is stored and the content lifecycle.
When creating classes, there are two basic organizational methods that can be used. Classes can be created along organizational/functional lines or by content types. Organizational grouping works by defining classes by business unit, geographical region or any other organizational group that works. Content-based classes help organize documents based on their type, such as presentations, meeting notes, reports, etc. Just like the top down, bottom up discussion in part one, the appropriate way of creating classes varies by organization and business driver. Neither one is necessarily better than the other, and which is appropriate may be driven by business or technical requirements.
Regardless of the approach selected, there are a number of guidelines to keep in mind:
Object Oriented Design Principles
The basic design of the Content Engine is object oriented and allows for application of OO-based design principles of reuse, sharing and sub-classing. In the same way a Java application could have a base class and subclasses that further define the behavior, a Content Engine class can be sub-classed, with the subclass inheriting the properties of the parent.
In a top-down based metadata structure, this would allow the common properties to be placed on the base, or parent class and then each organization to add subclasses to further define the structure and customize to the specific requirements. Furthermore, the Content Engine allows for searching across classes by searching on the parent class. For example, if the model has two kinds of customer documents that share a parent class of customer document, an application or stored search can find both kinds of documents by searching the parent customer class.
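As an analogy only, the following Python sketch mirrors the customer document example: common properties sit on the parent class, subclasses add their own, and a search against the parent finds instances of both subclasses. The class names and the search function are hypothetical and do not represent how the Content Engine executes a query.

```python
class CustomerDocument:
    """Parent class holding the properties every customer document shares."""
    def __init__(self, title, customer_id):
        self.title = title
        self.customer_id = customer_id


class LoanApplication(CustomerDocument):
    def __init__(self, title, customer_id, loan_amount):
        super().__init__(title, customer_id)
        self.loan_amount = loan_amount


class Correspondence(CustomerDocument):
    def __init__(self, title, customer_id, channel):
        super().__init__(title, customer_id)
        self.channel = channel


def search(repository, document_class, customer_id):
    """Return titles of documents of the class, or any subclass, for one customer."""
    return [
        doc.title for doc in repository
        if isinstance(doc, document_class) and doc.customer_id == customer_id
    ]


repository = [
    LoanApplication("Mortgage request", "C-1001", 250000),
    Correspondence("Welcome letter", "C-1001", "mail"),
    Correspondence("Rate notice", "C-1002", "email"),
]
```

Searching on CustomerDocument returns both kinds of document for a customer, while searching on LoanApplication narrows the result to that subclass.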
Even if it is not possible to apply sub-classing to a given data model, it is still possible to share properties across multiple classes. Doing so can still help facilitate cross-class searching. In addition, creating additional property templates and adding them to classes will result in additional columns in the object store database. From a performance aspect, sharing properties also helps because adding an index to a single column will then speed retrievals for multiple classes.
Another advantage to sharing properties can be consistency. If every class uses the same property for first name, last name or identifier, then end users will come to understand this and be able to move from line of business or class to class with little retraining.
Default instance security
While not technically a part of the object model, security is an important consideration in designing a Content Manager system. Content Manager affords an enterprise quite a bit of flexibility in securing content. However, a good starting point for many systems is a class’s default instance security. This defines the default security for all new instances of a document or object and can provide a simple way to ensure that documents committed are properly secured. Default Instance Security works by copying the ACL defined on the class’s default instance security tab, replacing #CREATOR-OWNER with the user creating the object and then using that ACL on the new document.
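The copy-and-substitute behavior described above can be sketched in a few lines. The ACL is modeled here as a list of (grantee, access level) pairs, which is an assumption made for illustration and not the product's actual ACL format; the function name is likewise hypothetical.

```python
CREATOR_OWNER = "#CREATOR-OWNER"


def apply_default_instance_security(default_acl, creating_user):
    """Copy the class's default ACL, substituting the creator for the placeholder.

    Each entry is a (grantee, access_level) pair; this shape is an assumption
    made for illustration only.
    """
    return [
        (creating_user if grantee == CREATOR_OWNER else grantee, access)
        for grantee, access in default_acl
    ]


class_default_acl = [
    (CREATOR_OWNER, "full control"),
    ("HR Staff", "view"),
]

# The new document gets its own copy; the class definition is left untouched.
new_document_acl = apply_default_instance_security(class_default_acl, "jsmith")
```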
Another element of a Content Manager data model can be folders. In Content Manager, folders offer a way of grouping like documents together in a single, retrievable container. As a general rule, folders are useful for smaller, browsable datasets. Browsable datasets could include office documents or any set of documents that naturally lend themselves to one grouping. Workplace, Workplace XT and FileNet Integration for Microsoft Office make it easy to apply this kind of structure.
However, if folders are improperly used, there can be a performance impact when browsing or retrieving objects. In addition, foldering is not a substitute for proper metadata and searching for a large dataset. The reasons are twofold:
- If the dataset in an application is greater than a couple of hundred items, end users will not be browsing for the document, as it is too cumbersome. Instead, it would be better to enable the users to search for the document(s) necessary.
- If the number of items in a given folder reaches a high enough number (sometimes as few as 50), Workplace and Workplace XT require the user to page through the list. This means more clicks and more time spent finding the document(s) desired.
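A quick back-of-the-envelope calculation shows how fast paging adds up. The sketch below assumes a page size of 50, taken from the figure mentioned above; the actual threshold depends on the application and its configuration.

```python
import math


def pages_needed(item_count, page_size=50):
    """Number of pages a user may have to step through to see every item."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(item_count / page_size)
```

Under that assumption, a folder holding 1,000 items spans 20 pages, so a user browsing for a document near the end of the list pages forward many times before seeing it.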
If folders are deemed necessary, there are a small number of recommendations to keep in mind:
- Limit the number of folders or items at a given level to somewhere between 20 and, generally, no more than a couple hundred.
- Ensure the structure is self-explanatory. If the users are not sure what the structure is and what it means, the end result could be lost documents and confusion.
- Ensure the structure works for the organization as a whole
Part 3 of this series will look further into alternate or additional ways of storing and applying metadata to help meet these challenges.
When implementing an IBM FileNet P8 Content Manager-based solution, one of the more complex issues to resolve is the data model. A properly designed data model can ease storing, finding and retrieving documents and content. However, a poorly designed data model can lead to confusion, lost content and missed business opportunities.
This entry is meant to be a primer for someone building a new data model or looking to revise an existing one. The principles and recommendations in this series are a result of written best practices and recommendations, as well as the experience the IBM ECM Software Services group has collected in fifteen-plus years of implementing various FileNet products from Image Services to IBM FileNet P8.
The review will be broken into four basic sections:
- The purpose of metadata
- Design approaches
- The basic design elements available within Content Manager
- Additional features of Content Manager for information management
Purpose of Metadata
Any description of lessons learned and recommendations for handling metadata manipulation and design starts with an understanding of the purpose of metadata. At its most basic level, metadata is data about data. In the case of IBM FileNet P8 Content Manager, metadata is data that is collected about a document so that it can be stored and retrieved at a future point. This metadata in Content Manager is stored as properties on a document (or object) and is available for searching via Stored Searches, Search Templates, out of the box applications (such as Workplace and Workplace XT) and via the API.
Any metadata collected should generally help a user find a document. Good examples would be account number, customer ID, case number, etc. What metadata should not be is a wholesale copy of the data in the document. Any structure that dictates keying in a good majority of the document content into the metadata can be reexamined. There may very well be a reason for doing so, but if the key purpose of this type of data is to just help find the documents, then this may be requiring too much time when committing the document. In addition, if the metadata itself is useful (for future data mining or for the application itself), there may be a reason to not keep the original document, but keep the metadata only. Part three of this series will cover custom objects and ways they can be used to store metadata-only.
When approaching the metadata design, there are two basic starting points: a top down, enterprise approach or a more departmental, bottom up approach. This section is a summary of what is described in the IBM FileNet Content Manager Implementation Best Practices and Recommendations Redbook, Section 5.2.1. While this summary is a good start, the Redbook itself has more information and should be read both for this information as well as a large number of other recommendations and best practices.
A top down approach starts with an enterprise wide view of what is to be accomplished with the data model and how this integrates with the wider goals of the enterprise. Practically this means that an enterprise wide system would review the basic metadata that every document should contain to comply with these goals and mandate this. Then at each lower level, for example department or organizational unit, the business analysts would work to add the metadata elements appropriate to that line of business or problem to be solved.
One key advantage to this approach is a unified structure allowing for better organization and cross line of business searching. Having the enterprise mandate, for example, that any customer document have the appropriate customer identifier as a metadata property would make it easier to identify all documents related to a specific customer.
However, if not properly balanced, an enterprise approach can lead to creation of shared properties that are not necessarily appropriate for all business units. In the example above, the enterprise architect may determine that a customer ID is appropriate for the enterprise structure, but it may not be appropriate for human resources or any non-customer facing class. As such a mandate to include it may be counter productive.
The alternate approach is to start at the department level and work up, generalizing where possible. If a system was born as a departmental system, this is likely how it happened. Each department went about solving its own problem and then tried to incorporate itself into the enterprise at large.
This approach does have its advantages. Most important is that each group can start with what is important to them. In the above example, customer facing and noncustomer facing lines of business or departments may have totally different needs and requirements. Starting at the bottom would allow for a more customized structure that meets their needs.
The downside is that the potential lack of consistency can make cross line of business searching and data mining more difficult. A lack of standards for naming conventions and data integrity requirements may exacerbate this problem. Which approach to take, or how an organization balances these two approaches, depends on its needs. Neither approach is “best”, so it may be an issue of just picking the best of both worlds and applying them. Also, regardless of the approach, it would make sense to have one person or group who is responsible for the data model, or at least to oversee the creation of the model to help ensure consistency and interoperability.
Basic Design Elements
After selecting the approach to designing the structure, it helps to know what tools Content Manager offers the designer and architect for building that taxonomy. This part of the entry goes into them in a bit more detail, including how they can be both used and misused. Where possible, I've also tried to give concrete examples of issues I've seen at customer sites related to each.
Ignoring the Content Engine domain as the lowest level container, the primary data container or organizational unit within the Content Engine is an object store. An object store contains a distinct set of documents, properties, metadata and security. Since the data is completely separate from any other object store's data, it provides a convenient way to subdivide a data set or provide separation between lines of business.
When working with customers, I'm often asked how many object stores they should create and use. The answer I always give is the same: it depends. There is no hard and fast rule, no mathematical equation that I can offer to help decide. Instead, what I generally offer are four basic guidelines:
As few as necessary
Object stores are generally easy to create, so an administrator might be inclined to create a large number. In fact, it may be possible for each line of business, no matter how small or large, to have its own object store. And that may work, but remember that each additional object store adds another set of databases or schemas (depending on database), storage areas, classes, properties and security that have to be managed.
For this reason, we (ECM Software Services) tend to recommend a smaller number of object stores, if for no other reason than to limit the amount of maintenance that needs to be done on a regular basis. In addition, with larger numbers of object stores (and just about any type of item in Content Manager), the more confusing the choice of where to store data can become for your users and even your developers and administrators.
Enough to segregate unrelated data
Of course, the flip side of keeping the number of object stores small is that there is an increased risk of improper security on documents allowing the wrong user access to information. For example, if Human Resources and Marketing information are mixed, it is possible to use access control to ensure that marketing people cannot access HR documents and vice-versa. But if someone were to modify security on one of the HR documents, either by accident or intentionally, someone from marketing could potentially find and view this sensitive information.
If these documents are in separate object stores, the object store security itself can ensure that only HR users can access HR documents. Think of it as not only locking the HR file cabinet, but also keeping that cabinet in a separate, locked room. If the cabinets are in a separate room, even if someone were to accidentally leave a cabinet open (or their key in the cabinet lock), the lock on the office door itself will provide an additional measure of security.
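The cabinet-and-room idea amounts to two independent checks. The sketch below models each check as simple group membership; the function name and the group-set representation are assumptions made for illustration, not how the product evaluates permissions.

```python
def can_view(user_groups, object_store_groups, document_groups):
    """A user must pass the object store check and the document check."""
    in_store = bool(user_groups & object_store_groups)
    on_document = bool(user_groups & document_groups)
    return in_store and on_document


marketing_user = {"Marketing"}

# Mixed object store: both groups may enter, and the document ACL
# was loosened by mistake to include Marketing.
mixed = can_view(marketing_user, {"HR", "Marketing"}, {"HR", "Marketing"})

# Separate HR object store: the same mistake on the document is contained
# because the Marketing user never gets past the object store check.
separate = can_view(marketing_user, {"HR"}, {"HR", "Marketing"})
```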
Enough to meet performance requirements
Another facet of the object store count discussion revolves around performance and activity levels. While there is no limit to the number of items in an object store (or any limits would be database dependent), there is a more practical set of limits. In addition, even if a dataset is not likely to hit that upper limit, there may be two datasets that are both accessed very heavily requiring different performance tuning settings. Having them in separate object stores can allow for each schema or tablespace to be treated separately, indexed properly and maybe even located on separate spindles for better performance.
Minimum 2 (settings, content)
One other recommendation is to create at minimum two object stores: one for content and one for Workplace/Workplace XT preferences. I've worked with a number of customers who created only one object store and stored their preferences there with their data. Then, when it comes time to bring a second line of business into the system, they discover that they need to lock down the original object store. However, since their preferences are in there and every user of the system needs access, they cannot lock that object store down. The result is they have to reconfigure Workplace, and while not a monumental undertaking, it is still a maintenance task that can easily be avoided by just separating the two during initial configuration.
Control Object Store
This recommendation came from some of my coworkers in field delivery. If a third object store is created and used solely as a control object store, it can be used to validate out of the box functionality when upgrading. If an upgrade causes issues with creating or retrieving documents in the primary object store(s), the same operations can be performed in the control object store. If they fail there as well, it is more likely a system issue and not a code or data issue.
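The troubleshooting logic can be written down as a small decision function. The function name and the returned wording are hypothetical, and the middle branch (control store works, primary does not) is an inference from the reasoning above rather than something stated outright.

```python
def diagnose(primary_ok, control_ok):
    """Suggest where to look after repeating an operation in both object stores."""
    if primary_ok:
        return "no issue observed"
    if control_ok:
        # Out of the box functionality works, so suspect customizations or data.
        return "likely a code or data issue in the primary object store"
    # Even the untouched control object store fails.
    return "likely a system issue"
```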
Part two will cover additional components in a Content Manager object model, including properties, classes and folders.
Welcome to the ECM Application Center! This community is long overdue but as they say, better late than never. I won’t rehash what is already available in other blog entries about the purpose of the community. I will however make the point of its importance to customers, partners, and IBMers. First and foremost, the community is a place where these three key stakeholders can collaborate and discuss in an open forum ECM topics of relevance and importance. The primary goal of the community is to share information with those who have common interest.
To our customers, you have a forum to solicit and exchange information on ECM topics related to your business. The value here is the ability to tap into the collective knowledge of not only IBM sources but the wealth of knowledge that our business partner community brings to the table. IBM ECM business partners are the most innovative and technically astute business partners you will find in the industry today. Their subject matter expertise and industry focused solution offerings have proven time after time the success that can be achieved when you align industry expertise, product expertise, and the best ECM software products in the industry to solve business problems. Take advantage of this collective information source to help drive better outcomes for your business.
To our business partners, you have a forum to share with others your technical insights and the unique value your company has to offer. Your active participation in the community is an opportunity to demonstrate industry and product expertise. Insight into best practices and how your company has solved similar difficult problems will demonstrate thought leadership. Insight into your solution offering, technology assets, and past successes will demonstrate experience and “know how” with deploying innovative ECM solutions. The opportunity to gain mindshare is there for the taking when you actively share your knowledge and tangible assets (e.g., sample code, whitepapers, solution assets, etc.).
The community will also serve as a forum for IBM to gauge support demands from both customers and partners and identify areas of improvement. From the perspective of Channel Technical Sales, I hope and expect to find new ways to support our business partners as they interact with the community.