ExampleCo. Enterprises, a fictitious company, might be targeted in a lawsuit. The company needs to ensure that potentially relevant documents are placed under the control of a records management system.
Bob, an IT administrator for ExampleCo. Enterprises, is tasked with adding a large number of documents from multiple file systems to the corporate data repository. To prepare for a potential legal dispute, the company needs to declare the potentially relevant documents as records and assign appropriate retention and disposition rules to each document. The documents and records must be stored in particular folders and document classes in the case vault repository so that they are available for legal review. After the relevant documents are declared as records and stored in the appropriate folders, the legal team will review the documents by using IBM® eDiscovery Manager and IBM eDiscovery Analyzer.
Bob selects IBM Enterprise Records for records management. Bob decides to use IBM Content Collector with IBM Content Classification to automatically and intelligently classify documents and email and declare them as records according to the company's records management policies.
To control storage and legal review costs, Bob needs to filter out irrelevant data such as company bulletins, newsletters, personal email, and personal documents that have no relevance to the pending legal case. Bob will work closely with Anne, a business analyst who has expertise in the company's knowledge management hierarchy, to define the rules for determining which documents are potentially relevant and need to be retained.
Bob is already familiar with IBM Content Collector and has used it for similar purposes in the past, but it has typically collected too much content, which increases legal review costs and time. He plans to work with Anne to identify a set of representative documents that are pertinent to the case to use as a training set.
Working together, they configure classification rules in IBM Content Classification that are based on a list of keywords provided to them by the legal team. The rules specify that the documents are to be declared as records in IBM Enterprise Records and identify which file plan is to be used to manage the records.
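As a rough illustration of the idea, a keyword-based rule that decides whether a document is declared as a record, and under which file plan, might be sketched as follows. This is plain Python, not IBM Content Classification rule syntax or any IBM API. The names `KeywordRule`, `Decision`, and `classify`, the sample keywords, and the file plan name are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeywordRule:
    """Hypothetical rule: any listed keyword triggers record declaration."""
    keywords: frozenset
    file_plan: str


@dataclass(frozen=True)
class Decision:
    """Hypothetical outcome of classifying one document."""
    declare_as_record: bool
    file_plan: Optional[str]


def classify(text, rules):
    """Return the decision of the first rule whose keywords appear in the text."""
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    for rule in rules:
        if tokens & rule.keywords:
            return Decision(True, rule.file_plan)
    # No rule matched: the document is not declared as a record.
    return Decision(False, None)


# Example keyword list and file plan name (assumptions for illustration only).
rules = [KeywordRule(frozenset({"merger", "acquisition"}), "Litigation/Contracts")]
print(classify("Draft terms for the merger", rules))
print(classify("Cafeteria newsletter for June", rules))
```

In this sketch the first matching rule wins, which keeps the outcome easy to explain during legal review. A real deployment would rely on the product's own rule configuration.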
Although a typical case might have approximately 50,000 to 200,000 potentially relevant documents, the documents must be identified across departmental and enterprise repositories that can hold hundreds of millions of documents. It is critical that Bob and Anne understand how IBM Content Classification filters different documents and email so that they can ensure that all content that might be relevant is captured while everything else is omitted. After classifying the training set, Bob and Anne can review how the decisions were applied and adjust the rules as needed.
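One way to make the review of the training set concrete is to count how many relevant documents were missed and how many irrelevant documents were captured. The following is a minimal sketch in plain Python, assuming that each training document has been labeled by hand as relevant or not. The function name `review_decisions` and the sample data are hypothetical and are not part of any IBM product.

```python
def review_decisions(decisions):
    """Summarize classification results.

    decisions: iterable of (is_relevant, was_captured) boolean pairs,
    one pair per document in the training set.
    """
    decisions = list(decisions)
    captured_relevant = sum(1 for rel, cap in decisions if rel and cap)
    missed_relevant = sum(1 for rel, cap in decisions if rel and not cap)
    captured_irrelevant = sum(1 for rel, cap in decisions if not rel and cap)

    total_relevant = captured_relevant + missed_relevant
    total_captured = captured_relevant + captured_irrelevant
    return {
        "missed_relevant": missed_relevant,
        "captured_irrelevant": captured_irrelevant,
        # Share of relevant documents that were captured.
        "recall": captured_relevant / total_relevant if total_relevant else 0.0,
        # Share of captured documents that are actually relevant.
        "precision": captured_relevant / total_captured if total_captured else 0.0,
    }


# Hypothetical hand-labeled sample: (is_relevant, was_captured)
sample = [(True, True), (True, True), (True, False), (False, True), (False, False)]
result = review_decisions(sample)
print(result)
```

A missed relevant document is a legal risk, whereas a captured irrelevant document adds storage and review cost, so the two counts are reported separately.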
To ensure that content is classified when it is captured, Bob sets up a task route for IBM Content Classification in IBM Content Collector. After the system is in production, Bob expects that an additional 1,000 potentially relevant documents might be identified among the newly collected documents and email each week.
Anne plans to use IBM Content Classification to review the classification decisions. She can help train the system by reclassifying content, and she can work with Bob to fine-tune the rules. For example, if irrelevant documents are classified into the case vault only because a particular keyword occurs in them, she might recommend that a rule be changed so that documents are classified only when that keyword occurs in proximity to another keyword.
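The proximity idea can be sketched in a few lines of plain Python. This is an illustration only, not the way IBM Content Classification expresses or evaluates its rules. The function names, the sample keywords, and the distance threshold are assumptions.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_proximity(text, first, second, max_distance=10):
    """Return True if `first` and `second` occur within `max_distance` words of each other."""
    tokens = tokenize(text)
    first_positions = [i for i, tok in enumerate(tokens) if tok == first.lower()]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


# Hypothetical documents: one pertinent, one that mentions a keyword only in passing.
pertinent = "The merger contract was signed by both parties last week."
unrelated = "Renew your gym contract today. " + "filler " * 20 + "The merger news is in the bulletin."

print(within_proximity(pertinent, "contract", "merger", max_distance=5))
print(within_proximity(unrelated, "contract", "merger", max_distance=5))
```

With a threshold of five words, the first document matches and the second does not, even though both contain both keywords. Choosing the threshold is a judgment call that would be tuned against the training set.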