EDRM XML
The EDRM XML load file can be mapped to a third-party legal review application to import all data pertinent to a legal matter.
An EDRM XML discovery export produces the following output:
- Native files: The original source files, such as doc, ppt, and txt files, and, optionally, plain-text versions of those files.
- EDRM XML load files: EDRM XML files that fully describe each data object that is produced in a discovery export run.
During harvesting, the system indexes and classifies every file in a volume and evaluates each file for duplicates, content-based attributes, and metadata. Content is deduplicated automatically, while the metadata of every document instance is retained.
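The idea of deduplicating content while retaining instance metadata can be pictured with a minimal sketch. It is not how IBM StoredIQ implements deduplication; the tuple layout, the `deduplicate` function, the sample paths, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def deduplicate(instances):
    """Group file instances by a hash of their content.

    `instances` is an iterable of (path, custodian, content_bytes) tuples.
    This shape is hypothetical and chosen only for illustration.
    """
    groups = defaultdict(list)
    for path, custodian, content in instances:
        digest = hashlib.sha256(content).hexdigest()
        # One entry per unique content, but every instance keeps its metadata.
        groups[digest].append({"path": path, "custodian": custodian})
    return dict(groups)


# Hypothetical sample data: two copies of the same content plus one other file.
sample = [
    ("/share/a/report.doc", "alice", b"quarterly numbers"),
    ("/share/b/report-copy.doc", "bob", b"quarterly numbers"),
    ("/share/b/notes.txt", "bob", b"meeting notes"),
]
unique_documents = deduplicate(sample)
```

In this sketch, the two copies of the report collapse into one document entry that still lists both source paths and both custodians.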
An EDRM XML discovery export policy then copies the objects that are responsive to the classification for the given legal matter and produces the EDRM XML load file. The EDRM XML describes each exported document in terms of where it was sourced from, where the corresponding native files were copied to, and important metadata about the document. The policy is incremental: metadata for a document instance is not repeated in subsequent export policy cycles unless that data changed.
The load file contains the following information:
- Export information, which is information about the legal matter that is tied to the policy.
- Batch information, which is information about the run that exported the load file.
- Document metadata. For each document in the load file, all applicable metadata from the EDRM XML v1.1 specifications and all applicable attributes from the EDRM XML v1.1 Metadata Tags sheet are described. Both are available from http://www.edrm.net. In addition, metadata that is defined by IBM StoredIQ is exported for each document to enrich the metadata available to legal review tools.
- Source location information. The location from which each document was sourced and the Custodian that is attached to that copy of the document are also described in the load file in conformity with the EDRM XML standards.
- Native copy location information. For the legal review tools to locate the individual copies of the native files that are produced for each document, the EDRM XML describes where on the target volume these native files were copied.
- Run files, which are files that cover the entire run. These files include the IBM StoredIQ audit trail report in XML format, a CSV file that details all of the EDRM XML tag definitions applicable to the produced load file, a CSV report about all load files that are produced for the run, and a CSV user-ID-to-name mapping file.
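As a rough illustration of how a review tool might read the document metadata, source locations, and native copy locations described above, the following sketch parses a small hand-written sample. The element and attribute names (`Document`, `Tag`, `File`, `ExternalFile`, `Location`, `Custodian`, `LocationURI`) are modeled on the general shape of EDRM XML and are assumptions here; verify them against the EDRM XML v1.1 schema and an actual exported load file. The sample values and the `summarize` function are hypothetical, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; not output from an actual export run.
SAMPLE_LOAD_FILE = """\
<Root MajorVersion="1" MinorVersion="1">
  <Batch>
    <Documents>
      <Document DocID="DOC-0001">
        <Tags>
          <Tag TagName="#FileName" TagDataType="Text" TagValue="report.doc"/>
        </Tags>
        <Files>
          <File FileType="Native">
            <ExternalFile FilePath="natives/0001" FileName="report.doc"/>
          </File>
        </Files>
        <Locations>
          <Location>
            <Custodian>alice</Custodian>
            <LocationURI>//fileserver/share/a/report.doc</LocationURI>
          </Location>
        </Locations>
      </Document>
    </Documents>
  </Batch>
</Root>
"""


def summarize(xml_text):
    """Return one dictionary per document with tags, native paths, and sources."""
    root = ET.fromstring(xml_text)
    summary = []
    for doc in root.iter("Document"):
        # Document metadata: tag name mapped to tag value.
        tags = {t.get("TagName"): t.get("TagValue") for t in doc.iter("Tag")}
        # Native copy locations on the target volume.
        natives = [
            f"{ext.get('FilePath')}/{ext.get('FileName')}"
            for f in doc.iter("File")
            if f.get("FileType") == "Native"
            for ext in f.iter("ExternalFile")
        ]
        # Source locations, each paired with its custodian.
        sources = [
            (loc.findtext("Custodian"), loc.findtext("LocationURI"))
            for loc in doc.iter("Location")
        ]
        summary.append(
            {
                "doc_id": doc.get("DocID"),
                "tags": tags,
                "natives": natives,
                "sources": sources,
            }
        )
    return summary


documents = summarize(SAMPLE_LOAD_FILE)
```

A real load file can be far larger than this sample, so a production reader would more likely stream the file with `ET.iterparse` than load it in one call.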