The example application development use case
This tutorial uses a fictitious company, XYZ Inc., as its example. XYZ has a number of organizations whose locations span multiple geographies, so the company must conform to the regulations in each geography. Each organization runs several projects of its own, and each project requires skilled personnel and various resources. All XYZ employees are managed by a single HR system. Based on the needs of the business and the employees' interests, employees can be moved across projects within an organization or even to a different organization. Each organization manages its projects and its finances independently.
The major challenges XYZ faces are conforming to legal regulations and staffing upcoming projects with specialized, skilled personnel. Most of the time, the resources for these new projects are available within XYZ, but getting all the necessary data is difficult because it is located in several systems. Today, users must log in to one system to collect some information and then log in to another system, using the information from the first system to get the next piece of information. To collect information more efficiently, XYZ decides to build a new system: the Staffing System. Assume that your job is to build this system.
The Staffing System must interact with the following other systems to meet the requirements for information:
- Org System — This system maintains all the details about an organization in the company. Details include the name of the organization, the name of the head of the organization, number of employees in the organization, and financials. This system generates a unique organization ID other systems can refer to.
- Projects System — This system maintains the details about each project that has been completed or is being done by the company. Details include the names of the employees working for a project, name of the project manager, billing details, and name of the organization to which the project belongs. This system generates a unique project ID other systems can refer to.
- HR System — This system maintains the details of the company employees. Details include the employee's name, employee's address, project the employee is working on, and references to any work permits. This system generates a unique employee ID other systems can refer to.
- Legal System — This system maintains the legal approvals the company must obtain. It also maintains references to document IDs issued by government agencies.
The new system must be able to meet the following needs of the company:
- When a new project is started, the project manager must be able to find the available staff with the required skills for the new project.
- Based on the location of the project, the staff, and other resources, such as software, the system must be able to find the legal approvals for the project.
- The project manager must be able to consider the staff's interests in working on specific technology areas and eligible skills.
- Today, the Staffing System must integrate with only the four systems specified earlier, but the company expects that in the future, the Staffing System must integrate with other systems. Addition of new systems should be seamless and must not affect existing systems.
- Each existing system must be able to independently undergo other revisions, data schema changes, or interface changes without taking into account compatibility with the new Staffing System. Similarly, the Staffing System should not be affected by changes to the internals of those existing systems.
The four existing systems are isolated from each other, although the maintained information is interconnected. The unique identifiers each system generates identify some entities, but their relationships with other entities in other systems are not available in one place. This results in multiple hops among these existing systems. The company wants to use the Staffing System to simply and efficiently navigate relationships across entities in the different systems.
To address this, you require a data store that can store the identifiers generated in each system along with information about their relationships. Because interfaces with other systems might be required in the future, this data store must be generic in nature. An RDF data store is well suited to these requirements: it does not have a fixed schema, and you can use the SPARQL query language to query the data without having to know a schema in advance. This architecture is commonly referred to as linked data, and the new Staffing System will adopt it.
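To make the idea of a schema-less store concrete, the following is a minimal in-memory sketch in Java. It is not the DB2 RDF API and it does not run SPARQL; it only illustrates how relationships from different systems can sit side by side as subject-predicate-object triples and be navigated with simple patterns. The class name `TripleStoreSketch`, the predicate URIs, and the project and employee identifiers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: a tiny in-memory triple store, not the DB2 RDF or SPARQL API. */
class TripleStoreSketch {

    /** One statement: subject, predicate, object. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    /** Adds a statement; no schema has to be declared beforehand. */
    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Returns all triples matching the pattern; a null argument acts as a wildcard. */
    List<Triple> match(String subject, String predicate, String object) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            if ((subject == null || subject.equals(t.subject))
                    && (predicate == null || predicate.equals(t.predicate))
                    && (object == null || object.equals(t.object))) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TripleStoreSketch store = new TripleStoreSketch();
        // Hypothetical data: one fact from the Projects System, one from the HR System.
        store.add("http://xyz.com/project/robotX",
                  "http://xyz.com/project/member",
                  "http://xyz.com/employee/000001");
        store.add("http://xyz.com/employee/000001",
                  "http://xyz.com/employee/workPermit",
                  "http://xyz.com/org/legal/gov/approvals/WP76543");

        // Hop 1: who works on the project? Hop 2: which work permits do they hold?
        for (Triple member : store.match("http://xyz.com/project/robotX",
                                         "http://xyz.com/project/member", null)) {
            for (Triple permit : store.match(member.object,
                                             "http://xyz.com/employee/workPermit", null)) {
                System.out.println(member.object + " holds " + permit.object);
            }
        }
    }
}
```

In a real RDF store, the two nested loops would typically be expressed as a single SPARQL query with two triple patterns that share a variable.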
The following table describes the URI structure each system will use to uniquely identify its entities.
Table 1. URI structures
|URI|Description|
|---|---|
|<http://xyz.com>|Refers to XYZ Inc.|
|<http://xyz.com/org/hr>|Refers to HR organization in the company XYZ Inc.|
|<http://xyz.com/employee>|Employee of XYZ Inc.|
|<http://xyz.com/manager>|Manager in XYZ Inc.|
|<http://xyz.com/project>|An XYZ project|
|<http://xyz.com/product>|An XYZ Product|
|<http://xyz.com/project/lead>|Lead of a project|
|<http://xyz.com/project/QA>|QA for a project|
|<http://xyz.com/project/ID>|Information development person for a project in XYZ Inc.|
|<http://xyz.com/org/legal/approvals/OS98765>|Approval ID to use some software|
|<http://xyz.com/org/legal/gov/approvals/WP76543>|Government legal approval for an employee etc.|
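As a rough illustration of how a system might produce URIs that follow this structure, the sketch below appends an identifier to a fixed prefix, as the two approval URIs in the table do. The class name `XyzUriSketch`, its method names, and the ID validation rule are assumptions made for this example. Appending an employee ID to `http://xyz.com/employee` is also an assumption, because the table lists that URI without an ID.

```java
import java.net.URI;
import java.util.regex.Pattern;

/** Illustration only: builds URIs in the style of Table 1. All names are hypothetical. */
class XyzUriSketch {

    private static final String BASE = "http://xyz.com";

    // Assumed rule: IDs consist of letters, digits, underscores, and hyphens only.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_-]+");

    /** Joins the base URI, a path prefix, and an ID after validating the ID. */
    static URI build(String prefixPath, String id) {
        if (id == null || !SAFE_ID.matcher(id).matches()) {
            throw new IllegalArgumentException("Unsupported ID: " + id);
        }
        return URI.create(BASE + "/" + prefixPath + "/" + id);
    }

    /** Matches the table row for software approvals. */
    static URI legalApproval(String approvalId) {
        return build("org/legal/approvals", approvalId);
    }

    /** Matches the table row for government approvals. */
    static URI governmentApproval(String approvalId) {
        return build("org/legal/gov/approvals", approvalId);
    }

    /** Assumption: employee IDs are appended to the employee prefix. */
    static URI employee(String employeeId) {
        return build("employee", employeeId);
    }

    public static void main(String[] args) {
        System.out.println(legalApproval("OS98765"));
        System.out.println(governmentApproval("WP76543"));
        System.out.println(employee("000001")); // hypothetical employee ID
    }
}
```

Keeping URI construction in one place means each system formats its identifiers consistently, which matters because other systems refer to those identifiers verbatim.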
Whenever data is inserted or modified in any of the four existing systems, that system must generate the relevant RDF data, which the Staffing System then loads into its RDF store. Users can then run SPARQL queries on the Staffing System.
Each of the existing systems needs access to the unique IDs (such as URIs) that the other systems generate. For example, when the Projects System refers to a member of a particular project, it must use the unique ID that the HR system generates for the employee. An existing system can obtain a unique URI by querying the new Staffing System.
The existing information in the Org, Projects, HR, and Legal systems must be generated as RDF and then loaded into the new Staffing System. To keep the application simple for this tutorial, assume that each of the existing systems generates an N-Triples file with this data and that the file is available on disk to the Staffing System. Likewise, assume that the existing systems generate an N-Triples file whenever data is inserted or updated in those systems and that this file is also accessible on disk to the Staffing System. (In reality, updates would flow between the systems through an online RESTful service.)
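The file-based handoff described above could look roughly like the following sketch, in which an existing system writes its statements as N-Triples lines to a file on disk. In N-Triples, each line holds one statement of the form `<subject> <predicate> <object> .`, and literal values are written in double quotes. The class name `NTriplesWriterSketch`, the predicate URIs, the project identifier, and the sample values are hypothetical; the sample programs in the tutorial may do this differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Illustration only: writes a few statements as N-Triples lines to a file. */
class NTriplesWriterSketch {

    /** Formats a statement whose object is another resource (a URI). */
    static String uriTriple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    /** Formats a statement whose object is a plain string literal. */
    static String literalTriple(String subject, String predicate, String literal) {
        return "<" + subject + "> <" + predicate + "> \"" + escape(literal) + "\" .";
    }

    /** Escapes the characters that may not appear unescaped in an N-Triples literal. */
    static String escape(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Writes one statement per line, encoded as UTF-8. */
    static void write(Path file, List<String> lines) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical statements that a Projects System might emit.
        List<String> lines = Arrays.asList(
            uriTriple("http://xyz.com/project/robotX",
                      "http://xyz.com/project/member",
                      "http://xyz.com/employee/000001"),
            literalTriple("http://xyz.com/project/robotX",
                          "http://xyz.com/project/name",
                          "Robot \"X\" rollout"));

        Path file = Files.createTempFile("projects", ".nt");
        write(file, lines);
        System.out.println("Wrote " + lines.size() + " triples to " + file);
    }
}
```

Because every line is a complete statement, the Staffing System can load such a file incrementally, and a file produced after an update needs to contain only the new or changed statements.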
This tutorial provides a file called DB2RDFTutorial.zip, which contains a sample Java™ Eclipse project with Java programs. Download it to a local disk and extract the compressed files.
In this tutorial, it is assumed that you will perform all DB2 tasks by using the db2admin authorization ID, which has all required administrative privileges. If you use a different authorization ID, first ensure that it has the required privileges.