The Enterprise Expertise Locator System has to be as flexible as possible while being technology and standards neutral. This system needs to allow other enterprise applications to leverage the Expertise Locator (EL) within their own solutions. This article discusses use cases, solution architecture, technology, data model and key architectural decisions required in building an enterprise expertise location system.
The Expertise Locator's major use cases are depicted in the Use Case Model below. It has the following six actors:
- EndUser -- a user who interacts with the system to locate experts, search the knowledge base or submit a question to the system
- SME -- a business-nominated or self-nominated Subject Matter Expert who is registered within the system to help others. An SME may also be derived from 'social media' within the enterprise. Social media includes publications, wikis, blogs, patents, conference presentations, etc. For example, a person who blogs extensively on cloud computing and/or files patents related to cloud computing could potentially be an expert in cloud computing.
- Administrator -- a user who manages the Expertise Area taxonomy and is usually a stakeholder with a vested interest to ensure taxonomies are valid and experts are enrolled to help others.
- QA Manager, Question Router, Collaboration Broker -- system actors who provide questions management and routing as well as instant messaging collaboration services.
Use Case Model
Service Component Architecture (SCA) was selected as the primary architectural model for the Expertise Locator. One of the key requirements was for the Expertise Locator to be as flexible as possible while remaining technology and standards neutral, so that other enterprise applications can leverage the Expertise Locator within their own solutions. SCA has emerged as a standard in SOA and provides a way to decouple service implementation from service assembly. This allows developers to focus on business logic while providing bindings for multiple technologies. The Architectural Model diagram illustrates the solution architecture and its components.
One of the major components in Expertise Locator is Taxonomy Administration. The following terms were defined:
- Expertise Area -- a classification based on professional or business-unit alignment; for example, Industry Consulting, Software Products, or Sales Teams.
- Taxonomy -- a classification of a high-level Expertise Area. Each taxonomy is divided into topics of knowledge, which are denoted as Expertise.
- Expertise -- a topic of knowledge that can be further divided into more refined sub-topics, creating "children" in a hierarchical taxonomy of Expertise.
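The hierarchical taxonomy of Expertise described above can be sketched as a simple tree. This is an illustrative model only; the class and method names are assumptions, not the actual Expertise Locator schema.

```java
// Illustrative sketch of a hierarchical Expertise taxonomy: each node is a
// topic of knowledge whose children are more refined sub-topics.
import java.util.ArrayList;
import java.util.List;

public class ExpertiseNode {
    private final String name;
    private final List<ExpertiseNode> children = new ArrayList<>();

    public ExpertiseNode(String name) { this.name = name; }

    // Creates a child sub-topic and returns it so deeper levels can be added.
    public ExpertiseNode addChild(String childName) {
        ExpertiseNode child = new ExpertiseNode(childName);
        children.add(child);
        return child;
    }

    // Depth-first search for a topic anywhere in this subtree.
    public boolean contains(String topic) {
        if (name.equals(topic)) return true;
        for (ExpertiseNode c : children) {
            if (c.contains(topic)) return true;
        }
        return false;
    }

    public String getName() { return name; }
}
```

A root node plays the role of an Expertise Area, and each level below it refines the topic further.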
The Expert Management component provides a mapping service that maps an SME to the expertise defined in Taxonomy Administration. SMEs can also specify free-tag expertise outside the defined expertise taxonomies, or automatically populate expertise tags through API calls that retrieve tags based on the SME's social media activities inside the enterprise.
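The dual mapping described above -- taxonomy-defined expertise alongside free-form tags -- can be sketched as follows. The class and method names are hypothetical, for illustration only.

```java
// Illustrative sketch of the Expert Management mapping: each SME is mapped
// to taxonomy-defined expertise and may also carry free-form tags.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExpertRegistry {
    private final Map<String, Set<String>> taxonomyExpertise = new HashMap<>();
    private final Map<String, Set<String>> freeTags = new HashMap<>();

    // Maps an SME to an expertise defined in the taxonomy.
    public void mapExpertise(String smeId, String expertise) {
        taxonomyExpertise.computeIfAbsent(smeId, k -> new HashSet<>()).add(expertise);
    }

    // Records a free-tag expertise outside the defined taxonomies.
    public void addFreeTag(String smeId, String tag) {
        freeTags.computeIfAbsent(smeId, k -> new HashSet<>()).add(tag);
    }

    // Union of taxonomy-defined and free-tagged expertise for an SME.
    public Set<String> allExpertise(String smeId) {
        Set<String> all = new HashSet<>(taxonomyExpertise.getOrDefault(smeId, Set.of()));
        all.addAll(freeTags.getOrDefault(smeId, Set.of()));
        return all;
    }
}
```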
Another Expertise Locator component provides expertise search and taxonomy browse services to locate SMEs registered in the Expertise Locator system. APIs are called to determine the social proximity and social path from a user to the SMEs and to locate "potential experts" -- people not registered in the system but presumed to be experts based on their social media activities (for example: publications, patents, tags assigned to them, etc.).
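Social proximity between a user and an SME amounts to a shortest-path computation over an enterprise social graph. The sketch below shows the idea with a plain breadth-first search; the graph representation is an assumption, not the actual API the Expertise Locator calls.

```java
// Illustrative social-proximity sketch: the number of hops between a user
// and an SME in an undirected enterprise social graph, found via BFS.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SocialProximity {
    private final Map<String, List<String>> graph = new HashMap<>();

    // Records an undirected social connection between two people.
    public void connect(String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns the number of hops from user to sme, or -1 if unreachable.
    public int distance(String user, String sme) {
        if (user.equals(sme)) return 0;
        Set<String> seen = new HashSet<>(List.of(user));
        Deque<String> queue = new ArrayDeque<>(List.of(user));
        int hops = 0;
        while (!queue.isEmpty()) {
            hops++;
            // Process one BFS level per hop count.
            for (int i = queue.size(); i > 0; i--) {
                for (String next : graph.getOrDefault(queue.poll(), List.of())) {
                    if (next.equals(sme)) return hops;
                    if (seen.add(next)) queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

The path itself (the "social path" of intermediaries) falls out of the same traversal by recording each node's predecessor.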
The following technologies (products) were selected for implementation:
- Application Server (WebSphere Application Server)
- RDBMS/Search -- The solution was implemented with DB2/DB2 Net Search Extender, but any RDBMS with search capability, or an RDBMS together with an information retrieval library such as Apache Lucene, could be used.
- Instant Messaging (Lotus Sametime)
- Service Component Architecture (Apache Tuscany)
- AJAX toolkit (Dojo)
- Application Framework (Spring)
The Architectural Fragment Diagram below illustrates the relevant architecture.
Architectural Fragment Diagram
The Enterprise Expertise Locator data model comprises entities representing the user profile, expert profile, expertise taxonomy structure, skills dictionary, previously asked questions, and instant messaging exchanges. The data model also includes various mapping tables (e.g. experts to skills or expertise) and access control lists to manage different levels of the expertise taxonomy. The Data Model Diagram shows the high-level data model for Expertise Locator.
Data Model Diagram
What technologies should be adopted for implementing the Expertise Locator services to provide flexibility of different access bindings from remote sites within the enterprise? For example, a mobile application enabling the sales force may display experts on a particular software product. The data about the expert will be provided by Expertise Locator, but the communication protocol or binding between Expertise Locator and the mobile application may not be known when Expertise Locator is implemented. Please note that these architectural decisions were made before implementation began.
Expertise Location data needs to be accessed and/or updated from multiple sources (or sites) within an Enterprise. These different sites may implement different technologies or languages to access the Expertise Locator service to retrieve data or make updates. For example, a remote site may implement Dojo and need to access the Expertise Locator service via JSON-RPC, another site may want to access the EL service via SOAP based web services, another site via HTTP, or another via JMS, RSS, ATOM, etc.
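The binding-neutral design described above keeps business logic behind a plain service interface, with each protocol (JSON-RPC, SOAP, JMS, etc.) implemented as a thin adapter around it. The sketch below illustrates the shape of that separation; the interface and adapter names are hypothetical, not the actual EL API.

```java
// Illustrative sketch of binding neutrality: the business logic knows
// nothing about transports, and each binding is a thin adapter over it.
public class BindingSketch {

    // The service contract that every binding exposes.
    interface ExpertLookup {
        String findExpert(String topic);
    }

    // Business logic only -- no protocol handling here.
    static class ExpertLookupImpl implements ExpertLookup {
        public String findExpert(String topic) {
            return "Cloud Computing".equals(topic) ? "jdoe" : "none";
        }
    }

    // One binding adapter: translates a protocol-specific request into a
    // plain method call and wraps the result in the protocol's envelope.
    // A real JSON-RPC binding would parse a full JSON-RPC envelope first.
    static String jsonRpcBinding(ExpertLookup service, String topic) {
        return "{\"result\":\"" + service.findExpert(topic) + "\"}";
    }
}
```

Adding a new transport means adding a new adapter; the `ExpertLookupImpl` business logic is untouched, which is exactly the decoupling SCA's pluggable bindings provide declaratively.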
- Adoption of standards.
- Taking advantage of newer technology for rapid development.
- Adaptability to infrastructure changes.
- Services implemented in a way that allows any presentation technology to access them -- in other words, loose coupling between the presentation tier and the services implementation.
Service Component Architecture (SCA) has emerged as a standard in SOA. The promise of SCA is that developers need only focus on developing business logic, while SCA provides a single programming model for invoking a component. SCA also allows a single component to have multiple configurable bindings.
- Fabric3. High Risk -- An open source implementation of the SCA standard focused on automated service provisioning and management. However, it is still in Beta.
- Apache Tuscany. Low Risk -- Supported by WebSphere Application Server 6.x and above. It is open source and simplifies the task of developing SOA solutions by providing a comprehensive infrastructure for SOA development and management based on the Service Component Architecture (SCA) standard. It enables service developers to create reusable services that contain only business logic. Protocols are pushed out of the business logic and handled through a wide range of pluggable bindings. Applications can easily adapt to infrastructure changes without recoding, since protocols are handled via pluggable bindings and qualities of service (transactions, security) are handled declaratively. SCA provides a lightweight runtime. The Tuscany runtime supports integration with Java, BPEL, Java EE, Spring, etc., as defined by the SCA specifications. Additionally, the Apache Tuscany development team is implementing support for a JAX-RS binding to make RESTful API implementation easier.
Apache Tuscany Diagram
Note: An Enterprise Service Bus (ESB) is typically used to allow software components to find and invoke each other irrespective of the technology used in their implementation. The ESB does not generally describe the composite application as a whole; it sits between components and manages message transfers from one component to another. In contrast, Tuscany and SCA allow one to describe an entire composite application. However, Tuscany can easily integrate with an ESB to exploit its inter-service message routing and transformation capabilities. Tuscany provides excellent support for a wide variety of service interaction patterns between SCA components. The patterns that describe the locality of the interacting components are:
- Local: The interacting components run in the same JVM. Tuscany uses in-memory communication.
- Remote: The interacting components are in different JVMs, possibly running on different physical machines. A remote binding such as JSON-RPC, Web Services, etc. can be used for the components to interact.

Regardless of locality, the following interaction patterns describe the message exchange style:

- Request Response: The calling component expects an immediate response after each request.
- One Way: The calling component does not wait for a response after a request.
- Conversational: The calling component sends a series of related requests that are associated with each other by means of a common context maintained between the calling and called components.
- Call-back: A call is made from the called component back to the calling component at some point in the future.
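The call-back pattern is the least obvious of these, so here is a minimal sketch of its shape in plain Java: the caller hands the called component a callback to be invoked later, rather than waiting on a return value. The names are illustrative, not SCA's actual callback API.

```java
// Illustrative call-back interaction sketch: the called component answers
// through a callback supplied by the caller instead of a return value.
import java.util.function.Consumer;

public class CallbackSketch {

    // The "called" component: accepts a question and, at some later point,
    // invokes the caller-supplied callback with the answer.
    static void askExpert(String question, Consumer<String> callback) {
        String answer = "Answer to: " + question;
        // In a real system this invocation would happen asynchronously,
        // possibly much later and from a different component instance.
        callback.accept(answer);
    }
}
```

In SCA proper, the callback is declared on the service interface and the runtime routes the return call over the same binding, but the control flow is the same as above.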
- WebSphere® Process Server -- A vendor-specific implementation. WPS is comparatively heavyweight, and there are no requirements for complicated workflows in the Expertise Locator system.
Go forward with Apache Tuscany. There is tremendous support from the IBM Tuscany development community, and adopting Apache Tuscany will enable rapid development, which will help with meeting schedules, standards adoption, and support for multiple bindings. The Apache Tuscany Diagram above shows how multiple external sources may interact with the Enterprise Expertise Locator services and how these services may integrate with an ESB and/or external sources.
Based on requirements, it has been determined that end users will need a web-based user interface to search for expertise, register themselves as experts, or search for potential experts. They also need a user interface to create communities for group-based expertise discovery and to view reports. What set of technologies should be used to develop the web-based user interface?
- Adoption of Enterprise Strategic recommendations.
- Taking advantage of the larger Enterprise community asset contributions.
- Improved user experience
- Google Web Toolkit (GWT). Unlike Dojo, GWT does not provide a large number of out-of-the-box widgets. Enterprise developers either have to spend a considerable amount of time developing basic widgets or rely on third-party vendors.
- jQuery. jQuery is not as structured as Dojo; it is better suited for developing individual widgets than an entire application presentation tier.
Go forward with Dojo for web presentation tier implementation.
Which framework and technology is better suited for server-side development of an Enterprise Expertise Location system?
Use an industry-standard, lightweight solution for building a J2EE application that integrates easily with Tuscany and offers testability and loose coupling.
- Spring Framework/Spring JDBC. Low Risk -- A widely used and up-to-date open source framework. The Expertise Locator database was mostly intended for data retrieval, and Spring JDBC provides complete persistence functionality without the overhead of ORM. Spring also provides a consistent exception hierarchy that converts database-vendor-specific checked exceptions into more consistent runtime exceptions, which can be caught in a single, specified layer of the application.
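The exception-translation idea behind Spring's data-access hierarchy can be sketched without Spring itself: a vendor-specific checked `SQLException` is converted into an unchecked exception so callers are not forced to catch it at every layer. `DataAccessException` below is a stand-in for illustration, not Spring's actual class.

```java
// Illustrative sketch of checked-to-runtime exception translation, the
// pattern Spring JDBC applies to vendor-specific SQLExceptions.
import java.sql.SQLException;

public class ExceptionTranslation {

    // Stand-in for Spring's DataAccessException hierarchy (unchecked).
    static class DataAccessException extends RuntimeException {
        DataAccessException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Simulates a JDBC call failing with a vendor-specific checked exception.
    static String queryExpert(String id) throws SQLException {
        throw new SQLException("ORA-00942: table or view does not exist");
    }

    // The data-access layer translates the checked exception into a runtime
    // one, so only the layer that cares about persistence failures catches it.
    static String findExpert(String id) {
        try {
            return queryExpert(id);
        } catch (SQLException e) {
            throw new DataAccessException("Lookup failed for " + id, e);
        }
    }
}
```

Callers of `findExpert` compile without try/catch blocks, and a single exception-handling layer can deal with all persistence failures uniformly.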
- Hibernate/JPA. Low Risk -- The Expertise Location database was mostly intended for data retrieval, so using ORM products like Hibernate would not provide enough benefit to justify their overhead.
Go forward with Spring/Spring JDBC.
The first part described how SOA patterns were applied to the Expertise Location system requirements and how a reference architecture was derived. In this article you learned about the solution architecture, data model, and implementation technologies needed for building a flexible, technology and standards neutral Enterprise Expertise Location System.
Murali Vridhachalam is an Open Group Certified Distinguished Architect and a member of the IBM Academy of Technology with 17 years of experience in the IT industry. He has published a reference architecture for Enterprise Expertise Location.