IBM WebSphere Extended Deployment is made up of three components:
- Operations Optimization provides application virtualization and features for improving the resiliency of a middleware infrastructure.
- Compute Grid is an enterprise grid and batch computing infrastructure.
- Data Grid is a caching and data-access infrastructure for eXtreme Transaction Processing (XTP) style applications.
This article describes how Compute Grid technology combines high-performance computing (HPC), eXtreme transaction processing (XTP), grid, and utility computing to form the foundation for next-generation batch processing.
What is batch processing?
Batch processing is a fundamental component of enterprise application infrastructures. The purpose of this computing paradigm is to process a tremendous amount of data within a constrained, and often narrowing, period of time. Batch applications are non-interactive; their counterpart is online transaction processing (OLTP), which typically has clients or users waiting at a terminal for a response.
Businesses depend on the successful execution of their batch workloads. For example, banks must reconcile all of the day's financial activities before a new business day can begin. Insurance companies must sift through terabytes of data to quantify the risk (and therefore the price) of insurance policies. Credit card companies must apply interest calculations to hundreds of thousands of accounts at the end of each billing period. Financial institutions generally must calculate credit scores, adjust billing rates, deliver statements, generate reports, and so on. Without these activities, the respective businesses could not function. Batch processing is, therefore, a fundamental, mission-critical component of the business and its IT infrastructure.
A key concept of SOA is reusing assets across the enterprise, which comprises both OLTP and batch workloads. J2EE and similar technologies have focused on OLTP problems; as these newer middleware technologies were adopted, batch applications were left behind, splitting batch and OLTP into separate silos. A recently released Gartner study (see Resources) on batch processing and its role in SOA notes that "business function used in online transactions may be the same business function used in batch processes, so organizations should think about their IT modernization strategy and consider SOA as a standardized application integration mechanism". A key challenge now is how to effectively bring these disparate worlds back together to deliver an agile, efficient, and high-performance middleware infrastructure.
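To make the reuse point concrete, here is a minimal sketch of one business function shared by an OLTP-style caller and a batch-style caller. All names (`InterestCalculator`, the 18% rate, and so on) are invented for illustration; this is not a WebSphere API.

```java
// Hypothetical sketch: one business rule reused by OLTP and batch callers,
// rather than duplicated in two silos. Names and rates are illustrative.
public class InterestCalculator {

    // Core business rule, written once.
    public static double monthlyInterest(double balance, double annualRate) {
        return balance * (annualRate / 12.0);
    }

    // OLTP-style caller: one account, one interactive request.
    public static double handleOnlineRequest(double balance) {
        return monthlyInterest(balance, 0.18);
    }

    // Batch-style caller: the same rule applied to a large set of accounts.
    public static double[] runBatch(double[] balances) {
        double[] results = new double[balances.length];
        for (int i = 0; i < balances.length; i++) {
            results[i] = monthlyInterest(balances[i], 0.18);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(handleOnlineRequest(1000.0)); // single interactive call
        System.out.println(runBatch(new double[] {1000.0, 2000.0}).length); // 2
    }
}
```

Either execution style can change independently, but the business function itself stays a single enterprise asset.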
Origins of Compute Grid
Several new computing paradigms have emerged that solve specific batch-related issues. However, used independently, these computing paradigms often fail to take into account enterprise requirements that mission-critical, batch applications must meet. The requirements for batch go beyond the simple task of quickly processing data. A tremendous amount of infrastructure has been built around existing batch systems, including:
- Job schedulers with operational plans that describe end-to-end job flows
- Disaster recovery procedures
- Archiving and auditing mechanisms for data logs
- Resource utilization for chargeback
- Goal-oriented execution and concurrent execution with OLTP
- Operational control
- Transactional data access
Next generation batch systems must take into account the requirements from sophisticated batch customers and leverage their existing infrastructures as best they can while incorporating the best practices and lessons learned from the emerging paradigms of HPC, XTP, grid computing, and utility computing. These new infrastructures must adapt and easily incorporate future technologies. The applications hosted within this infrastructure must leverage modern tooling and programming best practices. To summarize, next generation batch processing systems must be agile enough to meet requirements from customers of all shapes and sizes.
Compute Grid delivers an enterprise grid and batch processing infrastructure that has incorporated several key best practices from these computing paradigms. More importantly, it has been significantly influenced through development partnerships with customers that are heavily dependent on batch. Figure 1 illustrates the influences of Compute Grid.
Figure 1. The origins and influences of Compute Grid
HPC and Compute Grid
Two important HPC patterns have been incorporated into Compute Grid. The dispatcher-worker pattern facilitates a highly scalable computing environment. The divide-and-conquer pattern is critical for high performance.
The dispatcher-worker pattern has two key components: the dispatcher and a set of workers. The role of the dispatcher is to manage the execution of some work across a collection of workers; work is performed within the worker component. As more work enters the system, the dispatcher can dynamically scale up the infrastructure by starting additional worker instances. Figure 2 shows the topology of the dispatcher-worker pattern.
Figure 2. The dispatcher-worker pattern
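The pattern in Figure 2 can be sketched with plain Java concurrency utilities: a dispatcher hands units of work to a pool of workers and collects the results. This only illustrates the pattern; it is not the actual Job Scheduler or GEE implementation, and the doubling "work" is a stand-in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Minimal dispatcher-worker sketch: the dispatcher submits work items to a
// pool of worker threads and gathers the results in submission order.
public class DispatcherWorker {

    public static List<Integer> dispatch(List<Integer> workItems, int workers) {
        // The "dispatcher": a pool whose size can grow as load increases.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Integer item : workItems) {
                // Each "worker" processes one unit of work on its own thread.
                futures.add(pool.submit(() -> item * 2));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(List.of(1, 2, 3), 2)); // [2, 4, 6]
    }
}
```

Scaling up under load (Figure 3) corresponds to growing the worker pool, or in Compute Grid's case, starting additional GEE instances.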
Figure 3 shows the scaling of workers as the load in the system increases.
Figure 3. Scaling the dispatcher-worker-based infrastructure as load increases on the system
The WebSphere Compute Grid architecture is, in fact, based on the dispatcher-worker pattern. The dispatcher component is called the Job Scheduler (JS), and the worker component is called the Grid Execution Environment (GEE); both the Job Scheduler and the GEE are multi-threaded application servers. As load increases on the system, the Job Scheduler can start additional GEEs to achieve a stated service-level agreement and meet demand. The Job Scheduler can leverage the dynamic load-balancing features of WebSphere Extended Deployment Operations Optimization to balance batch jobs based on the available capacity of the GEEs in the cluster and the business-level execution goals of each batch job. You learn more about Operations Optimization in Grid computing influences.
To submit jobs to the Job Scheduler, you can use Web services, Java Message Service (JMS), an Enterprise JavaBean (EJB), a shell command, or an embedded job console. Figure 4 illustrates the WebSphere Compute Grid topology and the channels for submitting jobs.
Figure 4. Compute Grid topology
The second HPC pattern, divide-and-conquer, is also known as highly parallel execution. With this pattern, a large task is first decomposed into many small tasks which then run concurrently across a collection of resources. For example, a single batch job that must process 100,000,000 records can be broken into 100 jobs that each process 1,000,000 records and executed concurrently. The elapsed time for completing the task becomes a function of the partition size f(1,000,000) versus a function of the total number of records to process f(100,000,000).
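The decomposition step above can be sketched as a simple range partitioner: the 100,000,000-record job becomes 100 sub-jobs of 1,000,000 records each. The class and record names here are illustrative, not part of the product.

```java
import java.util.ArrayList;
import java.util.List;

// Divide-and-conquer sketch: split a large record range into fixed-size
// partitions that could then be dispatched to run concurrently.
public class RangePartitioner {

    // A partition is the half-open record range [start, end).
    public record Partition(long start, long end) {}

    public static List<Partition> partition(long totalRecords, long partitionSize) {
        List<Partition> partitions = new ArrayList<>();
        for (long start = 0; start < totalRecords; start += partitionSize) {
            partitions.add(new Partition(start, Math.min(start + partitionSize, totalRecords)));
        }
        return partitions;
    }

    public static void main(String[] args) {
        // 100,000,000 records in partitions of 1,000,000 -> 100 sub-jobs.
        System.out.println(partition(100_000_000L, 1_000_000L).size()); // 100
    }
}
```

With the partitions running concurrently, elapsed time tracks the partition size f(1,000,000) rather than the total record count f(100,000,000), as the text describes.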
Each batch job within Compute Grid runs on a single managed thread. A managed thread is one that has metadata associated with it, such as the security context used for authentication and authorization, the transaction context used by the application server for data integrity, and so on. Therefore, 100 parallel jobs execute on 100 threads across the collection of multi-threaded GEEs. This relationship of one job per managed thread is important to know, especially for z/OS customers, because this model minimizes the resources consumed by the batch runtime. For example, one job per process consumes many more resources than one job per thread.
In addition, this threading model hides the complexities of multi-threaded batch jobs from the developer. Batch application developers write code that is thread-safe and executes on a single thread. Multiple instances of that application can be dispatched using the Parallel Job Manager (PJM), where each partition processes a different section of the data. Alternative solutions impose this complexity of multi-threaded programming on the developer.
A difficult challenge in the area of highly-parallel batch processing is operational control and managing the many jobs that are executing concurrently. For example, to stop a highly-parallelized batch job, an administrator should not need to manually stop each of the hundreds or thousands of jobs running in parallel. Instead, the administrator should have better operational control and stop the logical job, where the infrastructure would stop the many jobs running in parallel. Furthermore, upon completion, the infrastructure should aggregate the many job logs that have been produced rather than relying on the use of a tool or requiring an administrator to manually collect these logs.
Using the Parallel Job Manager
Compute Grid includes a feature called Parallel Job Manager (PJM) which you can use to define rules for decomposing large jobs into many small jobs. You install these rules into the PJM, which decomposes large jobs using your rules and enables operational control. Figure 5 shows the Compute Grid infrastructure including the PJM, and shows the general flow of execution for the PJM. The set of diagrams that follows illustrates the type of operational control the PJM provides.
Figure 5. General flow of execution for the Parallel Job Manager (PJM)
The general work flow using the PJM is:
- A large, single job is submitted to the Job Scheduler in Compute Grid.
- The Job Scheduler passes the request to the PJM, which uses specified partition rules to decompose the large job into many smaller partitions.
- The PJM dispatches those partitions across the cluster of GEEs.
- The cluster of GEEs run the job partitions, applying qualities of service such as check pointing, job restart, transaction management, security, and so on.
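The checkpointing quality of service mentioned in the last step can be sketched as follows: progress is committed every N records so that a restarted partition resumes from its last checkpoint instead of reprocessing everything. This is an illustration of the concept only; the names, and the in-memory checkpoint (which the real runtime would persist transactionally), are assumptions.

```java
// Hypothetical sketch of checkpoint/restart for one job partition: the
// restart position is recorded every checkpointInterval records.
public class CheckpointedStep {

    private long lastCheckpoint = 0;          // restart position (persisted in practice)
    private final long checkpointInterval;

    public CheckpointedStep(long checkpointInterval) {
        this.checkpointInterval = checkpointInterval;
    }

    // Processes records [lastCheckpoint, totalRecords), committing a
    // checkpoint each interval; returns the number of checkpoints taken.
    public int run(long totalRecords) {
        int checkpoints = 0;
        for (long i = lastCheckpoint; i < totalRecords; i++) {
            // ... process record i within the current transaction ...
            if ((i + 1) % checkpointInterval == 0) {
                lastCheckpoint = i + 1;       // commit work, persist position
                checkpoints++;
            }
        }
        return checkpoints;
    }

    public static void main(String[] args) {
        CheckpointedStep step = new CheckpointedStep(1_000);
        System.out.println(step.run(10_500)); // 10 checkpoints
    }
}
```

A failed partition restarted from `lastCheckpoint` repeats at most one interval of work, which is what makes restart of long-running partitions practical.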
Figure 6 shows the flow for monitoring many parallel batch jobs through the PJM.
Figure 6. Monitoring highly-parallelized batch jobs
The steps for monitoring highly parallelized batch jobs are:
- Execution status of the many parallel jobs is reported to the PJM from the collection of GEEs.
- The PJM persists the status data in its database tables.
- The administrator or job submitter monitors the progress of the logical job that represents the collective summary of the job partitions.
Figure 7 describes the flow for executing commands such as: stop, restart, and cancel for a logical job.
Figure 7. Stop/Restart/Cancel of the logical job and its many job partitions
To run a command for a logical job:
- The submitter issues a stop, restart, or cancel command of the logical job using one of several channels to the Job Scheduler.
- The PJM determines which job partitions are associated with the logical job.
- The PJM issues the command for each of the job partitions associated with the logical job.
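The fan-out in steps 2 and 3 can be sketched as a lookup-then-dispatch: the PJM resolves the logical job to its partition ids and issues the same command to each. The class, ids, and string-based "commands" here are invented for illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of logical-job command fan-out: one stop/restart/cancel
// command on the logical job becomes one command per job partition.
public class LogicalJobControl {

    // logical job id -> ids of its running partitions
    private final Map<String, List<String>> partitionsByJob;

    public LogicalJobControl(Map<String, List<String>> partitionsByJob) {
        this.partitionsByJob = partitionsByJob;
    }

    // Returns the per-partition commands that would be dispatched to the GEEs.
    public List<String> issue(String logicalJobId, String command) {
        return partitionsByJob.getOrDefault(logicalJobId, List.of()).stream()
                .map(partitionId -> command + " " + partitionId)
                .toList();
    }

    public static void main(String[] args) {
        LogicalJobControl pjm = new LogicalJobControl(
                Map.of("job-1", List.of("job-1:p0", "job-1:p1")));
        System.out.println(pjm.issue("job-1", "stop"));
        // [stop job-1:p0, stop job-1:p1]
    }
}
```

The submitter only ever names the logical job; the mapping to hundreds or thousands of partitions stays inside the infrastructure.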
Figure 8 describes the flow for how the PJM aggregates the job logs for the job partitions.
Figure 8. Aggregating the logs for the job partitions
To aggregate the job logs:
- PJM determines which job partitions are associated with the logical job and retrieves the job logs for each partition.
- PJM then aggregates the many job logs into a single log.
- Job Dispatcher stores the single job log in a location where the submitter can access it.
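The aggregation step can be sketched as a simple concatenation that tags each partition's log with its id, so the submitter reads one logical-job log. The delimiter format is invented; the product's actual log layout may differ.

```java
import java.util.List;

// Sketch of job-log aggregation: per-partition logs are combined into a
// single logical-job log, each section labeled with its partition id.
public class LogAggregator {

    public static String aggregate(List<String> partitionIds, List<String> partitionLogs) {
        StringBuilder combined = new StringBuilder();
        for (int i = 0; i < partitionIds.size(); i++) {
            combined.append("=== ").append(partitionIds.get(i)).append(" ===\n");
            combined.append(partitionLogs.get(i)).append("\n");
        }
        return combined.toString();
    }

    public static void main(String[] args) {
        System.out.print(aggregate(
                List.of("p0", "p1"),
                List.of("processed 1000 records", "processed 1000 records")));
    }
}
```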
XTP and Compute Grid
The domain of XTP-style applications is broad and continuously evolving. This article focuses on one of many important principles in this domain: bring the data as close to the business logic as possible.
Achieving this principle requires implementing:
- Aggressive caching across an N-tier caching infrastructure.
- Affinity routing which is the intelligent routing of work requests to their data.
- Decreasing data-access latency using the map-reduce pattern coupled with in-memory databases.
Examples of some of the numerous challenges to address are:
- Coordinating the invalidation of stale cache data, as well as the replication of updated data, across an N-tier cache.
- Dynamic repartitioning of data based on resource usage (memory, CPU, and so on) of data servers.
- Devising methodologies for partitioning the data to maximize cache hits.
With batch computing, the proximity of the business logic to the data significantly impacts performance. The closer the data is to the business logic, the faster the workload will run. The two alternatives to change the proximity are:
- Bring the data to the business logic, which can be achieved through caching and affinity routing.
- Bring the business logic to the data, which can be achieved by co-deploying the business logic to the same server or platform as the data. For example, on distributed platforms, use map-reduce coupled with in-memory databases; on System z, co-deploy the business logic for direct access to enterprise data.
There is natural synergy between the concepts and technologies within the XTP domain and batch. Compute Grid can leverage these technologies and principles in several ways. For the first approach (bringing the data closer to the business logic), you can define an N-tier caching infrastructure and associate it with the GEE. Figure 9 depicts this topology using Data Grid.
Figure 9. Bringing the data to the business logic via N-tier caching
For the second approach (bringing the business logic to the data), you can host business logic on the same platform as the data. This typically relates to "big-iron" hardware such as System z and System p. If the data is hosted on System z, the latency for accessing data is lower for applications hosted on the same hardware than for applications hosted on remote systems. Figure 10 illustrates how Compute Grid would fit within WebSphere Application Server for z/OS.
Figure 10. Bringing the business logic to the data via WebSphere for z/OS
You can apply affinity routing to both proximity-narrowing approaches. With affinity routing, the data is broken down into partitions and hosted across multiple locations (Java virtual machines, databases, and so on). The dispatcher is cognizant of the data partitions and their locations; the dispatcher is also aware of the work partitions to which a large request has been decomposed, along with the data needs for each work partition. Work partitions are then dispatched to the locations hosting the corresponding data partitions. Figure 11 shows affinity routing on distributed platforms.
Figure 11. Affinity routing with Compute Grid and Data Grid on distributed platforms
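A minimal sketch of the routing decision: the dispatcher maps a work partition's key to the endpoint that owns the matching data partition, here with a simple hash. The endpoint names are purely illustrative; a real deployment would use the grid's actual partition map rather than a bare hash.

```java
import java.util.List;

// Affinity-routing sketch: the same partition key always routes to the
// same endpoint, so that endpoint's near-cache stays warm for that data.
public class AffinityRouter {

    private final List<String> endpoints;

    public AffinityRouter(List<String> endpoints) {
        this.endpoints = endpoints;
    }

    // Deterministic key -> endpoint mapping.
    public String route(String partitionKey) {
        int index = Math.floorMod(partitionKey.hashCode(), endpoints.size());
        return endpoints.get(index);
    }

    public static void main(String[] args) {
        AffinityRouter router = new AffinityRouter(List.of("gee-1", "gee-2", "gee-3"));
        // Repeated work for the same data partition lands on the same GEE.
        System.out.println(router.route("accounts-A-M").equals(router.route("accounts-A-M"))); // true
    }
}
```

The payoff is exactly the cache-hit effect quantified in the next section: like-requests keep hitting an endpoint whose near-cache already holds their data.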
Figure 12 shows affinity routing on z/OS.
Figure 12. Affinity routing with Compute Grid and DB2 for z/OS
You need to have a strong grasp of the underlying data model and workload patterns to effectively utilize affinity routing, but this approach can significantly improve performance. Matching work partitions to their data partitions will increase the probability of a cache hit within the server and decrease remote accesses to the database, which introduces quite a bit of latency. Figures 13-15 illustrate the advantages of higher cache hits with a quantitative model. With this model, you can simulate the effects a higher near-cache hit will have on the average data access time.
Figure 13. Quantitative model for data access time in an N-tier caching hierarchy
The quantitative model in Figure 13 describes a method for calculating the average data access time given the probability distribution and service rates for each of the tiers of the N-tier cache. In this case, there are three tiers to the cache:
- The near-cache, a cache hosted in the same JVM as the business logic
- The Data Grid Server cache, an object server cache residing within close proximity of the JVM executing the business logic
- The database, which typically will have the slowest access time
To compute the data access time, you need to know the probabilities for the data existing in the near-cache, in the server cache, and in the database. With that information you can model and calculate the data access time.
Consider the specific example shown in Figure 14. In this example, 30% of the data is located in the near-cache, 70% of the data resides in the object server cache, and the rest of the data is retrieved from the database. After applying the computations, the estimated data access time works out to be ~47 ms.
Figure 14. A concrete example of a quantitative model for modeling data access time in an N-tier caching hierarchy
The example shown in Figure 15 doubles the probability of the data existing in the near-cache, which you could achieve by either increasing the size of the near cache, or establishing affinity routing and sending like-requests to the same server and cache. The effect of increasing the probability of a cache hit in the near cache is a 42% improvement in data access time.
Figure 15. Increasing the probability of a near-cache hit, either through larger near caches or affinity routing, reduces the average data access time.
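The model in Figures 13-15 reduces to a weighted average: each tier's access time multiplied by the probability of finding the data in that tier. The latencies and probabilities below are assumed for illustration only (they are not the figures' actual inputs, so the improvement here is ~23% rather than the 42% in the text), but the mechanism is the same: shifting probability mass toward the near-cache lowers the average.

```java
// N-tier cache model sketch: average access time as a probability-weighted
// sum over the tiers. All numeric inputs below are assumptions.
public class CacheAccessModel {

    public static double averageAccessTime(double pNear, double pServer, double pDb,
                                           double tNearMs, double tServerMs, double tDbMs) {
        return pNear * tNearMs + pServer * tServerMs + pDb * tDbMs;
    }

    public static void main(String[] args) {
        double tNear = 1.0, tServer = 10.0, tDb = 100.0; // assumed latencies (ms)

        double base    = averageAccessTime(0.30, 0.65, 0.05, tNear, tServer, tDb);
        double doubled = averageAccessTime(0.60, 0.35, 0.05, tNear, tServer, tDb);

        System.out.println(base);                    // ~11.8 ms
        System.out.println(doubled);                 // ~9.1 ms
        System.out.println((base - doubled) / base); // ~0.23 improvement
    }
}
```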
Combining the Parallel Job Manager (PJM), N-tier caching structures, and affinity routing provides a powerful, high-performance compute grid. The PJM decomposes large work requests into smaller partitions. These smaller partitions are divided in a way that matches the location of the data in the near-caches. The Job Dispatcher then intelligently dispatches the job partitions, working to ensure that each job partition runs on a grid endpoint where its data is already loaded in memory, increasing the probability of a cache hit.
The underlying technologies that facilitate XTP-style applications are crucial components for high-performance systems. This is especially true for batch processing, where applications must process vast amounts of data within some narrowing window of time. In addition, the location of the data and the platform on which the business logic runs influences how XTP technologies are adopted in the data center.
Grid computing influences
The term grid computing has become overloaded. To some, it describes an architectural pattern similar to SETI@Home and other scientific computing initiatives, where well-defined data packets are bundled with the business logic and dispatched to a system such as a desktop where every available CPU cycle is utilized. In these grid environments, there are tens of thousands of nodes available to run the business logic, minimal qualities of service, and redundancy where the same work is dispatched to multiple machines, helping to assure the computations will be completed. The data to be processed is typically not confidential or sensitive, therefore there are minimal security requirements; and transactional data access is atypical. These models were designed to scale to tremendous sizes.
Within an enterprise, data access is well-controlled; data access is frequently transactional; there are finite resources on which the business logic can run; and there are stringent operational requirements in terms of managing and monitoring of resources, managing execution logs for auditing, and so on. Enterprise grid computing blends the approaches of scientific grid computing with enterprise application infrastructures to provide a secure, transaction-oriented, manageable, and highly scalable grid execution environment.
Compute Grid is an enterprise grid computing runtime. It delivers features for operational control and management of a collection of highly scalable grid endpoints, while enforcing the previously mentioned enterprise requirements: security, efficient resource management, transactional integrity, and so on.
Utility computing influences
The domain of utility computing involves service provisioning, monitoring of consumed system resources, and the enforcement of business-level objectives. Utility computing has been an integral part of the mainframe and has now become an emerging area of distributed computing. Compute Grid has incorporated many utility computing features. Specifically, it includes:
- A goals-oriented execution environment where business-level service agreements dictate the execution priority of the business logic.
- Resource usage and monitoring where the resources consumed by the business logic can be quantified and the owner of the logic can be billed for the exact amount of resources consumed.
On the mainframe, Compute Grid integrates with the z/OS workload manager (zWLM) to provide these functions. You can associate Compute Grid batch jobs with zWLM transaction classes and zWLM service policies; zWLM then works with the z/OS operating system to manage hardware and software allocations to ensure the service policies are achieved. If more resources are needed to achieve some service policy, zWLM can, for example, dynamically provision more application server processes or adjust the dispatching priority for the batch job to provide more execution time on the CPU.
On distributed platforms, where zWLM is not available, Compute Grid works with WebSphere Operations Optimization to deliver similar functions. Operations Optimization can enforce business-level service policies, dynamically provision additional server instances within the collection of resources, and provide detailed usage reporting statistics for chargeback.
With utility computing, software-as-a-service models become possible. Figure 16 illustrates an example of batch-as-a-service, where the Compute Grid infrastructure is used to run batch jobs for various clients, keeping track of the resource usage and providing charge-back data. The infrastructure, which would have been a cost-center, suddenly becomes part of the business model for the enterprise.
Figure 16. Compute Grid leverages the principles of utility computing and enables data centers to provide batch as a service.
The flow for this batch-as-a-service example is:
- Smaller banks submit requests to the Job Scheduler hosted in the data center.
- The Job Scheduler executes the workloads for each bank, keeping track of exactly how many resources each bank's jobs used. On distributed platforms, this functionality is achieved by integrating with Operations Optimization. On z/OS platforms, the usage accounting facilities of zWLM, RMF, and other system facilities of z/OS are leveraged.
- At the end of the month, the data center sends bills for services rendered, based on the exact CPU seconds consumed, to each bank.
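The chargeback bookkeeping in this flow can be sketched as a per-client ledger: CPU seconds (as reported by zWLM/RMF on z/OS, or Operations Optimization on distributed platforms) accumulate per client and are billed at an agreed rate at month end. The class name, rate, and client ids are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical chargeback sketch for batch-as-a-service: accumulate the
// CPU seconds each client's jobs consume, then bill at a flat rate.
public class ChargebackLedger {

    private final Map<String, Double> cpuSecondsByClient = new HashMap<>();

    // Called as each job completes, with the CPU seconds it consumed.
    public void record(String client, double cpuSeconds) {
        cpuSecondsByClient.merge(client, cpuSeconds, Double::sum);
    }

    // Month-end bill: exact CPU seconds consumed times the agreed rate.
    public double bill(String client, double ratePerCpuSecond) {
        return cpuSecondsByClient.getOrDefault(client, 0.0) * ratePerCpuSecond;
    }

    public static void main(String[] args) {
        ChargebackLedger ledger = new ChargebackLedger();
        ledger.record("bank-a", 120.0);
        ledger.record("bank-a", 80.0);
        System.out.println(ledger.bill("bank-a", 0.05)); // 10.0
    }
}
```

Because billing is tied to measured consumption rather than flat fees, the infrastructure shifts from cost center to revenue-bearing service, as the text notes.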
Customer collaborations and partnerships
Enterprise grid computing is emerging as an important computing paradigm, especially for customers in the banking, data warehousing, and insurance industries. Compute Grid has quickly evolved to become a robust enterprise grid computing runtime because of strong partnerships and collaborations with aggressive technology adopters within those industries. Customers in these areas have shared experiences, requirements, and expectations for operational controls, auditing and archiving methodologies, scalability, utility computing, highly parallel job execution, and integration with external schedulers.
For example, many enterprise customers are already using an enterprise scheduler. Therefore the requirement for Compute Grid to integrate with the existing enterprise scheduler was non-negotiable. These customers were not willing to redefine their operational plans within Compute Grid. Likewise, Compute Grid is not in the scheduling business; instead its focus is to run batch and grid workloads quickly while ensuring business-level service agreements are achieved.
Using Extended Deployment as the foundation for enterprise grid computing
You can use the three components of WebSphere Extended Deployment — Operations Optimization, Data Grid, and Compute Grid — individually; however, by integrating them together you get a scalable, resilient, high-performance grid computing platform. Figure 17 shows an architecture that incorporates all three WebSphere Extended Deployment components.
Figure 17. WebSphere Extended Deployment as the foundation for enterprise grid computing
In this integration:
- Job Scheduler tier, which is part of the Compute Grid component, accepts new job submissions, dispatches the jobs to the end points, and provides operational control over the jobs.
- Parallel Job Manager tier, also part of Compute Grid, decomposes large jobs into smaller partitions and provides operational control over partitioned jobs executing across the cluster.
- Grid Execution tier, composed of technology from both Compute Grid and Data Grid, runs the batch jobs. The Data Grid near-cache is the entry point to the N-tier caching infrastructure to facilitate better performance.
- Data Server tier, part of Data Grid, buffers database access by caching objects in-memory for higher performance.
- On-Demand Router (ODR), part of Operations Optimization, provides a goals-oriented execution environment, dynamic provisioning of server resources, and health management.
Independently, each computing paradigm (HPC, XTP, grid, and utility computing) provides unique advantages for enterprise computing. Combined cleverly, they result in a truly scalable, high-performance system capable of addressing the qualities of service that enterprises demand. By bringing together features from each of these computing paradigms, Compute Grid, working together with Operations Optimization and Data Grid, provides the foundation on which to run your mission-critical batch applications.
Resources
- Gartner Study: Batch Processes Can Take Advantage of SOA
- WebSphere Extended Deployment technical resources
- WebSphere on System z technical resources
- Compute Grid
- Compute Grid value proposition and some common use-cases
- Development tooling story with WebSphere Extended Deployment Compute Grid
- Enterprise Java Batch for z/OS with Compute Grid Webcast replay
- Introduction to programming with WebSphere Extended Deployment Compute Grid, from the IBM WebSphere Developer Technical Journal
- Compute Grid "honorable mention" in Jerry Cuomo's blog on WebSphere 2008 technology trends
- WebSphere Extended Deployment Compute Grid and JZOS
- The role of Extended Deployment Compute Grid and Enterprise Schedulers, including Tivoli Workload Scheduler
- Video Demonstrations of WebSphere Extended Deployment Compute Grid
- WebSphere Extended Deployment Compute Grid forum
- WebSphere Extended Deployment Compute Grid Wiki
- Data Grid and XTP
- Operations Optimization
Get products and technologies
- Free trial download: WebSphere Extended Deployment Compute Grid V6.1
- Free trial download: WebSphere Extended Deployment Data Grid V6.1