MapReduce basic concepts
Application
An application is a type of application software in which the business logic is encapsulated in one or more software programs, called services, that are separated from the client logic. Each application is associated with an application profile.
Application profile
The application profile is an XML file that defines the properties of an application, including the name of the service that performs the calculation and the scheduling parameters to apply. It contains runtime parameters for the workload, the service, and the middleware; together, these parameters define how workload is run.
An application profile provides flexibility to dynamically change application parameters without requiring you to change your application code and rebuild the application.
An application profile is associated with an application, which in turn is associated with one consumer. You must register the application profile of every application you want IBM® Spectrum Symphony to manage.
IBM Spectrum Symphony provides a default application profile for its MapReduce framework, named "MapReduceversion" (where version is the version number). This profile is registered by default to the MapReduceConsumer consumer.
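The benefit of keeping parameters in a profile rather than in code can be sketched in Python. This is a minimal illustration only: the element and attribute names (Profile, Service, Scheduling, maxTasksPerHost) and the application and service names are invented for the example and are not the actual IBM Spectrum Symphony profile schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile. Element and attribute names are illustrative only,
# not the real IBM Spectrum Symphony application profile schema.
PROFILE_XML = """
<Profile application="DemoApp">
  <Service name="DemoService"/>
  <Scheduling maxTasksPerHost="4"/>
</Profile>
""".strip()


def load_profile(xml_text):
    """Parse the profile and return the settings read at run time."""
    root = ET.fromstring(xml_text)
    return {
        "application": root.get("application"),
        "service": root.find("Service").get("name"),
        "max_tasks_per_host": int(root.find("Scheduling").get("maxTasksPerHost")),
    }


settings = load_profile(PROFILE_XML)

# Editing the XML changes the parameter; the Python code is not touched.
tuned = load_profile(PROFILE_XML.replace('"4"', '"8"'))
```

Because the scheduling value lives in the XML text, the second call picks up the new value without any change to load_profile or the code that calls it.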
Client application
A client application is a program or executable that needs work done through a service. It submits requests to the service through an API.
Service
A service is a self-contained business function that accepts one or more requests and returns one or more responses through a well-defined, standard interface. The service performs work for a client program. It is a component capable of performing a task, and is identified by a name. IBM Spectrum Symphony runs services on hosts in the cluster.
The service is the part of your application that does the actual calculation. The service encapsulates business logic.
Connection
A connection provides an entry to an application. Once you are connected to an application, you can begin to submit workload by creating a job on that connection.
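The relationship between a client application, a connection, and a service can be modeled with a small in-process Python sketch. The classes and names here (Service, Connection, create_job, SquareService, DemoApp) are hypothetical and do not represent the IBM Spectrum Symphony client API; in a real cluster the service runs on remote hosts rather than in the client process.

```python
class Service:
    """A named business function: accepts a request, returns a response."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def invoke(self, request):
        return self.func(request)


class Connection:
    """Entry point to an application; workload is submitted through it."""

    def __init__(self, application_name, service):
        self.application_name = application_name
        self.service = service

    def create_job(self, requests):
        # Each request becomes one unit of work; the service, not the
        # client, computes the response for it.
        return [self.service.invoke(request) for request in requests]


# Client logic: connect to the application, then submit workload as a job.
square = Service("SquareService", lambda x: x * x)
conn = Connection("DemoApp", square)
responses = conn.create_job([1, 2, 3])
```

The client code contains no calculation of its own; it only builds requests and collects responses, which mirrors the separation of client logic from business logic described above.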
Job
A job is a group of tasks that share common characteristics, such as data. A MapReduce job consists of four types of tasks: setup, map, reduce, and cleanup.
Task
A task is the unit of work that runs on an individual host when workload is running. A task consists of a message request (input) and, when completed by a service, a response (output). Each task in a MapReduce job is one of four types: setup, map, reduce, or cleanup.
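The four task types can be illustrated with a toy word-count job in plain Python. This sketch runs every task sequentially in one process and creates one reduce task per key, which is a simplification made for readability; the function name run_job and the task log are assumptions for the example, not part of any product API.

```python
from collections import defaultdict


def run_job(splits):
    """Run a toy word-count job; return (result, log of task types run)."""
    log = []

    # Setup task: prepare job-wide state before any map task runs.
    log.append("setup")
    intermediate = defaultdict(list)

    # Map tasks: one per input split, each emitting (word, 1) pairs.
    for split in splits:
        log.append("map")
        for word in split.split():
            intermediate[word].append(1)

    # Reduce tasks: one per key in this sketch, summing that key's values.
    result = {}
    for word in sorted(intermediate):
        log.append("reduce")
        result[word] = sum(intermediate[word])

    # Cleanup task: release job-wide state after all reduce tasks finish.
    log.append("cleanup")
    intermediate.clear()
    return result, log


counts, task_log = run_job(["a b a", "b c"])
```

Each entry in task_log corresponds to one task: its input is the request (for example, one split for a map task) and its output is the response.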
Consumer
A consumer is a generalized notion of something that uses resources. MapReduce applications are registered by default to the MapReduceConsumer consumer.
Resource
Resources are physical and logical entities that are used by applications in order to run. CPU slots are the most important resource.
Resource group
A resource group is a logical group of hosts. It can be defined by resource requirements, such as OS, memory, swap space, and CPU factor, or by an explicit list of host names.
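The two ways of defining a resource group can be sketched in Python as follows. The Host fields, function names, and host names are hypothetical and chosen only for illustration; a real cluster evaluates many more attributes than the two shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    os: str
    memory_mb: int


def group_by_requirement(hosts, os, min_memory_mb):
    """Requirement-based group: every host that satisfies the criteria."""
    return [h.name for h in hosts if h.os == os and h.memory_mb >= min_memory_mb]


def group_by_names(hosts, names):
    """Explicit group: only the hosts that are listed by name."""
    wanted = set(names)
    return [h.name for h in hosts if h.name in wanted]


cluster = [
    Host("hostA", "linux", 16384),
    Host("hostB", "linux", 4096),
    Host("hostC", "windows", 32768),
]

compute_hosts = group_by_requirement(cluster, os="linux", min_memory_mb=8192)
listed_hosts = group_by_names(cluster, ["hostB", "hostC"])
```

A requirement-based group changes membership automatically when hosts that match the criteria join or leave the cluster, whereas an explicit group changes only when its list of names is edited.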