Unattended Automation to Manage Big Data in the Cloud
xgiannak
Here at Pulse, the best part for me is the client conversations. Clients' efforts to understand our IBM categories, and mine to understand their scenarios, have led to interesting exchanges and raised some strange questions. Talking to a business partner, I found myself asking, "What is the shape of your computation?" Does it look like a banana or a dolphin? A whale tail or a multi-drop jet? A rhizome or a Pacific atoll map?
Does it matter? It is certainly useful to visualise the general shape of how the business flows unfold. When using a workload automation tool, each action becomes a unit of work. These units of work are linked together by the conditions and dependencies that sequence their execution in the right order. When large graphs of such units are built and executed, the layout of thousands of small units of work can take the most diverse shapes, and that shape tells something about what is being accomplished. The case of this Business Partner and his project with Big Data and massively parallel micro-ETLs is no exception to that rule.
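To make the idea of a "shape" a little more concrete, here is a minimal Python sketch. It is not how any particular workload automation product works; the flow, the unit names and the helper functions are all hypothetical, invented only for illustration. It groups units of work into successive "waves" that could run in parallel once their dependencies are satisfied, and reports how wide each wave is.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical micro-ETL flow (names invented for this example).
# Each key is a unit of work; each value is the set of units
# that must finish before it can start.
flow = {
    "extract_a": set(),
    "extract_b": set(),
    "extract_c": set(),
    "clean_a": {"extract_a"},
    "clean_b": {"extract_b"},
    "clean_c": {"extract_c"},
    "correlate": {"clean_a", "clean_b", "clean_c"},
    "report": {"correlate"},
}


def execution_waves(dependencies):
    """Group units of work into waves whose members can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # units with all dependencies done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


def shape(dependencies):
    """Return the width of each wave: a crude numeric 'shape' of the flow."""
    return [len(wave) for wave in execution_waves(dependencies)]


if __name__ == "__main__":
    # Draw one row per wave; the row length is the degree of parallelism.
    for wave in execution_waves(flow):
        print("#" * len(wave), ", ".join(wave))
```

For this toy flow the shape is `[3, 3, 1, 1]`: three parallel streams that narrow into a single correlation step and a final report. A real flow with thousands of units would of course need a proper graphical view, but the principle of reading the computation from its layout is the same.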
Big Data projects have shown that they can extract insight from data through powerful operators and clever data transformations, but the result often needs data cleaning and preparation, and looks experimental unless significant polishing effort is applied. In fact, multiple analysts have recognized the need for Big Data to become more automated and repeatable in order to serve as a key input into decision making, especially when the decisions are disruptive to mainstream practice.
That is where the origin of the data sources, the sequence of the processing steps and the conditions that link the local "islands of processing" become important: they stabilise the global calculation map, build a common understanding of how insight is constructed and lead to agreement about the right way to proceed. This interesting article warns that the sheer number of applications and systems involved in information management is the first obstacle to address, and that it can easily be worsened by the use of powerful Big Data systems.
So the shape of your computation does provide a visual cue for the next challenge, making fruitful use of Big Data. It is likely that by using such graphical representations we collectively build better pattern recognition and discussion capabilities, like "Oh yes, your streams are too thin, you might have forgotten some data correlation."
Someone might think a rocket scientist is needed to display that "computation shape," but Workload Automation solutions provide such images automatically, among other benefits, once business processes are described in them. Because business processes are described once and executed repeatedly, they are triggered automatically, with a lot of fringe benefits including:
In short, Workload Automation provides governance over even the most complex systems, and a set of tools designed to take the conversation to the next level -- above daily operations and experimental setups -- whether the system handles Big Data, SAP jobs or a robotic tape arm. Providing Visibility, Control and Automation over numerous business flows is called Unattended Automation, and the new pressures created by Cloud and Big Data have drawn a great deal of attention to it.