A second big data pattern for adapting real time analytics
Chris Nott
I described my first pattern for adapting real time analytics in my previous post. This second pattern gives a high-level view of another way in which adaptive real time analytics has been deployed to realise value. It shows how a business can retrieve the information it already holds about an area of interest and then continuously update that information for users as things change in the business.
I shall assume that IBM's InfoSphere Streams is used to implement real time analytics on data in motion and that InfoSphere BigInsights – IBM's commercial offering with Hadoop – holds data at rest.
The outline steps of the use case are as follows:
In the following architecture diagram, a data scientist first builds the analytics model using data at rest and deploys that model in Streams to filter information for chosen areas of interest.
As in the first pattern, the implementation of this use case assumes that data reaches BigInsights via Streams. The data that Streams filters for storage in BigInsights must therefore cover every area of interest that an analyst might later select. This keeps the data aligned across both technologies.
Similarly, control over the introduction of a new area of interest was implemented in Streams.
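To make the flow concrete, here is a minimal sketch in plain Python. It does not use the Streams or BigInsights APIs: the class `AreaOfInterestPipeline`, its methods and the record fields `area` and `value` are all hypothetical names chosen for illustration, and simple in-memory collections stand in for the data at rest and the query optimised data store. It also simplifies by storing every record at rest, whereas the pattern only requires that the stored data cover the possible areas of interest.

```python
from collections import defaultdict


class AreaOfInterestPipeline:
    """Toy, in-memory stand-in for the pattern (all names are hypothetical)."""

    def __init__(self):
        self.at_rest = []                     # stand-in for data held at rest
        self.active_areas = set()             # areas currently being filtered for
        self.query_store = defaultdict(list)  # stand-in for the analyst-facing store

    def add_area(self, area):
        """Control event: introduce a new area of interest."""
        self.active_areas.add(area)
        # Seed the analyst's view with information that is already held.
        self.query_store[area] = [r for r in self.at_rest if r["area"] == area]

    def on_record(self, record):
        """Handle one record arriving on the stream."""
        # Simplification: store everything so any later area can be backfilled.
        self.at_rest.append(record)
        # Forward only records that match an active area of interest.
        if record["area"] in self.active_areas:
            self.query_store[record["area"]].append(record)


pipeline = AreaOfInterestPipeline()
pipeline.on_record({"area": "north", "value": 1})
pipeline.on_record({"area": "south", "value": 2})
pipeline.add_area("north")                         # backfills the first record
pipeline.on_record({"area": "north", "value": 3})  # forwarded as a live update
pipeline.on_record({"area": "south", "value": 4})  # stored at rest, not forwarded
north_values = [r["value"] for r in pipeline.query_store["north"]]
print(north_values)  # [1, 3]
```

The analyst's view of "north" ends up holding both the record that existed before the area was selected and the one that arrived afterwards, which is the combination this pattern is designed to deliver.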
One limitation of the design is that the analyst application must itself retrieve (pull) new information on the area of interest from the query optimised data store. It would be better for new information to be pushed directly to analysts as it arrives.
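The difference between the two delivery styles can be illustrated with a small sketch, again in plain Python rather than any IBM API. `QueryStore`, `read_since` and `subscribe` are hypothetical names, and the push variant is only one possible way the limitation might be addressed, not something the implementation described here provided.

```python
class QueryStore:
    """Hypothetical in-memory stand-in for the query optimised data store."""

    def __init__(self):
        self.records = []
        self.subscribers = []  # callbacks, used only by the push variant

    def write(self, record):
        self.records.append(record)
        # Push: notify every subscriber as soon as a record is written.
        for notify in self.subscribers:
            notify(record)

    def read_since(self, offset):
        # Pull: the caller asks for everything after the offset it has seen.
        return self.records[offset:]

    def subscribe(self, callback):
        self.subscribers.append(callback)


store = QueryStore()

# Pull style: the analyst application must ask for new records itself.
seen = 0
store.write({"area": "north", "value": 1})
pulled = store.read_since(seen)
seen += len(pulled)

# Push style: the analyst application registers once and is then notified.
pushed = []
store.subscribe(pushed.append)
store.write({"area": "north", "value": 2})

print(len(pulled), len(pushed))  # 1 1
```

With the pull style, how quickly an analyst sees new information depends on how often the application asks for it. With the push style, the update is delivered as part of the write.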
The main benefit of this pattern is that a business can form a richer view of areas of interest by combining information it already holds and information on what happens from now on.