January 9, 2017 | Written by: Tony Giordano
Categorized: Customer Analytics | Research
2016 was the year the next wave of adopters finally moved forward with their Big Data programs. As results come in from early adopters and lessons are learned, organizations are moving Big Data and digital analytics efforts into the mainstream. We see this change accelerating in 2017, along with several other trends:
Big Data is more than Hadoop
As organizations begin to use Big Data platforms, they will expand their uses beyond simple analytics. Digital use cases integrate operational and analytic processing, creating a large gray area where digital overlaps with both.
As these digital use cases extend traditional analytic use cases for Big Data, technologies such as Spark, Cassandra, and others will become more mainstream. For example, for applications that require a higher degree of fault tolerance, we will see Cassandra become a more common platform in Big Data environments. Cassandra, MongoDB, and others will also move from “talked about” technologies to implemented technologies in the Big Data space.
Renewed interest in Information Governance for Big Data
The irony of Information Governance in the Big Data space is that one of the reasons Big Data was so attractive was that there was no need to model, structure, or cleanse data: just load it and use it. While the use of raw data is still one of the major capabilities of Big Data for disciplines such as Data Science, there are many use cases that require data to be cleansed, conformed, and structured. For example, it is next to impossible to provide a 360 view from a raw data layer where data from multiple internal and external systems sits in its native formats. Once you start structuring data into a common technical format, the inevitable question arises: “so then, what is the business definition?” Determining the right technical and business definitions takes us right back to classic Information Governance 101 processes.
The reality is that while Information Technology took too long to develop data structures (data warehouses, data marts) in the relational world, there is a need both for quick and easy access to raw data and for structured data, even in Hadoop. Expect this trend of increasing requirements for information governance and classic data management to continue in 2017. What is interesting is that many data management (e.g., data modeling) and information governance technologies are only now addressing how to create structures in HDFS, Hive, Spark, and others. Apache Atlas is becoming the de facto metadata repository for Big Data, with commercial packages such as IBM’s Information Governance Catalog providing interfaces into Atlas.
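The structuring step described above can be sketched in a few lines. This is a toy illustration with hypothetical source systems and field names (none come from the article): raw records arrive in their native formats, a per-source mapping conforms them to one agreed technical schema, and only then can records be merged into a 360 view.

```python
# Hypothetical canonical schema agreed through a governance process.
CANONICAL_FIELDS = ["customer_id", "full_name", "email"]

# Per-source mappings from native field names to the canonical names
# (illustrative source systems; real mappings come from governance work).
FIELD_MAPPINGS = {
    "crm":     {"cust_no": "customer_id", "name": "full_name", "mail": "email"},
    "billing": {"acct_id": "customer_id", "holder": "full_name", "contact_email": "email"},
}

def conform(record, source):
    """Rename a raw record's native fields to the canonical schema."""
    mapping = FIELD_MAPPINGS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

def customer_360(raw_records):
    """Merge conformed records from all sources, keyed by customer_id."""
    view = {}
    for source, record in raw_records:
        conformed = conform(record, source)
        view.setdefault(conformed["customer_id"], {}).update(conformed)
    return view

view = customer_360([
    ("crm",     {"cust_no": "42", "name": "Ada Lovelace"}),
    ("billing", {"acct_id": "42", "contact_email": "ada@example.com"}),
])
print(view["42"])
# {'customer_id': '42', 'full_name': 'Ada Lovelace', 'email': 'ada@example.com'}
```

The point is that the interesting work is not the code but the mappings: deciding that `cust_no` and `acct_id` mean the same thing is exactly the business-definition question governance answers.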
Data Science, Cognitive, and traditional Analytic technologies are merging
With the API economy in full swing, we will see increasing cognitive and statistical capabilities built into both new and classic analytic platforms.
Analytic packages are increasingly using common statistical and cognitive libraries in new releases of their software and are creating “bundles” of functionality, so you get an integrated experience with one set of tools rather than having to juggle multiple unconnected technologies. The Watson Data Platform Data Science Experience is a great example of this convergence.
Edge Analytics is moving into the Mainstream
Edge Analytics, the process of performing analysis on streaming data near the source rather than in a central repository, has to date been an Internet of Things (IoT) practice. Edge Analytics removes latency from analytics and provides cognitive processes that act on parameters in response to events rather than waiting for human intervention. As digital processes become more mainstream in business functions, the need for analysis will move closer to the source; in 2017 we will need to design not only what analytics are needed, but where they are needed.
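A minimal sketch of the idea, with made-up sensor names and thresholds: instead of shipping every reading to a central repository and analyzing later, a local rule scores each reading as it arrives and emits an action immediately.

```python
def edge_filter(readings, threshold=80.0):
    """Evaluate each streaming reading at the edge; yield an action the
    moment a reading breaches the threshold -- no central round trip,
    no human in the loop."""
    for sensor_id, value in readings:
        if value > threshold:
            yield ("throttle", sensor_id, value)

# Illustrative stream of (sensor, measurement) pairs.
stream = [("pump-1", 72.5), ("pump-2", 91.3), ("pump-1", 85.0)]
actions = list(edge_filter(stream))
print(actions)
# [('throttle', 'pump-2', 91.3), ('throttle', 'pump-1', 85.0)]
```

In a real deployment this rule would run on or near the device; the design question the article raises is precisely which analytics belong at that edge and which belong centrally.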
Real-Time Data Integration is finally happening (really this time!)
For over 20 years, the clarion cry of “we need real-time!” has echoed in the marketing materials of software vendors. Now, however, there is a business case: digital is making the need to understand events (see Edge Analytics above) in real time a necessity.
“Read Once, Write Many, Part Two”
For the past ten years, mature data integration architectures have leveraged the concept of “read once, write many” in order to simplify and avoid point-to-point ETL processes, which are expensive and cause data definition and quality issues. Now, with digital, the need to not only read, but decide (cognitive) and act (publish), is a requirement. This is not really a new concept: digital advertising has for several years picked up an “event” from a web page via a user’s cookie and determined, based on the known purchases and buying patterns of that prospect, what ad to send them. This type of sub-second response requires real-time data integration. As the Millennial Generation expects more and more digital interaction with its service providers, from retailers to bankers, expect to see the need not only for real-time data integration, but for more prescriptive event publishing in Big Data.
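The read-decide-act pattern above can be sketched with a toy in-memory event bus (hypothetical topics and ad rules, not any particular vendor’s API): an event is read once, a decision function picks a response, and the result is published to every subscriber rather than re-read by point-to-point jobs.

```python
from collections import defaultdict

class EventBus:
    """Toy publish/subscribe bus: one write fans out to many consumers."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

def decide_ad(event):
    """Toy 'decide' step: pick an ad from the visitor's known interests."""
    ad = "running-shoes" if "sports" in event["history"] else "generic"
    return {"cookie": event["cookie"], "ad": ad}

bus = EventBus()
served, logged = [], []
bus.subscribe("ad-decisions", served.append)   # ad server consumes the decision
bus.subscribe("ad-decisions", logged.append)   # analytics store consumes the same write

# Read the page-view event once, decide, then write many.
page_view = {"cookie": "abc123", "history": ["sports", "outdoors"]}
bus.publish("ad-decisions", decide_ad(page_view))
print(served[0]["ad"])
# running-shoes
```

A production version would sit on a streaming platform rather than in-process lists, but the shape is the same: the expensive part to avoid is each consumer re-reading and re-interpreting the raw event on its own.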
2017 will be a year that continues to push talked-about Big Data theories into practice in the field. This will create new opportunities for our clients and practitioners to leverage and extend the new skills we have been acquiring.