The business objectives of an IT or network operations team have not changed substantially in years, or even decades. Measures like mean time to repair (MTTR) or budget utilization can frequently be reduced to time, money and quality of service. Fundamentally, IT and network teams must maximize the availability of high-quality services while minimizing the cost of doing so.
The demand for services supported by larger, more sophisticated infrastructure has increased steadily, even as the objectives have stayed the same. Disciplines such as fault and event management have emerged and matured, leading to a set of key capabilities that are table stakes for a credible solution. More complex infrastructure requires a solution that can:
- Consume events from highly heterogeneous environments
- Minimize the amount of noise that is presented to the people or processes tasked with responding to events
- Integrate with other operations support systems, folding applicable context into the process of event management and resolution
- Help pinpoint the probable causes of events
- Scale and grow as the business and attendant infrastructure grows
- Help automate responses to events
- Drive efficiency improvements in operations
The new operations management playing field
I talked about what has stayed the same for IT and network teams. So what’s changed? Pretty much everything else.
Businesses are increasingly driven by the demand for continuous delivery of cloud-scale applications and service capabilities. Companies are employing key enabling technologies and architectural patterns, including virtualization, containerization and microservices. And they’re relying on newer methodologies and practices, such as agile software development and DevOps. Many new services and applications sit atop, and leverage, backend systems that have been developed and updated over many years. Some IBM clients are enabling their users with new, rapidly evolving systems of engagement—like mobile—by taking advantage of hybrid cloud.
Two things have driven the emergence of highly instrumented, heavily monitored environments: a renewed focus on the user’s experience of a business service or application, and extremely high expectations for availability. Faults and events are reported from the bottom of the technology stack to the top in traditional, cloud and hybrid environments.
While large portions of the industries we serve have begun to standardize on mechanisms for communicating management data—such as RESTful HTTP interfaces—the payload formats remain heterogeneous and relate to additional layers of infrastructure with complex patterns of dependency.
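In practice, heterogeneous payloads are handled by normalizing each vendor's format onto a common event schema at ingestion time. The sketch below illustrates the pattern with two invented payload shapes (a network-element-style alarm and an Alertmanager-like webhook body); the field names and severity mappings are assumptions for illustration, not any product's actual schema.

```python
def normalize(payload: dict) -> dict:
    """Map assorted incoming payload shapes onto one common event schema:
    {"source": str, "summary": str, "severity": int}."""
    if "alarmText" in payload:
        # Hypothetical network-element alarm format.
        return {
            "source": payload.get("nodeId", "unknown"),
            "summary": payload["alarmText"],
            "severity": int(payload.get("perceivedSeverity", 3)),
        }
    if "annotations" in payload:
        # Alertmanager-like webhook shape (labels + annotations).
        labels = payload.get("labels", {})
        return {
            "source": labels.get("instance", "unknown"),
            "summary": payload["annotations"].get("summary", ""),
            "severity": {"critical": 5, "warning": 3}.get(
                labels.get("severity"), 2),
        }
    # Fallback: pass through the fields we recognize.
    return {
        "source": payload.get("source", "unknown"),
        "summary": payload.get("summary", ""),
        "severity": int(payload.get("severity", 2)),
    }
```

Once every producer's payload is reduced to the same schema, downstream capabilities such as deduplication, enrichment and correlation can be written once rather than per source.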
In summary, apps and services are becoming more complex, dynamic, business-critical and talkative. And the companies that build them have much higher expectations for availability and time to market.
So, how will the DevOps managers and developers tasked with managing these environments succeed? IT and network operations may be approaching a point where success cannot be achieved with human cognition alone.
In the next blog in this series, I’ll talk about how event analytics in Netcool Operations Insight helps with the challenges that operations teams face.
To learn more, register for our webinar on the value predictive insight brings to IT operations. Check out the earlier posts in our IBM Operations Analytics series. And stay tuned for additional key learnings from our colleagues in coming weeks.