Integrating Service Level Baselining, Performance Reporting, and Monitoring into your Software Development Methodology
This is a multi-part blog series that examines how to leverage application and user experience monitoring when developing applications, especially customer-facing applications, to achieve world-class service levels. It will examine integration with different methodologies, using various infrastructure deployment approaches. The series is not intended to be comprehensive, but is a reflection on my personal experiences and time spent with hundreds of ECM customers since starting in the ECM industry in 1996.
Traditionally, monitoring has been considered a “Run the Engine” (RTE) type of activity, much like the dashboard lights and gauges on your automobile: deploy the application, then start monitoring to make sure it is running. In reality, monitoring must be integrated early in the development process to provide data, gather user feedback, and prepare for deployment and RTE activities. Integrated early, monitoring improves the development process and leaves the team better prepared for production. Done correctly, application monitoring contributes to the ‘DevOps’ shift occurring within IT organizations.
When developing a new business application or product, there are many items to consider, including:
- Does the application meet the business requirements?
- Is the end product relatively defect-free?
- Does it integrate cleanly into the existing environment?
- Does it follow established coding practices?
- Is delivery going to meet the designated timeline?
- And of course, can it be delivered within budget?
Timelines can be a real problem. The task is to take a set of business requirements and develop a product that meets them in time for a business, regulatory, or other deadline. In my experience, because requirements must be met and are the most visible and tangible deliverables, it’s the operational items, performance testing, and comprehensive, in-depth understanding of the application that suffer the most.
In many cases, as developers strive to address the previous concerns, other important topics get pushed out until after deployment or end up being dropped altogether:
- What is the true performance of the application?
- What are the potential bottlenecks and how are issues identified?
- In the production environment, what is the user experience?
- Does the application meet the user’s performance expectations?
- What are the baseline performance and service levels?
- Is the application “instrumented” to provide good metrics?
- Can the application support team properly support this application?
- Is the application support team trained and prepared to support this application?
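To make the “instrumented” question above concrete, here is a minimal sketch of one way an application could record its own timing metrics. It is an illustration only: the `TIMINGS` store, the `instrumented` decorator, and the `load_document` function are hypothetical names, and a real application would more likely send these measurements to its monitoring tool than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: operation name -> list of durations in seconds.
TIMINGS = defaultdict(list)


def instrumented(name):
    """Record how long each call to the wrapped function takes."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the call raises.
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@instrumented("load_document")  # hypothetical operation name
def load_document(doc_id):
    time.sleep(0.01)  # stand-in for real work
    return {"id": doc_id}
```

Building this kind of timing in during development means the same numbers are available in test and in production, rather than being bolted on after deployment.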
As a software developer, you have a couple of types of customers that your application answers to.
- The first is the traditional end-user, the employee who uses your system to complete their day-to-day activities and/or the general public who uses the system from outside the organization.
- The second is the business manager/owner who requested the application be developed. They need to make sure the end-user customers are happy with the application and that it is providing real value to the business. The business owner customer needs to understand how the application is operating/performing – how it is “getting the job done”. Having a happy and well-informed business owner customer is very important, because they most likely have just financed your project and they (or their peers) will be paying for your next project.
As a starting point for some of the topics that will be discussed in future entries, it’s important to outline some terms. There may be more exhaustive definitions or slightly different definitions for these terms, but I’m using the terms as described below. I’ve introduced a couple already.
RTE – “Run the Engine”. After the application has been deployed and put into production, RTE is the effort and adjustments to keep the application performing its designated task(s).
SDM – Software Development Methodology. The plan, processes, and controls that an application development group uses to deliver an application that meets specified business requirements. It can also specify a linear or iterative approach to development.
SDLC – Software Development Lifecycle. Closely related to the SDM, this outlines the processes, phases, and deliverables needed. The SDLC encompasses much more than the development phase.
Waterfall Development – A linear and sequential development approach. Traditionally “big project” type development with long timelines.
RAD – Rapid Application Development. An iterative development approach. Agile and Scrum are popular iterative approaches.
Baselining – The process of measuring, analyzing, and documenting performance at a given point in time. These metrics are used as a reference to compare and relate to future metrics. A “snapshot” of system performance.
Performance Reporting – The process of gathering, storing, consolidating, and distributing operational metrics for an application or process. This applies not only to “physical” metrics (CPU, memory, I/O), but also process metrics (time from ingestion to completion).
SLA – Service Level Agreement. An agreement between a service provider and the consumer of that service. Typically outlines items such as: system availability, response time, processing volumes, and other metrics.
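As a rough illustration of baselining, the sketch below summarizes a set of response-time samples into a “snapshot” and compares later measurements against it. The function names, the choice of statistics, and the 20% tolerance are assumptions made for the example, not part of any particular monitoring product.

```python
import statistics


def build_baseline(samples_ms):
    """Summarize response-time samples (milliseconds) into a baseline snapshot.

    Raises statistics.StatisticsError if samples_ms is empty.
    """
    ordered = sorted(samples_ms)
    # Nearest-rank 95th percentile: ceiling of 0.95 * n, in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "mean_ms": statistics.mean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
    }


def within_baseline(baseline, current_ms, tolerance=0.20):
    """True when the current median is no more than `tolerance` above the baseline median."""
    return statistics.median(current_ms) <= baseline["median_ms"] * (1 + tolerance)
```

A baseline taken before a release can then be compared with samples gathered afterwards, for example `within_baseline(baseline, new_samples)`, to show whether performance has drifted.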
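To show how SLA items such as availability and response time can be checked against measurements, here is a small sketch. The `ServiceLevelTargets` class, its field names, and the use of a 95th-percentile response time are hypothetical choices for illustration; the real targets and how they are measured come from the agreement itself.

```python
from dataclasses import dataclass


@dataclass
class ServiceLevelTargets:
    # Hypothetical targets; real values come from the negotiated agreement.
    min_availability_pct: float
    max_response_ms: float


def availability_pct(total_minutes, downtime_minutes):
    """Percentage of the reporting period during which the system was available."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def sla_breaches(targets, total_minutes, downtime_minutes, p95_response_ms):
    """Return a list of breach descriptions; an empty list means the targets were met."""
    breaches = []
    measured = availability_pct(total_minutes, downtime_minutes)
    if measured < targets.min_availability_pct:
        breaches.append(
            f"availability {measured:.2f}% below {targets.min_availability_pct}%"
        )
    if p95_response_ms > targets.max_response_ms:
        breaches.append(
            f"response time {p95_response_ms} ms above {targets.max_response_ms} ms"
        )
    return breaches
```

For a 30-day month (43,200 minutes), calling `sla_breaches(targets, 43200, downtime, p95)` gives a simple pass/fail summary that can be shared with the business owner.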
System Monitoring – “Ping, power, and pipes” monitoring. Provides information that the “hardware” and operating system are operational. Often provides some system performance information like CPU, storage, and memory usage.
AM – Application Monitoring. Monitoring at the “application” level. Provides information on how the application is performing, processing information, any errors or potential issues. End-to-end status of data flow is possible, with metrics and reporting throughout the process. Extends system monitoring to a more granular level on items related to the application.
ASLM – Application Service Level Monitoring. Externally and objectively looking at system AND application performance. Alerting, reporting, and automatically responding to the metrics gathered. Through analysis of metrics gathered over time, a better understanding of application operation is achieved. Using alerting and automated response, a more stable system and process that meets agreed-upon SLAs is provided to the customer.
EPM – Experience and Performance Monitoring. Monitoring actual user experience while using an application (not synthetic transaction monitoring). Helps support staff bridge the gap between how the application is running and what the business user is experiencing.
The next blog entry will examine integration of these monitoring topics into a “traditional” SDLC and with Waterfall methodologies. Future topics will include: integration with RAD methodologies, working with infrastructure, communicating the appropriate information to keep the customer happy, and monitoring technology guidelines.