Integrating Service Level Baselining, Performance Reporting, and Application Monitoring into your Software Development Methodology
This is the second installment of a multi-part blog series that examines how to leverage application and user experience monitoring when developing applications, especially customer facing applications. It examines integration with different methodologies and varied infrastructure deployment. The series is not intended to be comprehensive, but is a reflection on my personal experiences and time spent with hundreds of ECM customers since starting with the ECM industry in 1996.
Integration of application and user experience monitoring into your SDLC with a traditional Waterfall methodology can quickly pay dividends. As a software developer, timelines are tight and the demand for low defect code is high. A developer can’t take a casual attitude towards the early stages of a project, because little issues at the beginning can become “show stoppers” later on.
Some terms and acronyms used in this installment are outlined in the first installment.
The Software Development Life Cycle typically contains several phases: Requirements, Design, Development/Implementation, Testing, and Operation/Maintenance phases.
During the Requirements phase, the functional requirements are specified, outlining what the application is “supposed to do”. An important part of the Requirements phase is the non-functional requirements – requirements that outline how the system is supposed to “operate”. These can include: SLAs, HA/DR operation, auditability, performance, usability, capacity, supportability, and response times.
The Design phase creates the system architecture that meets the requirements outlined in the previous phase. The design should not ignore “Run the Engine” (RTE) components and must take into consideration the non-functional requirements. Design should eliminate any “magic happens here” black boxes, especially those driven by vague business requirements or last minute management cursory reviews.
The Development/Implementation phase is where the application development team “cranks out the code”. In my experience, development teams produce a quality product that meets the outlined functional requirements. Issues with the overall application are typically found with the interaction between other applications or systems. Often the non-functional requirements aren’t uniform between applications or systems leading to problems during integration. Also business requirements can have issues of “That’s what we specified, but not what we meant”.
The Testing phase is where an independent testing team compares the application developed against the requirements (functional AND non-functional) specified during the Requirements phase. Integration testing, where the new application interacts with existing or newly built applications, can lead to considerable remediation efforts. Testing should not be underestimated; it is the application’s first contact with real world users and issues.
The Operation/Maintenance phase begins when the development team completes development on the release and hands off the application to the operations/support staff. The application is made available to the target internal or external customers and starts to perform the work that the requirements outlined. The support teams need to keep the “engine running” and provide feedback to management and the development teams about what is working well and where remediation may be needed.
The Waterfall methodology is a sequential process that closely follows the SDLC outlined above. Each step in the SDLC “flows” to the next, with some overlap between steps. Requirements lead to a design, which leads to development, then onto testing, production, and finally maintenance.
The Requirements phase should produce a number of non-functional requirements. The Business User community imposes some of these: response times, SLAs, system availability. Other non-functional requirements come from operations staff and management: HA/DR, capacity, auditability. Business users typically tend to gloss over many non-functional but critical “plumbing” requirements, as they are focused on what the application is “supposed to do”. However, during application testing and deployment, the lack of defined non-functional requirements can become a large point of contention with the development team.
The Design phase needs to include detailed design elements on how to meet non-functional requirements. For example, is logging/reporting to meet audit requirements going to be written directly into the application or accomplished externally? Application Monitoring (AM) and Experience and Performance Monitoring (EPM) should be integrated into the design, allowing full use of tools and reporting during subsequent phases. The application development teams needs the design to be able to demonstrate that they are meeting the functional and non-functional requirements.
During the Development/Implementation phase monitoring starts taking a more pronounced role. Leveraging application, system, and experience monitoring in the development phase provides the development team insight to meeting the non-functional requirements. How is the system performing, what are the user response times, are there any errors being generated? AM can provide: early information on performance metrics, queue depths, component status and component interaction from application operating behavior versus finding design issues in near production rollout. Automated monitoring, reporting, and correction can reduce the waste of valuable development time troubleshooting basic operational issues during development. Including AM early in development helps provide a more complete and supportable product to support/operations, freeing the development team to work on new projects.
Both AM and EPM pay huge dividends during the Testing phase. The testing team can certainly follow test scripts to validate functional requirement use cases, but how do they objectively measure non-functional requirements? The testing team can’t submit a report to management stating that “we think the system responds fast enough”. EPM allows the team to provide exact response times for users during testing and over a variety of situations (load, location, error conditions). Properly implemented monitoring can provide objective reporting on capacity, storage use, response times, errors generated (and remediated) and a host of other “non-functional” items. Properly designed monitoring also allows troubleshooting of issues encountered during integration testing. Providing information about data flowing in and out of an application and sending alerts if actual operation is different than expectations.
Baselining a system during the final user acceptance test is critical; a baseline gives management and support an overall “picture” of the application. Once the application goes into production, comparison to the baseline will help identify bottlenecks, volume related performance issues, capacity, and growth. As application or environment fixes and enhancements are put in place after “go live”; comparison to the baseline measurements provides verification that the changes remediated the issue, improved performance/operation, or most importantly “did no harm”. Service Level Baselining allows management to have a measurement of user interactions in an ideal situation and again offers a comparison point when user load or environmental issues occur.
As the project follows the Waterfall methodology and the application is placed into production, AM allows the support and management teams to know that the application is “really working”. Typically, the support team has a whole host of applications they are responsible for and can’t have the level of understanding and involvement with the application that previous teams (architects, developers, testers) do. Properly designed and configured monitoring allows the support staff to have in-depth and immediate visibility into an application. By using Application Service Level Monitoring the system can be administered by less experienced administrators, freeing up senior resources for critical issues elsewhere. In the case where the application is running out of JVM memory or storage, the support team can be alerted before it becomes an issue and a “fire drill” eliminated. If during production users report “slow” performance, EPM objectively reports response times and SLAs as accurate input for constructive business operation reviews. It’s important the Application owners have full visibility of the application operating “stack” when an issue is occurring.
Maintaining the application takes a couple tracks. First, the application development team many need an effort to remediate any application defects found upon contact with the users (never under estimate the ability of the user base to quickly exercise defects!). These defects also include non-functional issues. Often how the Business Analysts think a user will use the system is very different from how a user actually uses the system. Performance issues and bottlenecks may be identified with Application Monitoring and the gathered metrics and comparisons will help the application development team track these down.
Secondly maintaining the operating environment, which usually falls to the application support team, needs to be considered. After common support tasks occur, such as network changes, additional storage/CPU/memory, are complete is there an automated set of monitors that lets management and support know that the system is back in a fully operational state? Have the changes impacted the business users, either positively or negatively? The network changes may have been made for performance reasons, but do the users see a 1 second improvement or only a 1ms improvement? Getting ahead of potential issues helps maintain the system. Giving the storage team a couple extra weeks to purchase, provision, and configure storage make a huge difference for deployment and harmony. Having properly configured monitoring can help achieve this harmony between groups.
Planning for and integrating Application and User Experience Monitoring early in the SDLC, and with each phase, provides immediate benefits and helps to produce a better, more complete, and sustainable application. Application and Service Level Monitoring shouldn’t be an afterthought, only for the application support team to worry about. Every phase of a Waterfall based project can make use some aspect of monitoring, ultimately making your life as an application developer, support engineer, project manager, or application manager better when working with a new application. Implementing AM / EPM during throughout the SDLC sends positive signals to the business participants in the project– that you understand that service levels and end user response are key project success criteria.
The next blog entry will talk about integration of monitoring with Rapid Application Development methodologies. In the world of short timeframes and high expectations, it’s imperative to know what’s really going on.