IBM Support

How to think about performance

Technical Blog Post



The US Healthcare.gov website issues reminded me that people don’t always know what “performance” means.  When we talk about performance, we may think it means “slow” or, if we think about our car, we may think it means “it works or it doesn’t work”.  Already we have two different definitions of performance and we haven’t really begun to look deeply into it.  Having 17 years of experience in the performance business, I know that performance means the end user experience or perception.  It doesn’t really matter why something is not what we want it to be, it only matters to us that it is not what we want it to be.

From an end user’s perspective, if they search for something and it takes too long, that’s a performance problem.  If they select a new screen and it takes too long to paint, that is a performance problem too.  In complex, multi-tier applications, try to imagine the number of things that can go wrong and leave an end user without the experience they want.

I’ll use a typical Maximo implementation as an example, but the concepts can be applied across the board.  The thing I hope you will take away is that there is nothing that can’t be fixed with regard to performance, but you first need to identify what someone is talking about when they report bad performance.  Maximo is a multi-tier Java application.  It uses the network to communicate between the various tiers and with clients.  It runs on hardware (computers) with various processor, memory, and operating modes.  It uses a database to store metadata and application transactions.  It uses an application server to publish the application to the network.  It uses a connection to a directory server to store users.  It integrates with external applications to store and retrieve data used by those applications.  It has application settings that can be optimized based on how the product will be used.  It has client computers that connect to it using web browsers.  In this oversimplified list, there are 8 major areas where things can go wrong.

I’ll list a few below, but it can never be an exhaustive list because there is always something else in the mix, and if the result is a bad experience, the end user’s perception will be the same.

Network – It won’t matter if you have the fastest, best tuned hardware and an application that runs at light speed; if your end users connect to it over a slow network, their experience will be, well, “poor performance”.  Start by making sure you have the fastest network connections available.  And by the way, the connections between the various tiers should not be overlooked.  If the application queries the DB for information and the network connection between the application and the DB is slow, the users will get a slow response.
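
To see how a slow link between tiers adds up, here is a minimal Python sketch.  The function name, the query count, and the timings are made-up illustration values, not measurements from Maximo, and the sketch assumes the queries run one after another.

```python
def screen_response_ms(queries, db_exec_ms, round_trip_ms):
    """Estimate the time one screen spends waiting on the database.

    Assumes the queries run sequentially, so every query pays one
    full network round trip in addition to its execution time.
    """
    return queries * (db_exec_ms + round_trip_ms)


# Hypothetical screen: 40 queries, each taking 2 ms inside the DB.
fast_link = screen_response_ms(queries=40, db_exec_ms=2, round_trip_ms=0.5)
slow_link = screen_response_ms(queries=40, db_exec_ms=2, round_trip_ms=25)

print(f"App and DB on a fast link: {fast_link:.0f} ms")
print(f"App and DB on a slow link: {slow_link:.0f} ms")
```

The database does the same work in both cases.  Only the link between the application and the DB changes, yet under these assumed numbers the wait grows from about a tenth of a second to more than a second.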

Hardware – This is where the basic functions of the application are executed.  Underpowered processors, slow storage devices or connections to storage devices, and too little memory will all have an impact on the performance of the application.  Some of these can have a domino effect.  Consider too little memory, which may result in memory being swapped to storage devices.  The point of memory (RAM) is that we don’t have to keep working data on storage devices, whose performance cannot compare with it.  Then there is Java garbage collection (GC).  GC is one of the most processor-intensive functions in Java; it watches memory availability and runs to clear memory either on a timed or an as-needed basis.  The more memory is used, the more often GC will run.  A GC cycle will often take 60% or more of the processor but typically completes in under a second, so an end user rarely notices it.  But as memory runs low and GC runs on a near-constant basis, processor utilization goes to 100% and users get no response to their requests because everything is queued waiting for GC to complete.  Because there is not enough memory, GC cannot complete, and this vicious circle often ends with an OutOfMemoryError.  All this because we tried to run 50 users in a Java virtual machine (JVM) that had 2GB of memory.  After all, it worked fine in test with 2 users.

 

Virtualization – Today, with public and private clouds being the handwriting on the wall for the future of computing, virtualization is a key part of maximizing our resources and responding to demand, but virtualization can rob up to 22% of the physical server’s performance depending on the vendor and version.  Taking the minimum hardware specifications from the vendor and then virtualizing that hardware could mean your net capability is far below the vendor’s recommendations.  Using swapped memory for virtual machines (VMs) is a bad idea in the same way that it is bad for applications on physical machines.  Of the entire cost of an application, memory might just be the worst place to cut corners.
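
The arithmetic behind that warning can be sketched in a few lines of Python.  The 22% figure is the worst case mentioned above; the 8-core vendor minimum and the function names are hypothetical, and treating the overhead as a simple linear reduction in capacity is a simplifying assumption.

```python
def net_capacity(physical_capacity, overhead):
    """Capacity left for the application after virtualization overhead."""
    return physical_capacity * (1 - overhead)


def physical_needed(required_capacity, overhead):
    """Physical capacity to provision so the VM still meets the requirement."""
    return required_capacity / (1 - overhead)


OVERHEAD = 0.22     # worst case quoted above; varies by vendor and version
VENDOR_MINIMUM = 8  # hypothetical vendor minimum, in processor cores

delivered = net_capacity(VENDOR_MINIMUM, OVERHEAD)
to_provision = physical_needed(VENDOR_MINIMUM, OVERHEAD)

print(f"Delivered after virtualizing the minimum: {delivered:.2f} cores")
print(f"Needed to really deliver the minimum: {to_provision:.2f} cores")
```

Under these assumptions, virtualizing exactly the vendor minimum leaves the application with noticeably less than the minimum, and meeting it takes roughly 28% more physical capacity.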

 

Operating Systems – Applications run on hardware, that is true, but to provide a consistent operating environment, all hardware has an operating system (OS) for applications to run in.  Today, the most common OSs are several flavors of Unix, Linux, and Windows.  Some database servers provide their own streamlined OSs to bypass the overhead required by a typical OS.  In the pyramid of reliance, if an application must run in an OS, the OS must run on hardware, and the hardware must reside on the network, then the logic is that the network must perform, the hardware must perform, and the OS must perform.  Each OS has its own tuning parameters, and it is important to the function of the application that the OS be properly tuned.  Since every hardware call will ultimately be passed through the OS, if the OS does not perform well, the application won’t either.  The database, the application server, and even the client’s browser all run on OSs, so don’t underestimate the impact of poorly tuned OSs.

 

Database – If everything the application does is driven by the database (DB), it makes sense that if the DB does not perform, the application will not perform.  Things can get a little convoluted here.  First, the DB itself must be properly tuned, and there are literally hundreds of tuning parameters and other factors that can impact the DB server’s performance.  Things like command compiling, memory allocated to various pools, temp space, and many more can result in the DB not saving or returning results in a reasonable time.  And remember, in environments where many thousands of requests may be executed every minute, sub-second response time is required.  But other aspects of DB performance include commands that are well designed for the environment and the application settings used to connect to the DB.  DBs execute commands written in Structured Query Language (SQL).  Like most things, there are good and bad ways of doing things.  Using wildcards to search for things might provide flexibility but may also preclude the use of indexes, which can have a negative impact on the return speed.  The balance is to use wildcards where they are needed but not to use them for everything.  There are many instances where using the wrong approach in a SQL statement can mean the difference between a 1-second and a 5-minute response.

 

Application Server – This is a term that is often confused.  The Application Server is the publishing product often referred to as middleware.  WebSphere, WebLogic, JBoss, and Apache are common application servers.  Their function varies, but they generally provide an environment that can publish multiple instances of an application service (JVM) for load balancing, high availability, and complementary capability.  They are typically also responsible for the relationship of the application to the user directory server, Java Message Service (JMS), and other common functions.  Tuning the application server, including ensuring that enough application instances exist to support the user load, can have an impact on the performance of the application.  Slow or failing user logins and poorly configured messaging for integrations can impact the user’s response time.

 

The Application – Each application will have settings that interact with other components of the product based on how the product will be used.  In the case of Maximo, different versions can use different database releases, and each of the database releases uses different JDBC parameters that may improve the way it responds.  Using Maximo configured for MS SQL Server 2000 with MS SQL Server 2005 may have a very negative impact on the response time.  This isn’t an application-related performance problem; it is a matter of proper tuning for the environment.  Another example is that Maximo is shipped for maximum flexibility and simplicity in searching.  End users need only enter “B456” to find all records with “A123B456C789”, “B456123”, “C789B456”, etc.  This is done by putting wildcards on the beginning and end of each search, like “%B456%”, but this means the indexes will not be used in searching for these values.  This usually works fine for small to medium-small databases with under 100 users, but as the organization’s requirements grow, this approach is no longer a good one.  Imagine loading 10 million rows and searching each for the value “B456”.  Databases are fast but not that fast.  In larger environments, wildcard searching should be disabled by default and the users taught how to use wildcards when they need them.
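
The effect of a leading wildcard on index use can be observed with a small, self-contained Python sketch.  It uses SQLite from the Python standard library purely as a stand-in; it is not one of the databases Maximo runs on, and the table name, index name, and extra row are hypothetical.  Other database engines show their query plans differently, but the principle is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE collation lets SQLite use the index for its case-insensitive LIKE.
conn.execute("CREATE TABLE asset (assetnum TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX asset_ndx1 ON asset (assetnum)")
conn.executemany(
    "INSERT INTO asset VALUES (?)",
    [("A123B456C789",), ("B456123",), ("C789B456",), ("D000",)],
)


def plan(sql):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


def find(sql):
    """Run a query and return the matching asset numbers, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# Wildcards on both ends, as in the default search behavior described above.
contains = "SELECT assetnum FROM asset WHERE assetnum LIKE '%B456%'"
# Trailing wildcard only: the user knows how the value starts.
prefix = "SELECT assetnum FROM asset WHERE assetnum LIKE 'B456%'"

for label, sql in (("contains", contains), ("prefix", prefix)):
    print(label, find(sql), "->", plan(sql))
```

On typical SQLite builds, the plan for the prefix search reports a SEARCH that uses the index, while the plan for the “%B456%” search reports a SCAN that has to visit every row.  With four rows the difference is invisible; with millions of rows it is the difference the paragraph above describes.  The trade-off is also visible in the results: the prefix search no longer finds values that merely contain “B456”.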

 

The Client – Oddly enough, this is the topic that encouraged me to write today’s blog.  Many of us want to focus on the infrastructure of an application.  Anytime something goes wrong in a hosted or cloud environment, we think there must be something we can do to make the server faster.  Don’t forget my original definition of performance: “performance means the end user experience or perception”.  In a recent critical situation, I was called in to help solve performance problems.  We looked at infrastructure six ways to Sunday before the critical question came to the table: had anyone spoken directly to the users about what their problems were?  After interviewing several silos of users, it became apparent that a number of the problems were really at their desktop.  These included not enough memory on the client; different browsers (various versions of IE, Firefox, and Chrome) reacting differently; users not knowing the best way to do things (training); poorly designed screens with too many fields; and too many clicks to accomplish tasks.  Remember again: “It doesn’t really matter why something is not what we want it to be, it only matters to us that it is not what we want it to be.”  And this means everything from end to end of this solution can be reported as a performance problem.

 

With all the news about healthcare.gov, I wanted to try it for myself.  What I found was not that the server response time was all that slow but that the screens were heavily laden with JavaScript: tons of functionality for dynamically building screens, fancy message boxes, and a reduced need to load new screens because results can be refreshed in the existing screen.  All this sounds wonderful from a technological perspective, but when you roll it out to an uncontrolled client base with underpowered client machines, too little memory, and all different versions of browsers supporting multiple levels of technology, you can expect the results to be inconsistent.  To add to all of that, think of the impact of language.  If the end user doesn’t know healthcare and insurance terms, using them to describe things just adds to the frustration.  In terms of Maximo, when a mechanic who has spent his whole life performing work based on a “Work Order” gets a new system that starts talking about “Service Requests” and “Incidents”, it is a sea change for him and only adds to his frustration.  I know there are still infrastructure things that need to be done to shore up the healthcare.gov system, but for me, the experience was poor because of what was happening right at my desktop.  I hope they don’t forget to look there.

When you are working on performance, don’t forget the definition of performance.  Spend some time with the people it’s impacting.

 

 
