Analyzing performance issues with IBM TXSeries for Multiplatforms

This article describes some IBM TXSeries for Multiplatforms performance issues along with some guidelines for investigating, diagnosing, and solving them.

Li Hui (lihhli@cn.ibm.com), Staff Software Engineer and Server Support Specialist, IBM TXSeres for Multiplatforms Support team, IBM China

Photo of Hui LiHui Li is a Staff Software Engineer and Server Support Specialist on the IBM TXSeries for Multiplatforms Support team for IBM China in Beijing. You can contact Hui at lihhli@cn.ibm.com.



Jin Shang, Staff Software Engineer and Technical Support Specialist, IBM BPM Support team, IBM

Photo of Jin Shang Jin Shang is a Technical Support Specialist on the WebSphere Adapters and WebSphere ESB Technical Support teams at the IBM China Software Development Lab. He has many years of experience with WebSphere business integration products. You can contact Jin at shangjin@cn.ibm.com.



Yu Z Zeng (zengzyu@cn.ibm.com), Staff Software Engineer and Technical Support Specialist, IBM TXSeres for Multiplatforms Support team, IBM

Photo of Yu Z ZengYu Z Zeng is a Staff Software Engineer and Technical Support Specialist on the IBM TXSeries for Multiplatforms Support team for IBM China in Beijing. You can contact Yu at zengzyu@cn.ibm.com.



23 January 2013

Introduction

IBM® TXSeries® for Multiplatforms is general-purpose online transaction-processing software that handles security, data integrity, and resource scheduling. The TXSeries for Multiplatforms architecture of integrated software components can be used to create a Customer Information Control System (CICS®) environment.

In CICS terms, a transaction is a basic operation offered to a user. For example, a banking application can include a query transaction, a debit transaction, and so on. The transactions that make up an application are written by CICS application developers.

Because of the complexity of the TXSeries environment, performance problems can have multiple possible causes that need to be investigated. This article uses some sample cases to show you how to investigate slow transaction issues in TXSeries. Based on statistics from resolved cases, slow transaction issues are rarely caused by defects in TXSeries, and are more often caused by application programs, database access problems, operating system interactions, or network issues.

Analyzing performance problems

  • Problem -- From CICS Transaction Gateway (CICS TG) client side, transaction response is slow.
  • Environment information -- TXSeries V7.1 PTF3, AIX V6.1.
  • Data workflow -- WebSphere Application Server application programs => CICS TG => TXSeries => Z/CICS => Database.

Here is a typical CICS environment:

Picture 1
Picture 1

Before checking

Transactions run in the TXSeries region. If a transaction takes a long time to complete, you may think it is a slow transaction issue, but the issue could also be that the time cost of the transaction process in TXSeries is too high.

  1. First, determine the expected value, and compare the expected TPS to the actual TPS. If they are similar, then it is usually not a slow transaction issue.
  2. If possible, check the performance on the TXSeries side versus a similar system. If they are similar, (multiple nodes with similar configuration are often used by many users), then it is usually not a slow transaction issue.

Begin checking

Begin checking from the client side first. Get the timestamp logs of CTG, and based on the initial analysis of them, you may need to get the timestamp logs for TXSeries, Z/CICS, the database interface, and possibly TCP/IP. The timestamp information and trace files will tell you where the time is being consumed.

Example A

In the CTG trace, the transaction response time is 405ms (16:35:44:295 - 16:35:44:890), but in non-issue time, the response time is about 20ms. To check it step by step:

  1. The CTG client, whose conversation ID (ConvId) is 51927, sent data at 16:35:43.890 and got data at 16:35:44.295, with the process consuming 405 ms (hexadecimal 0xCAD7 = decimal 51927).
    Picture 2
  2. Based on the CTG trace, the response time at the CTG side is 405ms, so you need to check the TXSeries trace files. The Cicsip process got the request at 16:35:44.89616611, and the convid is 51927. TXSeries task number 14235 was assigned for the request:

    Picture 3
    585 CICS Mon Jan  9 16:35:44 2012 89616611 ABIS01   cicsip   422594/14   (20202020)
        96C4009C79C223A    { ComSU_XPGetBuf   (732)
    . . . 
    581 CICS Mon Jan  9 16:35:44 89815228      ABIS01   cicsip   422594/14   (20202020)
        96C4009C79DAF6F    TCP/IP Listener (ComIP)
    cicsip using: 
        Listener      = 422594
        Connection    = 8
        Session       = 51927
    581 CICS Mon Jan  9 16:35:44 89821916      ABIS01   cicsip   422594/14   (20202020)
        96C4009C79DBCC7    TCP/IP Listener (ComIP)
    cicsip inbound datastream
    000002DC  0000CAD7
    
    585 CICS Mon Jan  9 16:35:44 2012 89934982 ABIS01   cicsip   422594/14   (20202020)
        96C4009C79E9EF1   { ConTS_ScheduleWork   TranId=CPMI, UserId=0, DeviceId=0, 
        Sync=0, Wait=0, InData=a031c930, Details=0, Fold=0, TClassTran = CPMI)
    581 CICS Mon Jan  9 16:35:44 2012 90029273 ABIS01   cicsip   422594/14   (20202020)
        96C4009C79F5B84     Scheduler component (ConTS):
    Task number 14235 assigned to transaction CPMI
    586 CICS Mon Jan  9 16:35:44 2012 90265687 ABIS01   cicsip   422594/14   (20202020)
        96C4009C7A1344A   } ConTS_ScheduleWork   return(0, RSN_6)
  3. The Cicsip process send the datastream back to CTG at 16:35:44.135205996:
    Picture 4
  4. TXSeries responded quickly, but it took long time for CTG to get it, which interestingly, indicates that the issue is in neither CTG nor in TXSeries.
  5. The next step is to check the network side. In the case, IP trace logs tell you that the network didn't send the data immediately after CTG sent data to the network side. After changing the TCP/IP parameter from tcp_nodelayack=0 to tcp_nodelayack=1, the issue is resolved.
  6. The tcp_nodelayack option prompts TCP to send an immediate acknowledgement, rather than the usual 200 ms delay. Sending an immediate acknowledgement may add a little more overhead, but in some cases it greatly improves performance.

Conclusions for Example A

In Example A, more than one subunit in the CICS environment can have a problem that can become a bottleneck for the entire system. In this case, the issue was in the TCP/IP delay setting, so if you just check the TXSeries and CTG side, you will not find the cause. Here are some tips to guide you:

  • Check the trace step-by-step, following the datastream from different products.
  • Determine the time costs in different parts of the system.
  • Remember that TXSeries is usually used in complex environments, and therefore it is not enough to just check TXSeries logs.

Example B

Here is another slow transaction issue, this one on the region side. In this case, it took cicsas more than two seconds to finish one transaction:

  1. The transaction started at10:10:47.489781869 and finished at 10:10:49.505439646:

    Picture 5
    Cicsas trace file:
    585 CICS Wed Jul 11 10:10:47 2012 489781869  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B5E761ECBE   { TasTA_Run   (Tranid <CPMI> (49) 
        for user <> on device <> . . . TermD <0>, Indata <0xa00f2d40>)
    . . . 
    586 CICS Wed Jul 11 10:10:49 2012 505439646  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B624E54054   } TasTA_Run   return(0)
  2. After checking functions between the beginning and end of the task (function TasTA_Run start and finish), you find no TXSeries trace logs between 10:10:47.497521257 and 10:10:49.497607736. The time was consumed in function TasPR_CallApplication, which means that the user's application program and TXSeries didn't consume time. What had happened during that period? See below:

    Picture 6
    Cicsas as trace file
    581 CICS Wed Jul 11 10:10:47 2012 496688892  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B5E797E2C3   
        Program Control Module (TasPR)
        LINKing Program: TEST1 in Task: 49
    585 CICS Wed Jul 11 10:10:47 2012 496694796  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B5E797EE99  { TasPR_CallApplication      (void)
    . . . 
    585 CICS Wed Jul 11 10:10:47 2012 497515648  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B5E79E584D  { SupTM_GetThreadCancelData  (void)
    586 CICS Wed Jul 11 10:10:47 2012 497521257  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B5E79E637D  } SupTM_GetThreadCancelData  return(0x302c1028)
    585 CICS Wed Jul 11 10:10:49 2012 497607736  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B624A8108A  { SupTM_GetThreadCancelData  void)
    . . .                                               
    586 CICS Wed Jul 11 10:10:49 2012 498334378  TESTZY  cicsas  1069096/1  
        CPMI(43504D49) 98908B624ADBDBB  } TasPR_CallApplication      return(0, RSN_0)
  3. Referencing Step 2, TXSeries didn't consume time, so to track down the problem, check the application log, which shows that the application program took two seconds, from 11:10:47 to 11:10:49:
    Picture 7
  4. Checking the application program code reveals the root cause shown below:
    Picture 8

Conclusions for Example B

After checking the Example B application program, you can see that the root cause is the sleep() statement. in application program was found as the root cause. Here are some tips to guide you:

  • Check the detailed TXSeries trace timestamp to locate the main time cost.
  • If the detailed TXSeries trace looks okay, check the application program log and trace.
  • If the application program log and trace also looks okay, then check the operating system and database side trace files (such as the output of errpt -aon AIX).
  • Finally, if you fin any TXSeries internal function issues, please contact IBM TXSeries support.

Examples A and B showed how to check CTG and TXSeries trace files slow transaction issues, and how to break down complex performance issues. It is important to check all log and trace files, from one side of the datastream to the other, not just on the TXSeries side. While it may still look like a slow TXSeries transaction is the problem, the root cause may lie elsewhere, and in more than one component of the CICS system. So in addition to checking the logs, you may need to check the performance of some user application programs.

Analysis metrics

Analyzing logs

  1. At the operating system level, you can get more information to check a slow transaction issue:

    Picture 9
    Libraries usage statistics "tprof -ktuesjx sleep 60"
        Truss output of run slow cicsas process 
            ("truss -d -f -a -rall -wall -o /tmp/truss_cicsas.log -p process_id")
        Errpt, 
        env setting 
        etc. . . .
  2. On the TXSeries side, you need to collect the following log information: console, symrecs, CSMT, CCIN, <region_name>.env, <region_name>/database directory, and <region_name>/dumps/dir1 directory. You may also need trace and MustGather documents -- for more information, see MustGather documents for TXSeries CICS.
  3. On the application (database) side, you usually need the timestamps of the application's start time, major functions, and stop time, in order to check for issues inside the application.
  4. On the network side, you usually need to check TCP/IP regarding trace files for CTG and TXSeries communications.

Code for Example B (standalone program)

Picture 10
TEST1.ccs:
#include <stdio.h>
#include <string.h>
int main(){
    char *pcomm;
    struct cics_eib    * pEIB;
    unsigned long resp;
    unsigned long resp2;
char cicsutime[15];
    char wk_c8_date[15];
    char wk_c6_time[15];

    EXEC CICS ADDRESS EIB( dfheiptr ) RESP(resp);
    if (resp != DFHRESP(NORMAL)){
        fprintf(stderr,"Adress EIB error %d.\n", resp);
        EXEC CICS RETURN;
    }
    EXEC CICS ADDRESS COMMAREA( pcomm ) RESP(resp);
    if (resp != DFHRESP(NORMAL)){
        fprintf(stderr,"Address Communication area error %d.\n", resp);
        EXEC CICS RETURN;
    }
    pEIB = (struct cics_eib *) dfheiptr;

    memset(cicsutime, 0, sizeof(cicsutime));
    memset(wk_c6_time, 0, sizeof(wk_c6_time));
    memset(wk_c8_date, 0, sizeof(wk_c8_date));

    EXEC CICS ASKTIME ABSTIME(cicsutime) RESP(resp);
    EXEC CICS FORMATTIME ABSTIME(cicsutime) YYMMDD(wk_c8_date);
    fprintf(stderr,"1ST ASKTIME.wk_c8_date:%s,wk_c6_time:%s\n",wk_c8_date,wk_c6_time);
    sleep(2);
    fprintf(stderr,"sleep\n");
    EXEC CICS ASKTIME ABSTIME(cicsutime) RESP(resp);
    EXEC CICS FORMATTIME ABSTIME(cicsutime) YYMMDD(wk_c8_date) 
        TIME(wk_c6_time) RESP(resp);
    fprintf(stderr,"2ND SKTIME.wk_c8_date:%s,wk_c6_time:%s\n",wk_c8_date,wk_c6_time); 
    EXEC CICS RETURN;
}

In the above example, TEST1.ccs indicated that the application program ran slow. Sometimes a standalone program is useful for simple performance testing. Comparing properties and actions between a standalone program and a similar program running on TXSeries can give you important troubleshooting clues. For example, on the same machines, if function A in a healthy standalone application costs 20ms, but function A in the same program on TXSeries costs 100 ms, then you need to check for a TXSeries performance issue, because of the large difference in performance between the standalone program and the TXSeries program.

Conclusion

Many users analyze TXSeries performance issues only from a TXSeries perspective. This article has described other issues that you may need to consider, such as operating system, application program, and database issues. The article also showed you how to use a standalone program to compare it to the same program on TXSeries.

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into WebSphere on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=WebSphere
ArticleID=855771
ArticleTitle=Analyzing performance issues with IBM TXSeries for Multiplatforms
publish-date=01232013