9 replies Latest Post - ‏2010-08-25T12:51:41Z by SystemAdmin
SystemAdmin
SystemAdmin
228 Posts
ACCEPTED ANSWER

Pinned topic Slow Response Times

‏2010-08-24T11:46:27Z |
Hi,

I'm new to Informix. I have an instance (an OLTP system) that performed fine for months but has recently been showing slow response times for all SQL queries executed on it. These queries are for CC authorisations, which require sub-second responses (typically around a tenth of a second), but recently they have been taking up to 12 seconds!

Can someone please suggest where best to look for probable causes? The AIX server is running at only 50% CPU/memory, and the other 14 Informix instances on the prod server aren't having any performance issues. Our network team have also ruled out any problems on the connection from the application <-> db server.

Thanks in advance.
Updated on 2010-08-25T12:51:41Z by SystemAdmin
  • SystemAdmin

    Re: Slow Response Times

    ‏2010-08-25T08:02:18Z  in response to SystemAdmin
    Hi,

    The main goal here is to identify where the bottleneck is. It could be disk I/O, network I/O, or a shortage of resources such as mutexes, locks, or buffers.

    I would suggest analyzing the environment first. Could it have changed (for example, has the number of connections/users increased significantly)?

    Use the onstat utility to analyze the server status. Check 'onstat -g ioa' for the I/O queues: if they are long, the server is stuck on disk I/O.

    Monitor the user threads to see what they are actually doing (onstat -g ath, onstat -g stk).

    Check 'onstat -p' or 'onstat -g buf' for bufwaits. If the value is large, you are short of available buffers.
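
Editorial sketch: to track bufwaits over time rather than read it by eye, the header/value layout of `onstat -p` can be turned into a dictionary. The following is a minimal Python sketch, assuming the output keeps the layout seen in this thread (a row of column names followed by a row of numbers). The function names `parse_onstat_profile` and `_to_number` are hypothetical helpers, not part of Informix.

```python
def _to_number(token):
    """Return an int or float for a numeric token, or None if it is not numeric."""
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        return None


def parse_onstat_profile(text):
    """Pair each row of column names with the row of numbers directly below it."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    stats = {}
    for header, values in zip(rows, rows[1:]):
        numbers = [_to_number(tok) for tok in values]
        is_header = all(_to_number(tok) is None for tok in header)
        # Skip banner lines, value/header pairs, and rows of unequal length.
        if not is_header or len(header) != len(values) or None in numbers:
            continue
        for name, number in zip(header, numbers):
            # "%cached" appears twice (reads, then writes); suffix the repeat.
            key, n = name, 2
            while key in stats:
                key, n = f"{name}_{n}", n + 1
            stats[key] = number
    return stats


# Sample rows copied from the onstat -p output posted in this thread.
SAMPLE = """\
Profile
dskreads pagreads bufreads %cached dskwrits pagwrits bufwrits %cached
1176887565 1841116027 16511495965 92.87 13666795 24122934 119306573 88.94

bufwaits lokwaits lockreqs deadlks dltouts ckpwaits compress seqscans
82297149 1712 9807705426 1 0 874 5906670 147790
"""

stats = parse_onstat_profile(SAMPLE)
print(stats["bufwaits"], stats["%cached"], stats["%cached_2"])
```

On a live server the text could be captured with something like `subprocess.run(["onstat", "-p"], capture_output=True, text=True).stdout`; that step is left out here so the sketch runs without an Informix installation.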
    • SystemAdmin

      Re: Slow Response Times

      ‏2010-08-25T11:12:44Z  in response to SystemAdmin
      Hi

      Thanks for the onstat suggestions. The results are below. The buffer waits number is large, but I'm not sure how many waits count as too many.

      ukbiprodmvrs01[/home/informix]$ onstat -p

      IBM Informix Dynamic Server Version 11.50.FC3 -- On-Line -- Up 4 days 02:09:16 -- 1850848 Kbytes

      Profile
      dskreads    pagreads    bufreads     %cached  dskwrits  pagwrits  bufwrits   %cached
      1176887565  1841116027  16511495965  92.87    13666795  24122934  119306573  88.94

      isamtot      open       start      read         write     rewrite   delete   commit   rollbk
      16192932555  102915248  434689413  13862080158  39012259  10789882  2684051  2020736  780

      gp_read  gp_write  gp_rewrt  gp_del  gp_alloc  gp_free  gp_curs
      0        0         0         0       0         0        0

      ovlock  ovuserthread  ovbuff  usercpu   syscpu    numckpts  flushes
      0       0             0       84009.13  28363.27  1206      1378

      bufwaits  lokwaits  lockreqs    deadlks  dltouts  ckpwaits  compress  seqscans
      82297149  1712      9807705426  1        0        874       5906670   147790

      ixda-RA    idx-RA   da-RA     RA-pgsused  lchwaits
      494948436  4650232  41689791  539540757   901118

      ukbiprodmvrs01[/home/informix]$
      ukbiprodmvrs01[/home/informix]$
      ukbiprodmvrs01[/home/informix]$ onstat -g buf

      IBM Informix Dynamic Server Version 11.50.FC3 -- On-Line -- Up 4 days 02:09:59 -- 1850848 Kbytes

      Profile

      Buffer pool page size: 4096
      dskreads    pagreads    bufreads     %cached  dskwrits  pagwrits  bufwrits   %cached
      1176906592  1841138398  16513538062  92.87    13667539  24124030  119316434  88.55

      bufwrits_sinceckpt  bufwaits  ovbuff  flushes
      58025               82300787  0       1378

      Fg Writes  LRU Writes  Avg. LRU Time  Chunk Writes
      0          9522242     0.012          1533059

      Fast Cache Stats
      gets       hits       %hits  puts
      732836386  728437231  99.40  84696574

      ukbiprodmvrs01[/home/informix]$ onstat -g ioq

      IBM Informix Dynamic Server Version 11.50.FC3 -- On-Line -- Up 4 days 02:15:00 -- 1850848 Kbytes

      AIO I/O queues:
      q name/id     len  maxlen  totalops   dskread    dskwrite  dskcopy
      drda_dbg  0   0    0       0          0          0         0
      sqli_dbg  0   0    0       0          0          0         0
      kio       0   0    17      353978622  350468414  3510208   0
      kio       1   0    31      281099087  277715892  3383195   0
      kio       2   0    17      329960823  326546537  3414286   0
      kio       3   0    17      225666031  222299692  3366339   0
      adt       0   0    0       0          0          0         0
      msc       0   0    2       116602     0          0         0
      aio       0   0    2       23527      6427       0         0
      pio       0   0    0       0          0          0         0
      lio       0   0    0       0          0          0         0
      • SystemAdmin

        Re: Slow Response Times

        ‏2010-08-25T11:36:29Z  in response to SystemAdmin
        Hi,

        The I/O queues seem to be all right, though bufwaits may be too big. Try comparing it against bufwaits on the other production instances that don't have any performance issues. If the difference is huge, it may be worth adding some more buffers (BUFFERPOOL parameter).

        You should also check that you have enough poll threads to service the number of connections you have. The recommendation is one poll thread per ~250 connections.
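
Editorial sketch: raw bufwaits counters are hard to compare between instances that have been up for different lengths of time, so one option is to normalise them first. The Python sketch below uses the figures posted earlier in the thread (uptime of 4 days 02:09:16, bufwaits, pagreads, bufwrits) and reports bufwaits per second of uptime and as a percentage of page activity. The helper names are hypothetical. Some Informix tuning guides quote a rule of thumb that a percentage above roughly 7 deserves attention; that threshold does not come from this thread and should be treated as an assumption, not an official limit.

```python
def uptime_seconds(days, hours, minutes, seconds):
    """Convert the 'Up D days HH:MM:SS' banner fields to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def bufwaits_summary(bufwaits, pagreads, bufwrits, uptime_s):
    """Normalise bufwaits so instances with different uptimes can be compared."""
    return {
        # Average number of buffer waits per second since the server started.
        "per_second": bufwaits / uptime_s,
        # Buffer waits relative to pages read plus buffer writes, in percent.
        "pct_of_page_activity": 100.0 * bufwaits / (pagreads + bufwrits),
    }


# Figures taken from the onstat -p output posted in this thread.
slow_instance = bufwaits_summary(
    bufwaits=82297149,
    pagreads=1841116027,
    bufwrits=119306573,
    uptime_s=uptime_seconds(4, 2, 9, 16),
)

for name, value in slow_instance.items():
    print(f"{name}: {value:.2f}")
```

Running the same calculation on one of the healthy instances gives two numbers that can be compared directly, which is the comparison suggested above.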
        • SystemAdmin

          Re: Slow Response Times

          ‏2010-08-25T12:01:55Z  in response to SystemAdmin
          How do I check the ratio of poll threads to connections?
          • SystemAdmin

            Re: Slow Response Times

            ‏2010-08-25T12:16:02Z  in response to SystemAdmin
            I take it you mean the NETTYPE parameter:

            NETTYPE soctcp,2,200,NET
            LISTEN_TIMEOUT 60
            MAX_INCOMPLETE_CONNECTIONS 1024
            FASTPOLL 1

            The NUMCPUVPS is 4

            onstat -u returns: 151 active, 256 total, 169 maximum concurrent. Do you think these parameter values are okay?
            • SystemAdmin

              Re: Slow Response Times

              ‏2010-08-25T12:19:00Z  in response to SystemAdmin
              Ah, just noticed your new post.

              Yep, these settings look valid.
              • SystemAdmin

                Re: Slow Response Times

                ‏2010-08-25T12:31:50Z  in response to SystemAdmin
                Good - that's something set right then!

                Back to the bufwaits, below are our BUFFERPOOL params:

                BUFFERPOOL default,buffers=10000,lrus=8,lru_min_dirty=50.000000,lru_max_dirty=60.500000
                BUFFERPOOL size=4K,buffers=132000,lrus=16,lru_min_dirty=0.750000,lru_max_dirty=1.550000

                Would you suggest adding another buffer pool?

                Also, from onstat -g cpu, are the lines below normal?

                29  kaio         1cpu*  08/25 13:22:38  184564.2279  443414572  IO Idle
                59  aslogflush   5cpu   08/25 13:22:38  1.9307       358470     sleeping secs: 1
                60  btscanner_0  1cpu   08/25 13:22:36  662.4204     78508340   sleeping secs: 9
                77  kaio         4cpu*  08/25 13:22:38  152400.4080  357141153  IO Idle
                78  kaio         3cpu*  08/25 13:22:38  188091.5425  418102214  IO Idle
                79  kaio         5cpu*  08/25 13:22:38  118105.7210  286945987  IO Idle

                Our instance is using kernel I/O, but should these threads be idle more often than not?
                • SystemAdmin

                  Re: Slow Response Times

                  ‏2010-08-25T12:51:41Z  in response to SystemAdmin
                  'IO Idle' is the normal state for I/O threads; it means there is currently no work for them.

                  Regarding the BUFFERPOOL: as I already said, I suspect that 'bufwaits' may be too big, but I can't be sure without real experience of such an instance. You mentioned that there are other similar instances which work OK. If there is a big difference between the bufwaits of this instance and those of a "normal" one, I would suggest increasing the number of buffers (by 10-20k). Be aware that this will affect IDS memory utilization! You will also have to bounce the instance for the parameter change to take effect.
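
Editorial sketch: the memory impact of adding buffers can be estimated before the instance is bounced. The Python sketch below parses the 4K BUFFERPOOL line posted in this thread and multiplies the buffer count by the page size. It is a rough lower bound only: it assumes the `size` field uses a K suffix as shown, and it ignores whatever per-buffer overhead the server adds. The helper names are hypothetical.

```python
MIB = 1024 * 1024


def parse_bufferpool(line):
    """Split a BUFFERPOOL onconfig line into a dict of its key=value fields."""
    _, _, spec = line.strip().partition(" ")
    fields = {}
    for item in spec.split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore bare words such as "default"
            fields[key.strip()] = value.strip()
    return fields


def page_bytes(size_field):
    """Convert a page size such as '4K' to bytes (assumes a K suffix)."""
    return int(float(size_field.rstrip("Kk")) * 1024)


def pool_bytes(buffers, size_field):
    """Raw page data held by the pool, excluding any server-side overhead."""
    return buffers * page_bytes(size_field)


# BUFFERPOOL line copied from the thread.
line = ("BUFFERPOOL size=4K,buffers=132000,lrus=16,"
        "lru_min_dirty=0.750000,lru_max_dirty=1.550000")
pool = parse_bufferpool(line)

current = pool_bytes(int(pool["buffers"]), pool["size"])
extra_10k = pool_bytes(10_000, pool["size"])
extra_20k = pool_bytes(20_000, pool["size"])

print(f"current pool: {current / MIB:.1f} MiB")
print(f"+10k buffers: {extra_10k / MIB:.1f} MiB more")
print(f"+20k buffers: {extra_20k / MIB:.1f} MiB more")
```

With 4 KB pages, the suggested increase of 10-20k buffers is small relative to the existing pool, but it still has to fit in the memory available to the server.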
          • SystemAdmin

            Re: Slow Response Times

            ‏2010-08-25T12:17:30Z  in response to SystemAdmin
            The number of poll threads and the number of connections are set via the NETTYPE parameter.

            For example:

            NETTYPE onsoctcp,1,200,CPU

            In the above example, 1 means one poll thread, 200 is the maximum number of expected connections, and CPU means that the poll threads run on the CPU VP(s). If you have more than one poll thread, it is recommended to set NET instead of CPU.

            See the following link for a detailed description:
            http://publib.boulder.ibm.com/infocenter/idshelp/v115/topic/com.ibm.perf.doc/ids_prf_105.htm?resultof=%22%6e%65%74%74%79%70%65%22%20%22%6e%65%74%74%79%70%22%20
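
Editorial sketch: the poll-thread check discussed in this thread comes down to dividing the observed number of sessions by the number of poll threads and comparing the result with the ~250 guideline mentioned above. The Python sketch below does that for the NETTYPE line and the onstat -u figure (169 maximum concurrent) posted earlier. The `NetType` tuple and helper names are hypothetical, and the field meanings follow the description given in this thread.

```python
from typing import NamedTuple


class NetType(NamedTuple):
    protocol: str
    poll_threads: int
    connections: int
    vp_class: str


def parse_nettype(line):
    """Parse 'NETTYPE protocol,poll_threads,connections,vp_class'."""
    _, _, spec = line.strip().partition(" ")
    protocol, poll_threads, connections, vp_class = (
        part.strip() for part in spec.split(",")
    )
    return NetType(protocol, int(poll_threads), int(connections), vp_class)


def sessions_per_poll_thread(nettype, sessions):
    """Average number of sessions each poll thread has to service."""
    return sessions / nettype.poll_threads


# Guideline quoted in this thread: about one poll thread per 250 connections.
GUIDELINE = 250

cfg = parse_nettype("NETTYPE soctcp,2,200,NET")
load = sessions_per_poll_thread(cfg, sessions=169)

print(cfg)
print(f"{load:.1f} sessions per poll thread, within guideline: {load <= GUIDELINE}")
```

For the values posted here the load per poll thread is well under the guideline, which matches the conclusion in the thread that these settings look valid.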