IBM Support

Performance Tuning Best Practices for FileNet Image Services

Troubleshooting


Problem

This document describes common configuration settings to investigate when a performance issue is suspected in FileNet Image Services.

Resolving The Problem

This performance tuning document describes the most common items to check regarding the current configuration:

    • Performance Reports
      The FileNet Image Services performance reports can be created at any time after the software has started by running the command: perf_report -a. The performance reports are in the /fnsw/local/logs/perf directory on UNIX servers and the DRIVE:\fnsw_loc\logs\perf directory on Windows servers.

      By default, the perf_mon process, which gathers performance data, stores that data every 15 minutes. If a performance issue is suspected, IBM suggests changing the interval from 15 minutes to 5 minutes. The perf_mon.script file can exist in one or two locations. Modify both files if they exist.

      UNIX: /fnsw/lib/perf/perf_mon.script
      /fnsw/local/sd/perf_mon.script

      Windows: DRIVE:\fnsw\lib\perf_mon.script
      DRIVE:\fnsw_loc\sd\perf_mon.script

      Change:
      From:
      schedule 0 0:00:00 2:00:00
      schedule 0 6:00:00 0:15:00
      schedule 0 19:00:00 2:00:00
      To:
      schedule 0 0:00:00 2:00:00
      schedule 0 6:00:00 0:05:00
      schedule 0 19:00:00 2:00:00
      Edit the perf_mon.script file or files. Then, restart FileNet Image Services to put the change into effect.
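
      The edit above can also be scripted. The following is a minimal sketch, assuming that the third field of a schedule line is the start time and the fourth field is the sampling interval, as the From/To example suggests. The function name set_interval is hypothetical and is not part of Image Services.

```python
import re


def set_interval(script_text, start="6:00:00", new_interval="0:05:00"):
    """Return script_text with the interval changed on the schedule line
    whose start time equals `start`. All other lines are left untouched."""
    pattern = re.compile(
        r"^([ \t]*schedule[ \t]+\d+[ \t]+" + re.escape(start) + r"[ \t]+)"
        r"\d+:\d{2}:\d{2}[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new_interval, script_text)


original = (
    "schedule 0 0:00:00 2:00:00\n"
    "schedule 0 6:00:00 0:15:00\n"
    "schedule 0 19:00:00 2:00:00\n"
)
print(set_interval(original))
```

      Back up each perf_mon.script file before writing the modified text to it, and apply the same change to both copies if both exist.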

    • Verify Shared Memory Segment Size
      The preferred shared memory segment size is different for each hardware platform. It is important to configure the correct shared memory segment size to avoid running out of resources, which can cause critical errors. See the IBM technote below for detailed information about troubleshooting shared memory issues and setting the correct shared memory segment size.



    • Set Document & Directory Buffers to Maximum Values
      Generally, set the number and size of the document and directory buffers to their maximum values in the fn_edit -> Performance Tuning -> Server Memory tab.

      The maximum sizes and counts are:

      Document Buffer Count: 256 Directory Buffer Count: 256
      Document Buffer Size: 1024 Directory Buffer Size: 256

      Document Buffer Count - Each process that accesses cache has a document buffer associated with it. These processes include all of the dtp processes, the configured CSMs processes, and the committal processes (bes_commit, fbc_commit, rmt_commit, etc.). If a FileNet Image Services server does not have enough document buffers configured, overall system performance is affected because the processes that access cache must wait for a buffer to become available.

      Document Buffer Size - A document buffer is used to transfer an object (document page) to and from cache and an optical/MSAR surface. Setting the document buffer size to the maximum means that fewer transfers are required for each object, which results in better performance because less time is needed for the transfer to complete.
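
      The effect of buffer size on the number of transfers can be illustrated with simple arithmetic. This is a sketch only: it assumes that the Document Buffer Size value is expressed in KB and that an object is moved in buffer-sized pieces. The 3000 KB object size is a made-up value.

```python
import math


def transfers_needed(object_kb, buffer_kb):
    """Number of buffer-sized transfers needed to move one object."""
    return max(1, math.ceil(object_kb / buffer_kb))


# A hypothetical 3000 KB object with a small buffer and with the maximum buffer.
print(transfers_needed(3000, 256))   # 12
print(transfers_needed(3000, 1024))  # 3
```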

      To determine the number of processes currently configured that use document buffers, obtain the total of the following three items:


      • The total number of CSMs, DOCs, and PSMs request handler processes configured in the /fnsw/etc/serverConfig file. If the /fnsw/etc/serverConfig.custom file exists, use the total number from this file instead.
      • The number of optical and MSAR drives configured in the Storage Library tab in fn_edit. All MSAR libraries have 12 drives each. The number of optical drives can vary depending upon the model of the optical library.
      • The total number of FileNet Image Services committal processes is in the /fnsw/local/sd/as_conf.g file. If the number of processes for a specific committal process is not shown, use 1, which is the default.
          processes {
          notify ds_notify 2
          scheduler dsched
          dtp dtp
          dtp_tran dtp_tran 1
          rmt_commit rmt_commit 2
          fbc_commit fbc_commit
          del_commit del_commit
          osi_migrate osi_migrate
          }
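
      The three items above can be totalled with a short script. The following is a minimal sketch, assuming that each line of the processes block has the form "name program [count]" as in the sample. Treating only programs whose names end in _commit as committal processes is an assumption of this sketch, and the request handler count of 10 is a hypothetical value.

```python
def parse_process_counts(block):
    """Map each program in an as_conf.g processes block to its process count.
    A missing count defaults to 1."""
    counts = {}
    for line in block.splitlines():
        fields = line.split()
        # Skip the "processes {" opener, the closing brace, and blank lines.
        if len(fields) < 2 or fields[-1] == "{" or fields[0] == "}":
            continue
        counts[fields[1]] = int(fields[2]) if len(fields) > 2 else 1
    return counts


def document_buffer_users(request_handlers, drives, process_counts):
    """Total of request handlers, storage drives, and committal processes."""
    committal = sum(n for name, n in process_counts.items()
                    if name.endswith("_commit"))  # assumption: suffix filter
    return request_handlers + drives + committal


SAMPLE = """processes {
notify ds_notify 2
scheduler dsched
dtp dtp
dtp_tran dtp_tran 1
rmt_commit rmt_commit 2
fbc_commit fbc_commit
del_commit del_commit
osi_migrate osi_migrate
}"""

counts = parse_process_counts(SAMPLE)
# 10 request handlers (hypothetical) + 12 drives (one MSAR library) + 4 committal
print(document_buffer_users(10, 12, counts))  # 26
```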
      Directory Buffers Count and Size - Directory buffers are used to break up a Fast Batch Object, and they are also used by the Integral SDS_worker processes.

      Run the dbp -s command to monitor the buffer configuration when the FileNet Image Services software is running at peak load or when there is a performance issue. See the FileNet Image Services System Tools manual for information about interpreting the dbp output.



    • Turn off MKF Verify Disk Writes
      This feature is configured in the fn_edit -> MKF databases tab. Enable it only if the system administrator suspects some type of hardware or network error on the device where the MKF databases reside. When Verify Disk Writes is enabled, the FileNet Image Services software reads back and verifies everything that it writes, so two transactions occur for every write operation and performance slows down. When the feature is turned off, only the write transaction occurs and server performance improves.

    • Separate MKF databases and cache to different disk drives
      Place each of the MKF databases (Permanent, Security, and Transient) and cache on different physical disk drives to prevent contention for the disk read/write heads on very active systems. Having cache and the transient database on the same physical disk drive slows down performance when objects are moving in and out of cache, because both the transient database and cache are updated at the same time as documents are processed. The same holds true for the transient and permanent databases.


    • Separate MKF database RL partitions from their databases
      Each of the three MKF databases has one or more database data sets (permanent_db0, permanent_db1, etc.) and redo logs (permanent_rl0, permanent_rl1). The database data sets are where data is stored, and the redo logs maintain sequential records of database changes. These logs are flushed as information is written to the actual database.

    • Set the MKF buffers to get as close as possible to a 100% cache hit ratio
      The MKF buffers can be increased in the fn_edit -> Performance Tuning -> Server Memory tab. When the buffers are larger, more of the database is held in memory, which avoids having to read from the database on the physical disk drive. To obtain the cache hit ratio for each of the databases, create the performance reports and look at the MKF I/O reports, for example:
          cmb1.permdb_io.Apr15.txt
          cmb1.secdb_io.Apr15.txt
          cmb1.transdb_io.Apr15.txt

      Tune the MKF buffers gradually. MKF buffers are stored in shared memory, and if the buffers are made too large, performance problems might result.

    • Tune cache for best performance
      In the fn_edit -> System Application Services -> Cache tab, the system administrator can allocate the minimum percentage of cache to devote to the main types of cache.

      The minimum percentages for all four caches should add up to 100%.

      Allocate most of the cache percentage to page cache so more documents are kept ageable in cache before they are aged out by the CSM_daemon process. Performance improves by preventing the documents in cache from having to be retrieved from an optical surface, MSAR surface, or Integral SDS device.

      The current cache hit ratio can be obtained by looking at the performance reports (Client Page Request Report).

    • Configure the locked threshold for BES
      Configure the locked threshold for BES cache to leave sufficient free space to hold the largest batch that might be created plus 10%.

      For example, for an environment where:
        1. The Minimum Allocation for BES cache is 20% and the total cache size is 1 GB, so the BES cache is 200 MB
        2. The largest batch is five 100-page documents
        3. The size of each page is 50 KB
        4. The largest batch size would be 25 MB (5 x 100 x 50 KB)
        5. The free space that is required would be 27.5 MB (25 MB + 10%)
      Set the Locked Threshold % in fn_edit for BES cache to at least 86% since 27.5 MB is almost 14% of the Minimum Allocation of BES cache (200 MB).
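
      The arithmetic in this example can be reproduced with a short calculation. This is a sketch under the example's own figures (1 GB treated as 1000 MB, a 25 MB largest batch, and 10% headroom). The function name is hypothetical.

```python
import math


def bes_locked_threshold(total_cache_mb, bes_min_alloc_pct,
                         largest_batch_mb, headroom_pct=10.0):
    """Locked Threshold % that leaves room for the largest batch plus headroom."""
    bes_cache_mb = total_cache_mb * bes_min_alloc_pct / 100    # 200 MB
    free_mb = largest_batch_mb * (1 + headroom_pct / 100)      # 27.5 MB
    free_pct = free_mb / bes_cache_mb * 100                    # 13.75%
    return math.floor(100 - free_pct)


print(bes_locked_threshold(1000, 20, 25))  # 86
```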

    • Turn on Fast Batch Committal and Fast Batch Breakup
      If a server uses Fast Batch Committal, enable Fast Batch Breakup in the fn_edit -> System Application Services -> Other tab. Typically, after a Fast Batch Object is migrated, the documents that are contained in the Fast Batch Object do not reside in cache. With Fast Batch Breakup turned on, the documents remain in page cache after they are migrated to the storage device. Most users access new documents immediately, so keeping the Fast Batch documents in cache means that they do not have to be migrated back into cache when a user requests them.

      Applications that use Fast Batch Committal are COLD, HPII, and Capture. A custom application can also be written to use Fast Batch Committal.

      The current cache hit ratio can be obtained by looking at the performance reports (Client Page Request Report).

    • Set the number of Integral SDS_worker processes
      If Integral SDS performance issues are suspected, the following items can be used to investigate the problem.

      1. DOC_tool - All ISDS statistics are kept in memory. When the Image Services software is recycled, all of the information collected since the software was started is lost. There are no performance reports that track ISDS committals or retrievals. DOC_tool has an “SDS” option that displays the current statistics that are kept in memory. Use this command sparingly in a production environment because it causes all of the SDS_worker processes to pause while the performance information is collected and displayed.

      Here is an example of the DOC_tool SDS output for an SDS Unit. It shows a high AVG requests queue wait time, which indicates that the number of SDS_worker processes should be increased.

      The AVG requests queue wait time shows how long requests stay in the queue before they are processed by SDS_worker. Requests are put in the queue by the SDS migration background job.

        DOC_tool
        DOC_tool> SDS
        Summary information, Detailed, Worker information, All information, Find object, or list?
        ('s', 'd', 'w', 'a', 'f', 'l'): d

        The current time is Fri Feb 5 10:25:31 2010

        SDS info: ALL option
        All SDS units mode (y/n) [y]: : n
        SDS unit ID: 2
        ****** SDS unit = cen_kirin (2)
        SYSTEM state = SYSTEM ENABLED (0x0)
        USER state = USER ENABLED (0x0)
        Worker = 'SDS_worker' Number Instances = 4
        info = 'Centera2.usca.ibm.com?/fnsw/local/sd/1/QAImport.pri'
        SDS priority = high
        DEBUG Setting = MAX
        dynamic repository lib = 'SDSw_centera'
        retention default offset (1 days)
        SDS content delete setting=YES
        SDS supports: EBR=YES HOLDS=YES Retention Extension=YES

        Total Accumulated counters from all workers(4):
        ** Configured workers =4 active workers=2
        TOTAL WORKER COUNTERS (sds_id=2):
        Read Requests processed: 0
        Write Requests processed: 20
        Copy Requests processed: 0
        Errors: 0
        Requests processed = 20
        Successful requests processed = 20
        Errors = 0
        AVERAGE ACCUMULATED ELAPSE TIMES:
        Up time: 274.098067 secs/workers (4.568301 mins)
        Idle time: 263.780673 secs/workers (4.396345 mins) (96.24%)
        Total processing time: 7.472040 secs/workers
        (0.373602 secs/reqs)
        (0.373602 secs/image page)
        (0.013004 secs/KB)
        AVG requests queue wait time: 3.466658 secs/reqs

        *****Total READ REQUEST PERFORMANCE (sds_id=2)
        Total retrieval requests = 0
        Images retrieved from SDS = 0
        Data retrieved = 0.000000MB
        Number of read requests where the whole blob fits
        into the internal image_buffer (1024K): 0
        Number of read requests where the whole blob does not fits
        into the internal image_buffer (1024 K): 0
        Cache hits: 0
        Number of redirection: 0
        Number of redirection errors: 0

        Total Time to process read requests: 0.000000 secs (0.000000 mins)

        *****Total WRITE/COPY REQUEST PERFORMANCE (sds_id=2)
        Total write requests = 20
        Total copy requests = 0
        Documents written = 20 (FBC=0, MSAR reads=0, Cache=20, Copy=0)
        Images written = 20
        Data written = 0.561123 MB
        AVG Image Size = 28.729492 K
        Cache hits in copy: 0
        Total Time to process write requests: 14.943911 secs (0.249065 mins)
        Total Time to process copy requests: 0.000000 secs (0.000000 mins)
        Time in SDS device create and write object: 0.423189 secs (0.007053 mins)
        (0.021159 secs/reqs)
        (0.021159 secs/image page)
        (0.000737 secs/KB)
        Time in SDS device write only: 0.053464 secs (0.000891 mins)
        (0.002673 secs/reqs)
        (0.002673 secs/image page)
        (0.000093 secs/KB)
        Time in cache(CSM) to process write/copy: 0.025077 secs (0.000418 mins)
        (0.001254 secs/reqs)
        (0.001254 secs/page)
        (0.000044 secs/KB)
        Time in MSAR read to process write/copy: 0.000000 secs (0.000000 mins)

      2. Number of SDS_worker processes – The default number of SDS_worker processes is 3 when the SDS unit is configured in fn_edit. Each of the three default worker processes takes on a different function.

      SDS_worker # 1 - The first worker process performs only reads.
      SDS_worker # 2 - The second worker process performs only writes.
      SDS_worker # 3 - The third worker process performs only copies.

      If the number of worker processes is not increased from the default of three, performance can be degraded. Only one process performs all of the ISDS writes and one process performs all of the read operations.

      When more than three SDS_worker processes are created, SDS_worker #4 and higher perform all three operations (read, copy, and write). The maximum number of SDS_worker processes that can be configured is 99 for each SDS unit. Setting the number of SDS_worker processes too high can also have an adverse effect: excess SDS_worker processes consume resources even though they might never be used.

      Set the initial number of SDS_worker processes to around 20, and then monitor them by using the DOC_tool SDS option. The SDS_worker processes can be monitored to see whether they are active or idle (too many configured).

      The DOC_tool SDS statistic, AVG requests queue wait time, can be monitored for the SDS unit to determine whether the number of SDS_worker processes needs to be increased for the SDS Unit.
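
      When DOC_tool SDS output is captured to a file, the relevant figures can be pulled out with a few regular expressions. The following is a minimal sketch that assumes the output lines look like the example shown earlier. The function name sds_summary is hypothetical, and the sketch does not decide what counts as a high wait time.

```python
import re


def _first_float(pattern, text):
    """Return the first captured number as a float, or None if absent."""
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


def sds_summary(doc_tool_output):
    """Extract worker counts, idle percentage and queue wait time."""
    workers = re.search(
        r"Configured workers\s*=\s*(\d+)\s+active workers\s*=\s*(\d+)",
        doc_tool_output,
    )
    return {
        "configured": int(workers.group(1)) if workers else None,
        "active": int(workers.group(2)) if workers else None,
        "idle_pct": _first_float(r"Idle time:.*\((\d+(?:\.\d+)?)%\)",
                                 doc_tool_output),
        "queue_wait_secs": _first_float(
            r"AVG requests queue wait time:\s*(\d+(?:\.\d+)?)\s*secs/reqs",
            doc_tool_output,
        ),
    }


sample = """** Configured workers =4 active workers=2
Idle time: 263.780673 secs/workers (4.396345 mins) (96.24%)
AVG requests queue wait time: 3.466658 secs/reqs
"""
print(sds_summary(sample))
```

      Because the SDS option pauses the SDS_worker processes while it runs, a script like this is better applied to output that was already collected than used to poll DOC_tool frequently.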

    • Configure the TCP and Ephemeral Ports Network Settings
      It is important to set the TCP and ephemeral port settings correctly to avoid ports being left in a time wait state or being blocked. When the ephemeral port settings are not correct, 15,16,17 errors might be written to the error log. This type of error can cause performance and network connection issues. The preferred settings for each hardware platform are provided below.


      AIX
        Preferred settings:
          tcp_keepidle 80
          tcp_keepintvl 20
          tcp_ephemeral_high 65535
          tcp_ephemeral_low 42767
          udp_ephemeral_high 65535
          udp_ephemeral_low 42767
        Verify current settings:
          The no -a command can be used to verify the current settings.

        Resolution:
          /usr/sbin/no -p -o tcp_keepidle=80
          /usr/sbin/no -p -o tcp_keepintvl=20
          /usr/sbin/no -p -o tcp_ephemeral_high=65535
          /usr/sbin/no -p -o tcp_ephemeral_low=42767
          /usr/sbin/no -p -o udp_ephemeral_high=65535
          /usr/sbin/no -p -o udp_ephemeral_low=42767
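
        The comparison between current and preferred AIX values can be automated. The following is a minimal sketch, assuming that the no -a output lists one "name = value" pair per line. The current values in the sample are illustrative only and are not taken from a real server.

```python
PREFERRED = {
    "tcp_keepidle": 80,
    "tcp_keepintvl": 20,
    "tcp_ephemeral_high": 65535,
    "tcp_ephemeral_low": 42767,
    "udp_ephemeral_high": 65535,
    "udp_ephemeral_low": 42767,
}


def mismatches(no_a_output, preferred=PREFERRED):
    """Return {name: (current, preferred)} for every setting that differs."""
    current = {}
    for line in no_a_output.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            current[name.strip()] = int(value.strip())
    return {
        name: (current.get(name), want)
        for name, want in preferred.items()
        if current.get(name) != want
    }


sample = """
        tcp_ephemeral_high = 65535
         tcp_ephemeral_low = 32768
              tcp_keepidle = 14400
             tcp_keepintvl = 150
        udp_ephemeral_high = 65535
         udp_ephemeral_low = 32768
"""
for name, (have, want) in mismatches(sample).items():
    print(f"/usr/sbin/no -p -o {name}={want}   # current value: {have}")
```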

      HP-UX & HP Integrity
        Preferred settings:
          udp udp_smallest_anon_port 42767
          udp udp_largest_anon_port 65535
          tcp tcp_smallest_anon_port 42767
          tcp tcp_largest_anon_port 65535
          tcp tcp_time_wait_interval 30000

        Verify current settings:
          ndd -get /dev/udp udp_smallest_anon_port
          ndd -get /dev/udp udp_largest_anon_port
          ndd -get /dev/tcp tcp_smallest_anon_port
          ndd -get /dev/tcp tcp_largest_anon_port
          ndd -get /dev/tcp tcp_time_wait_interval

        Resolution:
          ndd -set /dev/udp udp_smallest_anon_port 42767
          ndd -set /dev/udp udp_largest_anon_port 65535
          ndd -set /dev/tcp tcp_smallest_anon_port 42767
          ndd -set /dev/tcp tcp_largest_anon_port 65535
          ndd -set /dev/tcp tcp_time_wait_interval 30000


      Solaris
        Preferred settings:
          udp udp_smallest_anon_port 42767
          udp udp_largest_anon_port 65535
          tcp tcp_smallest_anon_port 42767
          tcp tcp_largest_anon_port 65535
          tcp tcp_close_wait_interval 30000 (for Solaris 2.x only)
          tcp tcp_time_wait_interval 30000 (for Solaris 8 and above)

        Verify settings:
          ndd -get /dev/udp udp_smallest_anon_port
          ndd -get /dev/udp udp_largest_anon_port
          ndd -get /dev/tcp tcp_smallest_anon_port
          ndd -get /dev/tcp tcp_largest_anon_port
          ndd -get /dev/tcp tcp_close_wait_interval (for Solaris 2.x only)
          ndd -get /dev/tcp tcp_time_wait_interval (for Solaris 8 and above)

        Resolution:
          Make a backup copy of the /etc/rc2.d/S69inet file before you modify it. If the file does not exist, create it.
          As root user, make sure that you have write permission on the file by entering:
            chmod 754 /etc/rc2.d/S69inet
          Use your preferred text editor (such as vi) to modify the /etc/rc2.d/S69inet file.
          Add the following lines somewhere near the end of the file:
            ndd -set /dev/udp udp_smallest_anon_port 42767
            ndd -set /dev/udp udp_largest_anon_port 65535
            ndd -set /dev/tcp tcp_smallest_anon_port 42767
            ndd -set /dev/tcp tcp_largest_anon_port 65535
            ndd -set /dev/tcp tcp_close_wait_interval 30000 (for Solaris 2.x only)
            ndd -set /dev/tcp tcp_time_wait_interval 30000 (for Solaris 8 and above)
          Save your change and exit from the file.
          Restart the server or servers.

      Windows
        Use the Registry Editor (regedt32.exe) to make the modifications.

        MaxUserPort
          Description: Determines the highest port number TCP can assign when an application requests an available user port from the system. Typically, ephemeral ports (those ports that are used briefly) are allocated to port numbers 1024 - 5000.

          Note: Windows does not add this entry to the registry. You can add it by editing the registry or by using a program that edits the registry.
          Location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
          Data type: REG_DWORD
          Range: 5,000-65,534 (port number)
          Default Value: 5000
          Recommended value: 65534 (65534 DEC)

        TcpMaxConnectRetransmissions
          Description: Determines how many times TCP retransmits an unanswered request for a new connection. TCP retransmits new connection requests until they are answered or until this value expires.
          Location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
          Data type: REG_DWORD
          Range: 0–255 (retransmission attempts)
          Default Value: 2
          Recommended value: 5 (5 DEC)

        TcpTimedWaitDelay
          Description: Determines the time that must elapse before TCP/IP can release a closed connection and reuse its resources. This interval between closure and release is known as the TIME_WAIT state or twice the maximum segment lifetime (2MSL) state. During this time, reopening the connection to the client and server costs less than establishing a new connection. By reducing the value of this entry, TCP/IP can release closed connections faster and provide more resources for new connections. Adjust this parameter if the running application requires rapid release, the creation of new connections, or an adjustment because of a low throughput caused by multiple connections in the TIME_WAIT state.

          Location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

          Data type: REG_DWORD

          Default Value: 0xF0 (240 DEC)

          Recommended value: 0x1E (30 DEC)
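
          As an alternative to typing each value in the Registry Editor, the recommended decimal values can be written to a .reg file and imported. The following is a minimal sketch that only builds the file text. It assumes the standard "Windows Registry Editor Version 5.00" file format, in which REG_DWORD values are written as eight hexadecimal digits, so 30 decimal appears as 0000001e. The function name reg_file is hypothetical.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Recommended values from this section, in decimal.
RECOMMENDED = {
    "MaxUserPort": 65534,
    "TcpMaxConnectRetransmissions": 5,
    "TcpTimedWaitDelay": 30,
}


def reg_file(values, key=KEY):
    """Build the text of a .reg file that sets REG_DWORD values under key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    lines += [f'"{name}"=dword:{value:08x}' for name, value in values.items()]
    return "\r\n".join(lines) + "\r\n"


print(reg_file(RECOMMENDED))
```

          Export the existing Parameters key as a backup before importing any generated file.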

        FN_COR_QLEN
          Description: The 15,16,17 error indicates that a process is not able to connect to the COR_Listen process due to the unavailability of COR queue space.

          To resolve this issue, create an environment variable named FN_COR_QLEN.

          The default COR queue length is 5. The FN_COR_QLEN environment variable must be set for the user that starts the Image Services software. Set its value initially to 20 - 25.


Document Information

Modified date:
17 June 2018

UID

swg21634425