Topic
  • 4 replies
  • Latest Post - ‏2013-07-19T15:24:30Z by Dev_Dhoot
SystemAdmin
1485 Posts

Pinned topic XIO timeout on 1 million records while ORB works fine

‏2013-03-28T20:16:47Z |
Hi,
I receive a timeout error, shown in the logs below. When I reduce the number of records to be preloaded, it works fine. It is a client-based preload.
Another interesting observation: the same preload over ORB works, and I do not receive any error.
I am fairly sure this is related to using XIO as the transport.
I changed the XIO timeout from 30 seconds to 90 seconds on both the client and the server side. The timeout still occurs, and I still see a reference to a 30000 ms timeout. I am not sure which timeout this is or how it can be tuned.

Is there a way I can optimize the preload or increase the timeout so that all the records get preloaded?
################################LOGS##############################

Catalog Service Endpoints not specified. Starting an embedded server using end points: localhost:2809
3/28/13 15:23:51:088 EDT 00000001 WsLoggerConfi W com.ibm.ws.logging.WsLoggerConfigurator getExtensionPointLoggingConfiguration Unable to get extension point - com.ibm.wsspi.extension.logger-properties
3/28/13 15:23:50:992 EDT 00000001 RuntimeInfo I CWOBJ0903I: The internal version of WebSphere eXtreme Scale is v7.0.0 (8.6.0.0) http://cf11305.31182928.
3/28/13 15:23:51:434 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.block.reconnect.time" property is "30000".
3/28/13 15:23:51:434 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.min.successful.heartbeats" property is "10".
3/28/13 15:23:51:435 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.restart" property is "true".
3/28/13 15:23:51:436 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.restart.delay" property is "2000".
3/28/13 15:23:51:437 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.restart.parent.timeout" property is "180000".
3/28/13 15:23:51:437 EDT 00000001 WXSProperties I CWOBJ0054I: The value of the "com.ibm.websphere.objectgrid.container.reconnect.retry.forever" property is "false".
3/28/13 15:23:51:542 EDT 00000001 LocationServi I CWOBJ0204I: The transport type of this Client JVM is being determined by contacting the catalog service domain with the catalog service endpoints of: localhost:2809.
3/28/13 15:23:52:473 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "minXIONetworkThreads" property is "1".
3/28/13 15:23:52:474 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "maxXIONetworkThreads" property is "256".
3/28/13 15:23:52:475 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "xioTimeout" property is "30".
3/28/13 15:23:52:475 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "xioReadTimeout" property is "30".
3/28/13 15:23:52:476 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "xioWriteTimeout" property is "30".
3/28/13 15:23:52:476 EDT 00000001 XIOOutboundTr I CWOBJ0054I: The value of the "extendedCatalogServerTimeout" property is "180000".
3/28/13 15:23:52:519 EDT 00000001 XIORegistry I CWOBJ9054I: The eXtremeIO registry is using the endpoint ID 4000013db274cb90e0000dac569537b4.
3/28/13 15:23:52:526 EDT 00000001 XIOQueueManag I CWOBJ0054I: The value of the "minXIOWorkerThreads" property is "1".
3/28/13 15:23:52:527 EDT 00000001 XIOQueueManag I CWOBJ0054I: The value of the "maxXIOWorkerThreads" property is "256".
3/28/13 15:23:52:608 EDT 00000001 LocationServi I CWOBJ0200I: The transport type is eXtremeIO.
3/28/13 15:23:52:873 EDT 00000001 ObjectGridMan I CWOBJ2433I: Client-side ObjectGrid settings are going to be overridden for domain DefaultDomain using the URL file:/home/a039939/extremescaletrial860/ObjectGrid/samples/westest/build/classes/META-INF/objectgrid2.xml.
Client proerties are ClientPropertiesImpl{preferLocalJVM=true, preferLocalHost=true, preferZones=null, clientInfo=ClientInfo{xsver=48,features=http://XSSYSTEM],sysprop={java.specification.version=1.7, java.runtime.version=1.7.0_17-b02, java.version=1.7.0_17, os.version=2.6.32-279.1.1.el6.x86_64, os.name=Linux, os.arch=amd64},unmarkedCfgd=true}, bootStrapListShuffle=true
3/28/13 15:23:52:926 EDT 00000019 ClusterStore I CWOBJ1132I: An updated routing entry for domain:grid:epoch DefaultDomain:SAMPLE:1364496101308 was obtained from the catalog server.
3/28/13 15:23:52:934 EDT 00000019 LocationServi I CWOBJ2521I: The catalog server bootstrap addresses changed from localhost:2809 to rxpoc.testcom:2809.
3/28/13 15:23:53:165 EDT 00000001 ObjectGridImp I CWOBJ0059I: The transaction timeout value was not configured or was set to 0 for ObjectGrid SAMPLE. With this configuration, transactions never time out. The transaction timeout value is being set to 600 seconds.
3/28/13 15:23:53:194 EDT 00000001 XDFHelper I CWOBJ6306I: XDF has been enabled for map Ticket.
3/28/13 15:23:53:281 EDT 00000001 ClientDomainC I CWOBJ1126I: The ObjectGrid client connected to the SAMPLE grid in the DefaultDomain domain using connection 0.
Client proerties are ClientPropertiesImpl{preferLocalJVM=true, preferLocalHost=true, preferZones=null, clientInfo=ClientInfo{xsver=48,features=http://XSSYSTEM],sysprop={java.specification.version=1.7, java.runtime.version=1.7.0_17-b02, java.version=1.7.0_17, os.version=2.6.32-279.1.1.el6.x86_64, os.name=Linux, os.arch=amd64},unmarkedCfgd=true}, bootStrapListShuffle=true
com.ibm.ws.objectgrid.ObjectGridImpl@3dc264b1{name=SAMPLE, type=CLIENT, isOffheapEligible=true}

3/28/13 15:25:07:893 EDT 00000014 FutureImpl W CWOBJ7851W: Received a timeout while waiting for a response to a com.ibm.ws.xs.xio.protobuf.ContainerMessages$ReadWriteRequestMessage message from endpoint 10.115.209.90:42643. The current timeout is 30 seconds. When the message was added, the queue size was 1.
3/28/13 15:25:07:894 EDT 00000001 XIOClientCore W CWOBJ1130W: Communication with the partition with the domain:grid:mapSet:partitionId DefaultDomain:SAMPLE:Ticket:0 failed with an XIO 'com.ibm.ws.xsspi.xio.exception.MessageTimeOutException originating=10.115.209.90:0;causedby=10.115.209.90:42643;exid=0: com.ibm.ws.xs.xio.protobuf.ContainerMessages$ReadWriteRequestMessage await timeout after 30000 ms contacting 10.115.209.90:42643 queue size on insert=1, waiter {id: 68, index: 16}' exception communicating with the cont0_C-1 container server at rxpoc.test.com.
Exception in thread "main" com.ibm.websphere.objectgrid.TransactionException: rolling back transaction, see caused by exception
at com.ibm.ws.objectgrid.SessionImpl.rollbackPMapChanges(SessionImpl.java:2487)
at com.ibm.ws.objectgrid.SessionImpl.commit(SessionImpl.java:2087)
at com.demo.newloaderApp.myload(newloaderApp.java:192)
at com.demo.newloaderApp.main(newloaderApp.java:131)
Caused by: com.ibm.websphere.objectgrid.ClientServerTransactionCallbackException: Client Services - Exception during commit processing: com.ibm.websphere.objectgrid.TargetNotAvailableException: DefaultDomain:SAMPLE:Ticket:0 com.ibm.ws.xsspi.xio.exception.MessageTimeOutException originating=10.115.209.90:0;causedby=10.115.209.90:42643;exid=0: com.ibm.ws.xs.xio.protobuf.ContainerMessages$ReadWriteRequestMessage await timeout after 30000 ms contacting 10.115.209.90:42643 queue size on insert=1, waiter {id: 68, index: 16}
at com.ibm.ws.objectgrid.client.RemoteTransactionCallbackImpl.processReadWriteAsyncRequest(RemoteTransactionCallbackImpl.java:1604)
at com.ibm.ws.objectgrid.client.RemoteTransactionCallbackImpl.processReadWriteRequestAndResponse(RemoteTransactionCallbackImpl.java:1405)
at com.ibm.ws.objectgrid.client.RemoteTransactionCallbackImpl.commit(RemoteTransactionCallbackImpl.java:328)
at com.ibm.ws.objectgrid.SessionImpl.commit(SessionImpl.java:1992)
... 2 more
Caused by: com.ibm.websphere.objectgrid.TargetNotAvailableException: DefaultDomain:SAMPLE:Ticket:0 com.ibm.ws.xsspi.xio.exception.MessageTimeOutException originating=10.115.209.90:0;causedby=10.115.209.90:42643;exid=0: com.ibm.ws.xs.xio.protobuf.ContainerMessages$ReadWriteRequestMessage await timeout after 30000 ms contacting 10.115.209.90:42643 queue size on insert=1, waiter {id: 68, index: 16}
at com.ibm.ws.objectgrid.client.XIOClientCoreMessageHandler.sendMessage(XIOClientCoreMessageHandler.java:593)
at com.ibm.ws.objectgrid.client.CommonClientCoreMessageHandler.sendReadWriteRequest(CommonClientCoreMessageHandler.java:443)
at com.ibm.ws.objectgrid.client.RemoteTransactionCallbackImpl.processReadWriteAsyncRequest(RemoteTransactionCallbackImpl.java:1575)
... 5 more
Caused by: com.ibm.ws.xsspi.xio.exception.MessageTimeOutException originating=10.115.209.90:0;causedby=10.115.209.90:42643;exid=0: com.ibm.ws.xs.xio.protobuf.ContainerMessages$ReadWriteRequestMessage await timeout after 30000 ms contacting 10.115.209.90:42643 queue size on insert=1, waiter {id: 68, index: 16}
at com.ibm.ws.xs.xio.actor.impl.FutureImpl.run(FutureImpl.java:99)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Updated on 2013-04-03T17:26:12Z at 2013-04-03T17:26:12Z by SystemAdmin
  • lisaw
    101 Posts

    Re: XIO timeout on 1 million records while ORB works fine

    ‏2013-04-03T14:34:58Z  
    In your server.properties file, make sure you have xioTimeout=90 set. If that isn't getting picked up, make sure the server.properties file is actually loading...

    
    startXsServer.bat cs0 -serverProps server\config\server.properties
    


    or when you are starting your container add

    
    startXsServer.bat cs0 ... -xioTimeout 90
    


    to the command.

    Let me know if that works. Here are some more options: Tuning IBM eXtremeIO (XIO)
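
    One way to confirm which values a JVM actually picked up is to scan its log for CWOBJ0054I lines, like the ones in the client log earlier in this thread. The following is a minimal sketch using only the Java standard library; the class and method names (`XioLogSettings`, `effectiveProperties`) are made up for illustration and are not part of WebSphere eXtreme Scale.

    ```java
    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper, not a product API: extracts logged property values.
    public class XioLogSettings {
        // Matches lines such as:
        // CWOBJ0054I: The value of the "xioTimeout" property is "30".
        private static final Pattern PROPERTY_LINE = Pattern.compile(
                "CWOBJ0054I: The value of the \"([^\"]+)\" property is \"([^\"]*)\"");

        // Returns property name -> logged value; a later log line overrides an earlier one.
        public static Map<String, String> effectiveProperties(List<String> logLines) {
            Map<String, String> result = new LinkedHashMap<>();
            for (String line : logLines) {
                Matcher m = PROPERTY_LINE.matcher(line);
                if (m.find()) {
                    result.put(m.group(1), m.group(2));
                }
            }
            return result;
        }

        public static void main(String[] args) {
            // Two lines copied from the client log in the first post.
            List<String> sample = Arrays.asList(
                    "3/28/13 15:23:52:475 EDT 00000001 XIOOutboundTr I CWOBJ0054I: "
                            + "The value of the \"xioTimeout\" property is \"30\".",
                    "3/28/13 15:23:52:475 EDT 00000001 XIOOutboundTr I CWOBJ0054I: "
                            + "The value of the \"xioReadTimeout\" property is \"30\".");
            Map<String, String> props = effectiveProperties(sample);
            System.out.println("xioTimeout as logged: " + props.get("xioTimeout"));
        }
    }
    ```

    Run against the client log from the first post, this would report xioTimeout as 30, which suggests the 90-second setting had not reached that JVM at startup.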

    • Lisa

    Websphere eXtreme Scale Development
  • SystemAdmin
    1485 Posts

    Re: XIO timeout on 1 million records while ORB works fine

    ‏2013-04-03T17:26:12Z  
    I changed the XIO timeout to 90 seconds and also set the XIO read timeout and XIO write timeout to 90 seconds. When I started the catalog and container servers, I saw the new values reflected in the logs.

    I use a client-server topology. On my Eclipse console, I found that the client has its own value for the property, which defaulted to 30. I used the ClientClusterContext setClientProperties method to override the value to 90 seconds. In the logs I see the timeout set to 30 seconds when the client instance starts up, and then overridden to 90 seconds when I print the result of getClientProperties with System.out.println.

    So I expect the timeout to be 90 seconds, but I still get the same error. I doubt that this is the timeout value that is producing the error.

    According to the documentation, I expected XIO to perform better than ORB.

    Regards,
    Andrew
  • jhanders
    260 Posts

    Re: XIO timeout on 1 million records while ORB works fine

    ‏2013-06-10T11:07:47Z  
    The 30 seconds is most likely the request retry timeout. By default it is 30 seconds or the time remaining in the transaction timeout, whichever is smaller. If you want a request to be retried for longer, set the request retry timeout to a higher value in your client properties or on the Session.

    It appears, though, that you are running into an issue where the server is not responding within the required time. If you are doing a simple transaction, you may be running into a product defect. With more details about what your transaction is doing, we will be better able to tell what the problem might be.
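
    To make the "whichever is smaller" rule concrete, here is a minimal sketch of the arithmetic in plain Java. It only illustrates the rule as described in this post and is not the product's actual implementation; the class and method names (`RetryWindow`, `effectiveWaitMillis`) are hypothetical. The 600-second transaction timeout is the value reported by CWOBJ0059I in the client log.

    ```java
    // Hypothetical illustration, not a WebSphere eXtreme Scale API.
    public class RetryWindow {
        // Effective wait in milliseconds: the smaller of the request retry timeout
        // and the time left in the transaction, never below zero.
        public static long effectiveWaitMillis(long requestRetryTimeoutMs,
                                               long transactionTimeoutMs,
                                               long elapsedInTransactionMs) {
            long remaining = Math.max(0L, transactionTimeoutMs - elapsedInTransactionMs);
            return Math.min(requestRetryTimeoutMs, remaining);
        }

        public static void main(String[] args) {
            // Default 30 s retry timeout, 600 s transaction timeout, fresh transaction.
            System.out.println(effectiveWaitMillis(30_000L, 600_000L, 0L));       // 30000
            // Retry timeout raised to 90 s, same transaction timeout.
            System.out.println(effectiveWaitMillis(90_000L, 600_000L, 0L));       // 90000
            // Only 50 s left in the transaction: the remaining time wins.
            System.out.println(effectiveWaitMillis(90_000L, 600_000L, 550_000L)); // 50000
        }
    }
    ```

    Under this reading, the first case matches the "await timeout after 30000 ms" in the stack trace: with the default retry timeout, raising xioTimeout alone would not change the 30-second limit.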

    Jared Anderson

  • Dev_Dhoot
    42 Posts

    Re: XIO timeout on 1 million records while ORB works fine

    ‏2013-07-19T15:24:30Z  
    Hi,

    I am facing the same issue.

    My exception is below:

    ===

    Caused by: com.ibm.ws.xsspi.xio.exception.InvalidXIORefException
    com.ibm.ws.xsspi.xio.exception.InvalidXIORefException:Target has an endpoint id=4000013ff3a43a25e0000ee9569b01a9 but local id=4000013ff3df3435e00022b1569b01a9. Local process probably restarted.
      at com.ibm.ws.xsspi.xio.dispatch.MessageInfo.getMessage(MessageInfo.java:205)
      at com.ibm.ws.objectgrid.client.XIOClientCoreMessageHandler.sendMessage(XIOClientCoreMessageHandler.java:316)
      ... 280 more

    ===

    Did this get resolved?

    Can anyone let me know what can be done to solve the issue?

    --Devendra