Problem
Application server startup intermittently hangs, with the following hung threads reported in SystemOut.log:
[3/20/12 17:55:09:415 ICT] 00000018 ThreadMonitor W WSVR0605W: Thread "Default : 7" (00000021) has been active for 677593 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:167)
at com.ibm.mq.jmqi.remote.internal.system.ReentrantMutex.acquire(ReentrantMutex.java:86)
at com.ibm.mq.jmqi.remote.internal.RemoteHconn.requestDispatchLock(RemoteHconn.java:540)
at com.ibm.mq.jmqi.remote.internal.RemoteFAP.MQCTL(RemoteFAP.java:2104)
...
com.ibm.mq.connector.ResourceAdapterImpl.endpointActivation(ResourceAdapterImpl.java:463)
at com.ibm.ejs.j2c.ActivationSpecWrapperImpl.activateUnderRAClassLoaderContext(ActivationSpecWrapperImpl.java:631)
at com.ibm.ejs.j2c.ActivationSpecWrapperImpl.activateEndpoint(ActivationSpecWrapperImpl.java:338)
at com.ibm.ejs.j2c.RAWrapperImpl.activateEndpoint(RAWrapperImpl.java:1084)
at com.ibm.ejs.j2c.RALifeCycleManagerImpl.activateEndpoint(RALifeCycleManagerImpl.java:1717)
at com.ibm.ejs.container.MessageEndpointFactoryImpl.activateEndpoint(MessageEndpointFactoryImpl.java:280)
...
[3/20/12 17:55:09:426 ICT] 00000018 ThreadMonitor W WSVR0605W: Thread "WMQJCAResourceAdapter : 18" (000000dc) has been active for 650455 milliseconds and may be hung. There is/are 2 thread(s) in total in the server that may be hung.
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:167)
at com.ibm.ejs.container.MessageEndpointFactoryImpl.createEndpoint(MessageEndpointFactoryImpl.java:498)
at com.ibm.mq.connector.inbound.WorkImpl.run(WorkImpl.java:192)
at com.ibm.ejs.j2c.work.WorkProxy.run(WorkProxy.java:399)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1604)
...
Fighting the error
My first reaction to the error was to identify the changes I had made. I found that I had created many resources (WebSphere MQ Queue Connection Factories, WebSphere MQ Queues, WebSphere MQ Activation Specifications, DataSources, URLs). Oh, I had introduced a great many changes: several hundred resources had been created.
I was not sure what to do next, but the error seemed to be related to WebSphere MQ.
However, the WebSphere MQ queue managers to which our applications connect were still working fine, so I began to doubt whether it really was a WebSphere MQ problem.
Searching Google for keywords such as ReentrantMutex, I found a useful document (APAR IZ68236).
The APAR is a perfect match for my problem, but I found some of its explanations quite confusing.
I opened a PMR (84727,000,856) and it was very well taken care of. I would like to use this opportunity to express my gratitude to the IBM support team (Jack M. White, Angel Rivera, Yana Johnson and Ravi S. Sinha) for their excellent support.
What do I find confusing?
- Where should I fix the problem?
- What is the impact of setting connectionConcurrency to 1?
- Should we change the property at cell scope?
Where should I fix the problem?
Do we have to know where the problem should be fixed? Is it a WebSphere MQ problem or a WebSphere Application Server problem? Certainly, yes.
Technically speaking, the defect is in neither WebSphere MQ nor WebSphere Application Server.
Strictly speaking, this is a problem in the WebSphere MQ Resource Adapter, which is developed by the WebSphere MQ team.
The document mentions the SHARECNV channel property on the WebSphere MQ side. However, updating it will not help resolve this problem in any way.
To solve this problem, we have to update the built-in WebSphere MQ Resource Adapter, which is located within WebSphere Application Server.
What is the impact of setting connectionConcurrency to 1?
The connectionConcurrency property controls how many Activation Specifications can share a single JMS connection to WebSphere MQ. The default value of the property is 5. If we have 17 Activation Specifications defined, there will be 4 JMS connections (WebSphere MQ HConns) from the application server to WebSphere MQ for the Activation Specifications, that is, 17 divided by 5 and rounded up.
When connectionConcurrency is set to 1, each Activation Specification creates its own JMS connection to WebSphere MQ. If we define 17 Activation Specifications, there will be 17 connections (WebSphere MQ HConns) to the queue manager, one for each Activation Specification.
Apart from increasing the number of connections from the application server to WebSphere MQ, setting connectionConcurrency to 1 has no other side effect on the system. The performance of the application server and the WebSphere MQ queue manager should remain the same.
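To make the arithmetic above concrete, here is a minimal sketch in plain Python. The function name connections_needed is my own and is not part of WebSphere; it only assumes, as described above, that each connection serves at most connectionConcurrency Activation Specifications.

```python
def connections_needed(activation_specs, connection_concurrency=5):
    """Estimate how many JMS connections a set of Activation Specifications uses.

    Assumes each connection is shared by at most connection_concurrency
    Activation Specifications (the default of 5 is the value quoted above).
    """
    if activation_specs < 0 or connection_concurrency < 1:
        raise ValueError('need activation_specs >= 0 and connection_concurrency >= 1')
    # Ceiling division using integers only.
    return (activation_specs + connection_concurrency - 1) // connection_concurrency


if __name__ == '__main__':
    for concurrency in (5, 1):
        print('connectionConcurrency=%d -> %d connection(s)'
              % (concurrency, connections_needed(17, concurrency)))
```

With 17 Activation Specifications this gives 4 connections at the default setting and 17 connections when connectionConcurrency is 1, matching the numbers in the text.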
Updating WMQ Resource Adapter at cell scope?
IZ68236 explains how to update the connectionConcurrency custom property of the WebSphere MQ Resource Adapter. However, the document is not entirely correct. As of now (June 2, 2012), it tells us to modify the property at the "cell" scope.
Modifying a Resource Adapter for an Activation Specification is very similar to modifying a JDBC Provider for a DataSource. If you want to create a datasource with cluster visibility, you first need to define a JDBC Provider at that cluster. The same rule applies to the resource adapter: we need to modify the Resource Adapter at the same scope as the problematic Activation Specification.
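As a rough illustration of "same scope", the sketch below picks out the resource adapters whose configuration ID carries the same scope path as an Activation Specification. It assumes the usual wsadmin configuration ID layout, name(scope path|file#id). The helper names scope_of and adapters_at_same_scope are hypothetical, and the IDs in the usage example are made up.

```python
def scope_of(config_id):
    """Return the scope path embedded in a wsadmin configuration ID.

    Assumes the layout name(scope path|file#id), for example
    name(cells/c1/nodes/n1|resources.xml#J2CResourceAdapter_1).
    """
    start = config_id.rfind('(')
    end = config_id.find('|', start)
    if start < 0 or end < 0:
        raise ValueError('not a configuration ID: %r' % (config_id,))
    return config_id[start + 1:end]


def adapters_at_same_scope(spec_id, adapter_ids):
    """Keep only the adapters defined at the scope of the given Activation Specification."""
    target = scope_of(spec_id)
    return [ra for ra in adapter_ids if scope_of(ra) == target]


if __name__ == '__main__':
    # Made-up IDs, for illustration only.
    spec = 'OrderSpec(cells/c1/clusters/cl1|resources.xml#J2CActivationSpec_1)'
    adapters = [
        '"WebSphere MQ Resource Adapter(cells/c1|resources.xml#J2CResourceAdapter_1)"',
        '"WebSphere MQ Resource Adapter(cells/c1/clusters/cl1|resources.xml#J2CResourceAdapter_2)"',
    ]
    print(adapters_at_same_scope(spec, adapters))
```

In this made-up example only the second adapter, the one at the cluster scope, is selected. Changing the cell-scoped adapter alone would not touch it.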
Given that there is practically no impact, as described in the impact section above, the best policy for me is to modify connectionConcurrency at all possible scopes. That way there is nothing to worry about when we define a new activation spec: we don't have to check whether the connectionConcurrency of its corresponding resource adapter has already been updated.
If our cell topology is small, updating the property is a simple task. When our cell is big and contains many nodes, clusters and servers, updating the property at all scopes can become a nightmare.
Fortunately, WebSphere supports a command-line interface called wsadmin. We can list all available resource adapters and apply the change to all of them.
To do that:
First, we define a function (in JMS.py) to modify the connectionConcurrency property of the resource adapter.
...
def modifyConnectionConcurrencyOnWMQRA(raConfigId,concurrent):
    # Collect the custom properties defined on this resource adapter.
    propSet = AdminConfig.showAttribute(raConfigId,'propertySet')
    propList = AdminConfig.list('J2EEResourceProperty', propSet).splitlines()
    # Keep only the connectionConcurrency property.
    concurrentProps = [ prop for prop in propList if AdminConfig.showAttribute(prop,'name')=='connectionConcurrency']
    if not concurrentProps:
        print "Skipping %s : connectionConcurrency not found" % raConfigId
        return
    print "Modifying %s : %s => %s" % (raConfigId,AdminConfig.showAttribute(concurrentProps[0],'value'),concurrent)
    AdminConfig.modify(concurrentProps[0],[['value',concurrent]])
Then, we list all the resource adapters and set connectionConcurrency to 1 on each of them.
import JMS,WasTools
WasTools.setAdminRefs((AdminConfig, AdminControl, AdminTask, AdminApp))
wMQRAs=AdminConfig.list('J2CResourceAdapter','WebSphere MQ Resource Adapter*').splitlines()
for ra in wMQRAs:
    JMS.modifyConnectionConcurrencyOnWMQRA(ra,'1')
print 'Saving changes ...'
AdminConfig.save()
print 'Synchronizing changes to all active nodes'
AdminNodeManagement.syncActiveNodes()
We can now be confident that all the resource adapters are updated. We only have to run this script again when the topology changes.
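To double-check the result, something like the following sketch can read the values back. The function names are my own. Passing the AdminConfig object in as a parameter is simply a choice I made so the functions can live in an imported module; the AdminConfig calls are the same ones used in the scripts above.

```python
def read_connection_concurrency(admin_config, ra_config_id):
    """Return the connectionConcurrency value of one resource adapter, or None if absent."""
    prop_set = admin_config.showAttribute(ra_config_id, 'propertySet')
    for prop in admin_config.list('J2EEResourceProperty', prop_set).splitlines():
        if admin_config.showAttribute(prop, 'name') == 'connectionConcurrency':
            return admin_config.showAttribute(prop, 'value')
    return None


def adapters_not_set_to(admin_config, expected):
    """List the WebSphere MQ resource adapters whose value differs from expected."""
    adapters = admin_config.list('J2CResourceAdapter',
                                 'WebSphere MQ Resource Adapter*').splitlines()
    return [ra for ra in adapters
            if read_connection_concurrency(admin_config, ra) != expected]
```

Inside wsadmin, calling adapters_not_set_to(AdminConfig, '1') after the update should return an empty list if every adapter was changed. Adapters that do not define the property at all are reported as well, because their value reads back as None.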