Discussion:
AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server
Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-02 16:40:49 UTC
Dear Folks,
I am seeing a weird issue here and would appreciate some advice on what could be causing it.
My MQ server (7.0.1.5), running on SuSE Linux, occasionally writes the following error to its error log (AMQERR01.LOG, shown below) for one of the qmgrs. When this occurs, the applications accessing that qmgr fail with a connection broken (2009) error (app log shown below). The timestamps in the MQ error log and the application log match, as shown below.
However, when the app tries to reconnect, the connection goes through and everything is back to normal.

This issue occurs once a week and we couldn't figure out a pattern or reason for it. The SVRCONN channel is allowed about 150 instances at most, and the application peaks at 85-90 channel instances. The application runs on a WebLogic server that connects to MQ via a JMS MQ bindings file. There is no MQ client s/w on the weblogic server.

I would appreciate it if the gurus could offer some insight into what the problem could be. Thanks.

MQ Log from AMQERR01.LOG file
-------------------------------------------------------------------------------
09/29/2013 08:35:24 AM - Process(15545.83) User(mqm) Program(amqrmppa)
Host(MQSERVER)
AMQ9209: Connection to host 'APPSERVER (XXX.XX.XX.XX)' closed.

EXPLANATION:
An error occurred receiving data from 'APPSERVER (XXX.XX.XX.XX)' over
TCP/IP. The connection to the remote host has unexpectedly terminated.
ACTION:
Tell the systems administrator.
----- amqccita.c : 3472 -------------------------------------------------------
09/29/2013 08:35:24 AM - Process(15545.83) User(mqm) Program(amqrmppa)
Host(MQSERVER)
AMQ9999: Channel program ended abnormally.

EXPLANATION:
Channel program 'ORBP2P.ORBPRD00' ended abnormally.
ACTION:
Look at previous error messages for channel program 'ORBP2P.ORBPRD00' in the
error files to determine the cause of the failure.


Application log from Weblogic Server:

2013-09-29 08:35:24,610 [ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)' ERROR com.fritolay.suppliernet.common.security.SupplierNetUserDetailsService - JMSException Occured While Pushing Message to MQ
com.ibm.msg.client.jms.DetailedJMSException: JMSWMQ0018: Failed to connect to queue manager 'ORBPRD00' with connection mode 'Client' and host name 'MQSERVER (1414)'. Check the queue manager is started and if running in client mode, check there is a listener running. Please see the linked exception for more information.
.......
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
Caused by: com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED') reason '2009' ('MQRC_CONNECTION_BROKEN').
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:223)
... 80 more
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2009;AMQ9204: Connection to host 'MQSERVER (1414)' rejected. [1=com.ibm.mq.jmqi.JmqiException[CC=2;RC=2009;AMQ9213: A communications error for occurred.],3= MQSERVER (1414),5=RemoteConnection.receiveAsyncTsh]
at com.ibm.mq.jmqi.remote.internal.RemoteFAP.jmqiConnect(RemoteFAP.java:2019)
at com.ibm.mq.jmqi.remote.internal.RemoteFAP.jmqiConnect(RemoteFAP.java:1233)
at com.ibm.msg.client.wmq.internal.WMQConnection.<init>(WMQConnection.java:366)
... 79 more
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2009;AMQ9213: A communications error for occurred.
at com.ibm.mq.jmqi.remote.internal.system.RemoteConnection.receiveAsyncTsh(RemoteConnection.java:3199)



Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


To unsubscribe, write to LISTSERV-0lvw86wZMd9k/bWDasg6f+***@public.gmane.org and,
in the message body (not the subject), write: SIGNOFF MQSERIES
Instructions for managing your mailing list subscription are provided in
the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://listserv.meduniwien.ac.at/archives/mqser-l.html
Roger Lacroix
2013-10-02 18:12:01 UTC
Hello,

> There is no MQ client s/w on the weblogic server.

So that means the application team or the application
deployers stuffed the MQ JAR files into the
WebLogic lib directory. Get someone to look at
the manifest of the com.ibm.mq.jar file and see
what version of MQ the client side is
using. Also, make sure that the MQ JAR
files are all at the same version - no mixing and matching!!
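
A minimal sketch of that manifest check, using only the standard java.util.jar API. The class name MqJarVersions, the "com.ibm.mq" file-name filter and the directory argument are assumptions for illustration, not anything shipped with MQ. It reads Implementation-Version from each matching JAR and reports whether the values agree.

```java
import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Sketch only: hypothetical helper, not part of WebSphere MQ.
class MqJarVersions {

    // Returns JAR file name -> Implementation-Version ("unknown" if absent).
    // Assumption: the MQ JARs can be recognised by a "com.ibm.mq" name prefix.
    static Map<String, String> readVersions(File libDir) throws IOException {
        Map<String, String> versions = new TreeMap<>();
        File[] jars = libDir.listFiles(
                (dir, name) -> name.startsWith("com.ibm.mq") && name.endsWith(".jar"));
        if (jars == null) {
            return versions; // not a directory, or not readable
        }
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                Manifest mf = jf.getManifest();
                String v = (mf == null) ? null
                        : mf.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
                versions.put(jar.getName(), v == null ? "unknown" : v);
            }
        }
        return versions;
    }

    // True when every JAR reports the same version string.
    static boolean allSame(Map<String, String> versions) {
        return new TreeSet<>(versions.values()).size() <= 1;
    }

    public static void main(String[] args) throws IOException {
        // Pass the lib directory as the first argument; defaults to the current directory.
        Map<String, String> versions = readVersions(new File(args.length > 0 ? args[0] : "."));
        versions.forEach((jar, v) -> System.out.println(jar + " -> " + v));
        System.out.println(allSame(versions) ? "All versions match" : "MIXED VERSIONS");
    }
}
```

Run it against the WebLogic domain lib directory. Any JAR that prints a different version string, or "unknown", is worth a closer look.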

I have discovered lots of weird issues with MQ
JAR files for v7.0.0.* and v7.0.1.* (mainly
related to exits). You probably just need to use
the latest/greatest MQ JAR files.

Regards,
Roger Lacroix
Capitalware Inc.

Robert Biedinger
2013-10-02 17:38:06 UTC
Do you have any FDC files on MQSERVER or on Weblogic? You may not have the
full MQ client installed on Weblogic, but the MQ Java classes do exist,
otherwise there would not be references in the Weblogic logs to com.ibm.*
classes.


Tim Zielke
2013-10-02 19:29:59 UTC
Hi Mir,

Here is something that could help your situation. Assuming you are using a version 7 client, I believe JMS supports Automatic Client Reconnection -> http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/topic/com.ibm.mq.csqzaf.doc/cs70190_.htm

Based on the behavior below, I am assuming your application is not using Automatic Client Reconnection. If it were, MQ would probably have reconnected to the queue manager under the covers, and you would never have been the wiser about this apparent communications error.
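
Where automatic reconnection is not available, the reconnect-and-retry that the application already seems to do can be made explicit. The sketch below is a generic, application-level retry wrapper and is not MQ's Automatic Client Reconnection. The names RetryOnBrokenConnection and withRetry are hypothetical, and a real JMS caller would catch only JMSException and rebuild its connection and session inside the action before retrying.

```java
import java.util.concurrent.Callable;

// Sketch only: application-level retry, not MQ automatic client reconnection.
class RetryOnBrokenConnection {

    // Runs the action up to maxAttempts times, sleeping delayMillis between
    // failed attempts. Rethrows the last failure if every attempt fails.
    static <T> T withRetry(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // The action is assumed to open a fresh connection each time it runs.
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

A small attempt count and a short delay are usually enough for a one-off socket drop. Retrying forever would only hide a real outage.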

This is an educated guess, but the error you are hitting looks like an underlying TCP issue to me. The method receiveAsyncTsh in the stack trace sounds like some logic to handle asynchronous socket I/O. I know we have seen security scans sometimes cause channels to fail when the scans get their destructive hands on the ports that MQ is using. Again, just a guess.

Tim Zielke
CICS/MQ Systems Programmer
Aon

Mir
2013-10-02 20:02:43 UTC
Thanks for the reply, Robert/Roger.
Robert -
There are no FDC files on MQ server or weblogic server. As you said, there
is no MQ client, but MQ Java classes do exist in the application domain lib
directory.

Roger -
I just looked up the manifest and found that all the JARs are at the same version, 7.0.1.5.

Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.0
Created-By: 1.4.2 (IBM Corporation)
Copyright-Notice: Licensed Materials - Property of IBM 5724-H72,
5655-R36, 5724-L26, 5655-L82 (C) Copyright IBM Corp. 2008 A
ll Rights Reserved. US Government Users Restricted Rights -
Use,duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Sealed: false
Specification-Title: WebSphere MQ classes for Java
Specification-Version: 7.0.1.5
Specification-Vendor: IBM Corporation
Implementation-Title: WebSphere MQ classes for Java
Implementation-Version: 7.0.1.5 - k701-105-110419
Implementation-Vendor: IBM Corporation
Class-Path: com.ibm.mq.headers.jar com.ibm.mq.pcf.jar com.ibm.mq.jmqi.
jar connector.jar com.ibm.mq.commonservices.jar

Regards,
Mir
Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-02 20:29:32 UTC
There were some formatting issues with my earlier reply, so I am reposting it.

Thanks for the reply, Robert/Roger/Tim.

Robert - There are no FDC files on the MQ server or the weblogic server. As you said, there is no MQ client, but the MQ Java classes do exist in the application domain lib directory.

Roger - I just looked up the manifest and found that all the JARs are at the same version. Manifest details are given below.

Tim - Thanks for the link. As per the manifest info below, we are using the WebSphere MQ classes for Java, and the documentation at that link says that automatic client reconnection is not supported with the WebSphere MQ classes for Java.
Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.0
Created-By: 1.4.2 (IBM Corporation)
Copyright-Notice: Licensed Materials - Property of IBM 5724-H72,
5655-R36, 5724-L26, 5655-L82 (C) Copyright IBM Corp. 2008 A
ll Rights Reserved. US Government Users Restricted Rights -
Use,duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Sealed: false
Specification-Title: WebSphere MQ classes for Java
Specification-Version: 7.0.1.5
Specification-Vendor: IBM Corporation
Implementation-Title: WebSphere MQ classes for Java
Implementation-Version: 7.0.1.5 - k701-105-110419
Implementation-Vendor: IBM Corporation
Class-Path: com.ibm.mq.headers.jar com.ibm.mq.pcf.jar com.ibm.mq.jmqi.
jar connector.jar com.ibm.mq.commonservices.jar

Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


Paul Clarke
2013-10-02 17:38:18 UTC
I've seen things like this where a firewall has terminated the socket. Some firewalls can be configured to terminate a socket if it has been around for a long time or has spent a long time idle. Have these clients been connected a long time, or are they freshly connected? That would be a good thing to find out, to determine whether the failure is related to the socket state or to the environment state, e.g. time. For example, does the failure always happen around the same time? Or on the same day? Or when the cleaner comes to hoover the machine room?
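
A rough way to check for a time pattern like this is to tally the AMQ9209 entries in AMQERR01.LOG by weekday and hour. The sketch below assumes the entry layout shown in the original post: a header line starting with an MM/dd/yyyy hh:mm:ss AM/PM timestamp, followed a line or two later by the message ID. The class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: counts AMQ9209 entries per weekday and hour.
class Amq9209Pattern {

    // Header line of an error-log entry, e.g. "09/29/2013 08:35:24 AM - Process(...".
    private static final Pattern HEADER =
            Pattern.compile("^(\\d{2}/\\d{2}/\\d{4} \\d{2}:\\d{2}:\\d{2} [AP]M) - Process\\(");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy hh:mm:ss a", Locale.ENGLISH);

    // Returns a map of "WEEKDAY HH:00" -> number of AMQ9209 entries in that slot.
    static Map<String, Integer> countByDayAndHour(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        LocalDateTime current = null; // timestamp of the entry being read
        for (String raw : logLines) {
            String line = raw.trim();
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                current = LocalDateTime.parse(m.group(1), FORMAT);
            } else if (current != null && line.startsWith("AMQ9209:")) {
                String key = current.getDayOfWeek() + " "
                        + String.format("%02d:00", current.getHour());
                counts.merge(key, 1, Integer::sum);
                current = null; // count each entry once
            }
        }
        return counts;
    }
}
```

Feed it the lines of every AMQERR0x.LOG still on disk. If the counts cluster in one slot, that points at something scheduled, such as a scan, a backup or a firewall policy, rather than at MQ itself.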

Cheers,
Paul

Paul Clarke
www.mqgem.com


To unsubscribe, write to LISTSERV-0lvw86wZMd9k/bWDasg6f+***@public.gmane.org and,
in the message body (not the subject), write: SIGNOFF MQSERIES
Instructions for managing your mailing list subscription are provided in
the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://listserv.meduniwien.ac.at/archives/mqser-l.html
Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-03 17:44:10 UTC
Permalink
Thanks for the reply Paul.
No, it doesn’t fail on the same day or at the same time. As I said earlier, I couldn’t find a failure pattern based on date, time, load, volume, or environment. This is all internal to the network, so I am not sure how a firewall would even play a role here, but you might have a point.

This issue is kind of escalating here. Looks like I need to open a PMR.

Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


Roger Lacroix
2013-10-03 18:50:42 UTC
Permalink
Hello Mir,

> This issue is kind of escalating here. Looks like I need to open a PMR.

Before you open a PMR, I strongly suggest that you update the MQ v7.0.1.5 JAR files the application is using to either v7.0.1.10 or v7.5.0.2.

Regards,
Roger Lacroix
Capitalware Inc.

Tim Zielke
2013-10-03 20:29:19 UTC
Permalink
Hi Mir,

You said originally . . .

"When this occurs, the applications accessing the particular qmgr on the MQ server fail and throw a connection broken 2009 error message (App log shown below). The timestamps in the Mq error log and the Application log match as highlighted shown below.
However, when the app tries to reconnect, the connection goes thru and everything is back to normal."

Does this mean that when the application gets the 2009 error message it crashes? And when it is restarted it connects successfully?

If this is the case, we advise our programmers who write client applications either to use Automatic Client Reconnection (if the language supports it) or to write reconnect logic that handles a 2009 (MQRC_CONNECTION_BROKEN). If the client application is not handling a 2009 properly, I would push back that this is an application issue that the application team needs to correct. In my opinion, client programs should treat a 2009 as an inevitable part of working over a network and be robust enough to handle it. For completeness, applications that bind locally should account for and handle a 2009, too. At least, this is the guidance we give our MQ programmers.

Thanks,
Tim
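
The reconnect logic described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the application's code and not the IBM MQ API: `MqFailure`, `Link`, `LinkFactory` and `ReconnectingSender` are hypothetical stand-ins so that the example runs without the MQ JAR files. In the real application the reason code would come from the linked `com.ibm.mq.MQException` visible in the stack trace, and the retry count and pause are arbitrary values chosen for the demo.

```java
import java.util.ArrayList;
import java.util.List;

class ReconnectSketch {
    /** Reason code for MQRC_CONNECTION_BROKEN, as shown in the application log. */
    static final int MQRC_CONNECTION_BROKEN = 2009;

    /** Hypothetical stand-in for an MQ exception that carries a reason code. */
    static class MqFailure extends Exception {
        final int reasonCode;
        MqFailure(int reasonCode) {
            super("MQ call failed with reason " + reasonCode);
            this.reasonCode = reasonCode;
        }
    }

    /** Hypothetical connection: anything that can send one message. */
    interface Link { void send(String message) throws MqFailure; }

    /** Hypothetical factory that opens a fresh connection. */
    interface LinkFactory { Link connect() throws MqFailure; }

    static class ReconnectingSender {
        private final LinkFactory factory;
        private final int maxAttempts;
        private final long pauseMillis;
        private Link link; // null means "not connected"

        ReconnectingSender(LinkFactory factory, int maxAttempts, long pauseMillis) {
            if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
            this.factory = factory;
            this.maxAttempts = maxAttempts;
            this.pauseMillis = pauseMillis;
        }

        void send(String message) throws MqFailure, InterruptedException {
            MqFailure last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    if (link == null) link = factory.connect();
                    link.send(message);
                    return;
                } catch (MqFailure e) {
                    // Only a broken connection is retried; anything else is rethrown.
                    if (e.reasonCode != MQRC_CONNECTION_BROKEN) throw e;
                    last = e;
                    link = null; // discard the dead connection so the next attempt reconnects
                    if (attempt < maxAttempts) Thread.sleep(pauseMillis);
                }
            }
            throw last; // every attempt hit a broken connection
        }
    }

    /** Simulates one dead connection followed by a healthy one; returns {connects, delivered}. */
    static int[] demo() {
        int[] connects = {0};
        List<String> delivered = new ArrayList<>();
        LinkFactory factory = () -> {
            int n = ++connects[0];
            return message -> {
                if (n == 1) throw new MqFailure(MQRC_CONNECTION_BROKEN); // first socket is dead
                delivered.add(message);
            };
        };
        ReconnectingSender sender = new ReconnectingSender(factory, 3, 10);
        try {
            sender.send("hello");
        } catch (MqFailure | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] {connects[0], delivered.size()};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("connects=" + r[0] + " delivered=" + r[1]);
    }
}
```

The point of the design is that a 2009 invalidates the connection object itself, so the sender drops it and opens a new one rather than retrying on the same handle. Whether a message may be resent safely after a 2009 depends on the application's transaction handling, which this sketch does not address.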


Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-03 20:39:49 UTC
Permalink
Thanks for the suggestion, Roger. However, we don’t have v7.0.1.10 or v7.5 anywhere in our shop. We are currently running 7.0.1.5 on all servers and clients.

Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-03 20:46:21 UTC
Permalink
Hi Tim,
The answer to your question is “Yes”.

I am with you regarding handling the 2009 connection error in application logic (something I used to do myself as a best practice when I was a programmer), and I did make that case and push it back to the app team.

However, their argument is that everything was working perfectly fine and this issue started all of a sudden, albeit occurring only once in a while. They claim that nothing has changed in the application or in transaction volume. My point is that nothing has changed at the MQ infrastructure level either. So we are in a deadlock now.


Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS
MQ | Substation | TIBCO File Adapter | DRF/OS2 | QPasa Support
Work: 972-963-1622 | Cell: 469-237-0563

From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Tim Zielke
Sent: Thursday, October 03, 2013 3:29 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Hi Mir,

You said originally . . .

"When this occurs, the applications accessing the particular qmgr on the MQ server fail and throw a connection broken 2009 error message (App log shown below). The timestamps in the Mq error log and the Application log match as highlighted shown below.
However, when the app tries to reconnect, the connection goes thru and everything is back to normal."

Does this mean that when the application gets the 2009 error message it crashes? And when it is restarted it connects successfully?

If this is the case, we advise our programmers who write client applications to either leverage Automatic Client Reconnection (if the language supports it) or write reconnect logic to handle a 2009 (MQRC_CONNECTION_BROKEN). If this is an issue where the client application is not properly handling a 2009, I would push back that this is an application issue that needs to be corrected by the application team. In my opinion, client programs should assume that receiving a 2009 is an inevitable part of interacting with a network and be robust enough to handle 2009 errors. To be complete, applications that bind locally should account for and handle a 2009, too. At least, this is the guidance we give our MQ programmers.
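As a rough illustration of the "write reconnect logic" option, here is a minimal sketch in Python. It is not MQ client code: `MQCallError`, `connect`, and `operation` are hypothetical stand-ins for whatever exception type and calls the real client library provides, and the only value taken from this thread is the reason code 2009. Whether retrying a given operation is safe (for example, a put that may already have reached the queue) is something the application has to decide.

```python
import time

MQRC_CONNECTION_BROKEN = 2009  # reason code quoted in this thread


class MQCallError(Exception):
    """Hypothetical stand-in for a client-library exception carrying an MQ reason code."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed with reason {reason}")
        self.reason = reason


def call_with_reconnect(connect, operation, max_attempts=3,
                        delay_seconds=1.0, sleep=time.sleep):
    """Run operation(connection); on reason 2009, drop the connection and retry.

    connect and operation are caller-supplied callables (assumptions, not a real API).
    Any other reason code, or a 2009 on the final attempt, is re-raised.
    """
    connection = None
    for attempt in range(1, max_attempts + 1):
        try:
            if connection is None:
                connection = connect()  # connecting can itself fail with 2009
            return operation(connection)
        except MQCallError as err:
            if err.reason != MQRC_CONNECTION_BROKEN or attempt == max_attempts:
                raise
            connection = None           # the broken connection is not reused
            sleep(delay_seconds)        # brief pause before reconnecting
```

The `sleep` parameter exists only so the pause can be replaced in tests; a real application would more likely log each failure and use a longer or increasing delay.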

Thanks,
Tim


From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Pasha, Mir Musthafa Ali - Contractor {BIS}
Sent: Thursday, October 03, 2013 12:44 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT>
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Thanks for the reply Paul.
No, it doesn’t fail on the same day or at the same time. As I said earlier, I couldn’t find a failure pattern based on date, time, load, volume, environment issues, etc. Also, this is all internal to the network, so I am not sure how a firewall would even play a role here, but you might have a point.

This issue is kind of escalating here. Looks like I need to open a PMR.

Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Paul Clarke
Sent: Wednesday, October 02, 2013 12:38 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT>
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

I’ve seen things like this where the firewall has terminated the socket. Some firewalls can be configured to terminate a socket if it has been around for a long time or has spent a long time not doing anything. Have these clients been connected a long time, or are they freshly connected? That might be a good thing to find out, to determine whether the failure is related to the socket state or the environment state, e.g. time. For example, does the failure always happen around the same time? Or on the same day? When the cleaner comes to hoover the machine room? [Smile]
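One way to check for a time-of-day or day-of-week pattern is to tally the AMQ9209 entries in the queue manager error logs. The sketch below is only a sketch: it assumes the entry layout shown in the AMQERR01.LOG excerpt in this thread (a timestamp header line, then the message line), and the function name is made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Header line format as it appears in the AMQERR01.LOG excerpt in this thread.
HEADER = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - Process\(")


def tally_amq9209(log_text):
    """Return (by_weekday, by_hour) Counters for AMQ9209 entries in log_text."""
    by_weekday, by_hour = Counter(), Counter()
    current = None  # timestamp of the entry currently being read
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
        elif line.lstrip().startswith("AMQ9209:") and current is not None:
            by_weekday[current.strftime("%A")] += 1
            by_hour[current.hour] += 1
            current = None  # count each entry once
    return by_weekday, by_hour
```

Feeding it the concatenated contents of AMQERR01.LOG through AMQERR03.LOG over a few weeks would show whether the failures cluster at particular hours or days, which is what an idle-timeout or scheduled-job explanation would predict.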

Cheers,
Paul

Paul Clarke
www.mqgem.com<http://www.mqgem.com>

Tim Zielke
2013-10-03 21:05:04 UTC
Permalink
It is frustrating that they are making that argument. They are focused only on this specific issue and are missing the bigger picture. What happens if you find and correct this presumed network issue, and then another one crops up 6 months later? They will be back to square one. The bottom line is that their MQ client application has stability issues because it does not handle a 2009, and it will continue to have them until they correct this. Hopefully, hammering this home to the developers or their managers will help them get past the tunnel vision.

Regardless, if you want to investigate the current issue, that will be tough. If I had to do it, I would try to capture a TCP trace (I'm assuming your transport type is TCP) on both the queue manager server side and the client application side at the time the issue occurs. Of course, this needs to be managed carefully, considering how large TCP traces can get and how infrequently the issue happens. But it can be done with scripting on Unix/Linux. Perhaps a Linux administrator could help with this.
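To give an idea of what such scripting might look like, here is a small Python sketch that builds a tcpdump command line using a ring buffer of capture files, so the trace can run for days without filling the disk. The interface name, output path, and sizes are placeholders; port 1414 is the listener port shown in the application log earlier in this thread. The `-C` and `-W` options together make tcpdump rotate through a fixed number of files.

```python
def build_tcpdump_command(interface, peer_host, port=1414, file_mb=100,
                          file_count=10, out_path="/tmp/mq_trace.pcap"):
    """Build a tcpdump argument list for a size-limited rolling capture.

    All argument values are placeholders chosen for illustration.
    """
    return [
        "tcpdump",
        "-i", interface,        # network interface to capture on
        "-s", "0",              # capture full packets, not just headers
        "-C", str(file_mb),     # start a new file after roughly this many MB
        "-W", str(file_count),  # keep at most this many files, overwriting the oldest
        "-w", out_path,         # base name of the capture files
        "host", peer_host, "and", "port", str(port),
    ]
```

The resulting list could be passed to `subprocess.Popen` by a user with capture privileges on each side; once the 2009 shows up in the logs, the capture is stopped and the files covering that timestamp are kept.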

Thanks,
Tim

Roger Lacroix
2013-10-03 21:15:42 UTC
Permalink
Hello Mir,

You do know what the first thing IBM Support is going to say is...

Regards,
Roger Lacroix
Capitalware Inc.

At 04:39 PM 10/3/2013, you wrote:
>Thanks for the suggestion Roger. However, we
>don’t have v 7.0.1.10 or 7.5 anywhere in our
>shop. We are currently running with 7.0.1.5
>version on all servers and clients.
>
>Sincerely,
>Mir Musthafa Ali Pasha
>PepsiCo, Inc. | BIS
>
>
>From: MQSeries List
>[mailto:MQSERIES-0lvw86wZMd9k/bWDasg6f+***@public.gmane.org] On Behalf Of Roger Lacroix
>Sent: Thursday, October 03, 2013 1:51 PM
>To: MQSERIES-0lvw86wZMd9k/bWDasg6f+***@public.gmane.org
>Subject: Re: AMQ9209 error in MQ server causing
>'MQRC_CONNECTION_BROKEN' error in weblogic server
>
>Hello Mir,
>
> > This issue is kind of escalating here. Looks like I need to open a PMR.
>
>Before you open a PMR, I strongly suggest that
>you update the MQ JAR v7.0.1.5 files the
>application is using to either v7.0.1.10 or v7.5.0.2.
>
>Regards,
>Roger Lacroix
>Capitalware Inc.
>

Tim Zielke
2013-10-24 15:53:52 UTC
Permalink
Hello Mir Musthafa Ali,

I don't know if you are still dealing with this issue, but I ran across something in the MQ manual that might shed a little more light on it. This is completely speculative on my part, but in the receiveAsyncTsh Java method where you were getting the error, the Tsh may stand for TSH, or Transmission Segment Header. This is the first data structure I see when I look at an example of the channel data that MQ sends. Below is an example from a trace of a sender channel. There is a TSH (Transmission Segment Header), followed by an MSH (undocumented in the manual), followed by an XQH (transmission queue header), followed by an MD (message descriptor), followed by the message contents itself. This makes me think that the SVRCONN channel end received some data that it tried to process and then abended, almost as if it wasn't really MQ sending the data. Again, I am being completely speculative here, but I thought this might help if you are still dealing with the issue.

10:54:38.784876 11169.1 RSESS:000001 Channel Name:XXXXXXXX.TO.YYYYYYYY
10:54:38.784880 11169.1 RSESS:000001 Sending 482 bytes
10:54:38.784880 11169.1 RSESS:000001 0x0000: 54534820 000001e2 02043000 ed595f52 |TSH ......0..Y_R|
10:54:38.784880 11169.1 RSESS:000001 0x0010: 02010010 22020000 b8040000 4d534820 |....".......MSH |
10:54:38.784880 11169.1 RSESS:000001 0x0020: 02000000 06000000 00000000 b2010000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x0030: 58514820 01000000 54435a2e 54455354 |XQH ....TCZ.TEST|
10:54:38.784880 11169.1 RSESS:000001 0x0040: 2e333030 20202020 20202020 20202020 |.300 |
10:54:38.784880 11169.1 RSESS:000001 0x0050: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0060: 20202020 20202020 xxxxxxxx xxxxxxxx | xxxxxxxx|
10:54:38.784880 11169.1 RSESS:000001 0x0070: xxxxxxxx xxxxxxxx 20202020 20202020 |xxxxxxxx |
10:54:38.784880 11169.1 RSESS:000001 0x0080: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0090: 20202020 20202020 4d442020 01000000 | MD ....|
10:54:38.784880 11169.1 RSESS:000001 0x00a0: 00000000 08000000 ffffffff 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x00b0: 22020000 b8040000 4d515354 52202020 |".......MQSTR |
10:54:38.784880 11169.1 RSESS:000001 0x00c0: 00000000 00000000 414d5120 xxxxxxxx |........AMQ xxxx|
10:54:38.784880 11169.1 RSESS:000001 0x00d0: xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx |xxxxxxxxxxxxxxxx|
10:54:38.784880 11169.1 RSESS:000001 0x00e0: 00000000 00000000 00000000 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x00f0: 00000000 00000000 00000000 20202020 |............ |
10:54:38.784880 11169.1 RSESS:000001 0x0100: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0110: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0120: 20202020 20202020 20202020 xxxxxxxx | xxxx|
10:54:38.784880 11169.1 RSESS:000001 0x0130: xxxxxxx xxxxxxxx xxxxxx20 20202020 |xxxxxxxxx |
10:54:38.784880 11169.1 RSESS:000001 0x0140: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0150: 20202020 20202020 20202020 4d514d20 | MQM |
10:54:38.784880 11169.1 RSESS:000001 0x0160: 20202020 20202020 03323434 00000000 | .244....|
10:54:38.784880 11169.1 RSESS:000001 0x0170: 00000000 00000000 00000000 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x0180: 00000000 00000006 20202020 20202020 |........ |
10:54:38.784880 11169.1 RSESS:000001 0x0190: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x01a0: 20202020 20202020 06000000 616d7173 | ....amqs|
10:54:38.784880 11169.1 RSESS:000001 0x01b0: 70757420 20202020 20202020 20202020 |put |
10:54:38.784880 11169.1 RSESS:000001 0x01c0: 20202020 20202020 32303133 31303234 | 20131024|
10:54:38.784880 11169.1 RSESS:000001 0x01d0: 31353534 33383738 20202020 74657374 |15543878 test|
10:54:38.784880 11169.1 RSESS:000001 0x01e0: 3232 |22 |
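To make the first line of the dump concrete: the first four bytes (54 53 48 20) are the ASCII eyecatcher "TSH ", and the next four (00 00 01 e2) read as a big-endian integer give 482, which matches the "Sending 482 bytes" line. The Python sketch below decodes just those eight bytes. Treating bytes 4 to 7 as the segment length is an inference from this one dump, not something taken from documentation, and both function names are made up for illustration.

```python
import struct


def parse_segment_prefix(data):
    """Return (eyecatcher, length) from the first 8 bytes of a captured segment.

    Assumes a 4-byte ASCII eyecatcher followed by a 4-byte big-endian length,
    as inferred from the trace above.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    eyecatcher, length = struct.unpack(">4sI", data[:8])
    return eyecatcher.decode("ascii", errors="replace"), length


def looks_like_mq_segment(data):
    """True if data starts with 'TSH ' and its declared length matches its size."""
    eyecatcher, length = parse_segment_prefix(data)
    return eyecatcher == "TSH " and length == len(data)
```

Run against the payloads in a TCP capture taken around the time of a failure, a check like this would show whether something other than an MQ client (a port scanner or a health check, say) had sent data to the listener port.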

Thanks,
Tim

From: Tim Zielke
Sent: Thursday, October 03, 2013 4:05 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT
Subject: RE: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

That is frustrating that they are making that argument. They are just focused on this specific issue, and missing the bigger picture. What happens if you find and correct this presumed network issue, and then another one crops up 6 months later? They will be back to square one. The bottom line is that their MQ client application has application stability issues for not handling a 2009, and it will continue to have instability issues until they correct this. Hopefully, hammering this home to the developers or their managers will help them to stop having tunnel vision.

Regardless, if you want to investigate the current issue, that will be tough. If I had to do it, I would want to capture a TCP trace (I'm assuming your transport type is TCP) on both the queue manager server side and the client application side at the time the issue occurs. Of course, this needs to be managed carefully, considering how large TCP traces can get and how infrequent the issue is. But it can be done with scripting on Unix/Linux. Perhaps a Linux administrator could aid with this endeavor.
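
One way to keep such a long-running capture bounded is to let tcpdump rotate over a fixed ring of files, so the disk footprint stays constant until the next failure. The following is only a sketch under assumptions: tcpdump is installed on both hosts, the listener port is 1414 (as shown in the application log), and the interface name, output path, file size, and file count are placeholders to adjust. The helper name `build_capture_command` is made up for illustration.

```python
def build_capture_command(port=1414, interface="any",
                          out_file="/var/tmp/mq_capture.pcap",
                          file_size_mb=100, file_count=10):
    """Return a tcpdump argv list that rotates over a fixed set of capture files.

    All defaults are illustrative assumptions, not values from the thread
    (except port 1414, which appears in the application log).
    """
    return [
        "tcpdump",
        "-i", interface,          # capture interface ("any" works on Linux)
        "-s", "0",                # capture full packets, not just headers
        "-C", str(file_size_mb),  # start a new file after roughly this many MB
        "-W", str(file_count),    # keep at most this many files, then overwrite the oldest
        "-w", out_file,           # base name of the capture files
        "tcp", "port", str(port), # filter: only traffic on the MQ listener port
    ]
```

The returned list can be passed to `subprocess.run(...)` by a user allowed to capture packets (typically root), or joined with spaces and run from a shell. Once the next AMQ9209 appears in the error log, stop the capture on both sides and keep the files that cover the failure timestamp.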

Thanks,
Tim

From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Pasha, Mir Musthafa Ali - Contractor {BIS}
Sent: Thursday, October 03, 2013 3:46 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Hi Tim,
The answer to your question is “Yes”.

I am with you regarding handling the 2009 connection error in application logic (something I used to do myself as a best practice when I was a programmer), and I did make a case for it and pushed it back to the App team.

However, their argument is that everything was working perfectly fine and this issue started all of a sudden, albeit occurring only once in a while. They claim that nothing has changed in the application or transaction volume. My point is that nothing has changed at the MQ infrastructure level either. So, it is in a deadlock situation now.


Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS
MQ | Substation | TIBCO File Adapter | DRF/OS2 | QPasa Support
Work: 972-963-1622 | Cell: 469-237-0563

From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Tim Zielke
Sent: Thursday, October 03, 2013 3:29 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Hi Mir,

You said originally . . .

"When this occurs, the applications accessing the particular qmgr on the MQ server fail and throw a connection broken 2009 error message (App log shown below). The timestamps in the Mq error log and the Application log match as highlighted shown below.
However, when the app tries to reconnect, the connection goes thru and everything is back to normal."

Does this mean that when the application gets the 2009 error message it crashes? And when it is restarted it connects successfully?

If this is the case, we advise our programmers who write client applications to either leverage Automatic Client Reconnection (if the language supports it) or write reconnect logic to handle a 2009 (MQRC_CONNECTION_BROKEN). If this is an issue where the client application is not properly handling a 2009, I would push back that this is an application issue that needs to be corrected by the application team. Client programs should assume that receiving a 2009 is an inevitable part of interacting with a network and be robust enough to handle 2009 errors, in my opinion. To be complete, applications that bind locally should account for and handle a 2009, too. At least, this is the guidance we give our MQ programmers.
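
As a rough illustration of what hand-written reconnect logic can look like, here is a language-neutral sketch in Python. It does not use any real MQ client API: `MQError`, `connect`, and `operation` are hypothetical stand-ins for whatever the client library provides, and only the reason code 2009 comes from the thread. The connect call sits inside the retry loop because the application log above shows the 2009 being raised during the connect itself.

```python
import time

MQRC_CONNECTION_BROKEN = 2009  # reason code reported in the application log


class MQError(Exception):
    """Hypothetical stand-in for the client library's exception type."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed, reason {reason}")
        self.reason = reason


def call_with_reconnect(connect, operation, max_attempts=5,
                        base_delay=1.0, sleep=time.sleep):
    """Run operation(conn); on a 2009, drop the connection, back off, and retry.

    Any other reason code, or a 2009 on the final attempt, is re-raised.
    """
    conn = None
    for attempt in range(1, max_attempts + 1):
        try:
            if conn is None:
                conn = connect()          # (re)connect lazily inside the loop
            return operation(conn)
        except MQError as err:
            if err.reason != MQRC_CONNECTION_BROKEN or attempt == max_attempts:
                raise
            conn = None                   # discard the broken handle
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

One design caveat: blindly retrying a put can produce a duplicate message if the first attempt actually reached the queue manager before the connection broke, so real code would normally combine this with transactions or duplicate detection.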

Thanks,
Tim


From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Pasha, Mir Musthafa Ali - Contractor {BIS}
Sent: Thursday, October 03, 2013 12:44 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT>
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Thanks for the reply Paul.
No, it doesn’t fail on the same day or time. As I said earlier, I couldn’t find a failure pattern based on date, time, load, volume, or environment issues. This traffic is all internal to the network, so I am not sure how a firewall would even play a role here, but you might have a point.

This issue is kind of escalating here. Looks like I need to open a PMR.

Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Paul Clarke
Sent: Wednesday, October 02, 2013 12:38 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT>
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

I’ve seen things like this where the firewall has terminated the socket. Some firewalls can be configured to terminate a socket if it has been around for a long time or has spent a long time not doing anything. Have these clients been connected a long time, or are they freshly connected? That might be a good thing to find out, to determine whether the failure is related to the socket state or the environment state, e.g. time. For example, does the failure always happen around the same time? Or on the same day? When the cleaner comes to hoover the machine room? :-)
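
To check for a time-of-day or day-of-week pattern, the error log entries can be tallied mechanically. The sketch below assumes the entry layout shown in the AMQERR01.LOG excerpt in this thread (a `MM/DD/YYYY hh:mm:ss AM/PM - Process(...)` line followed a little later by the message id); the timestamp format depends on the queue manager's locale, so the pattern may need adjusting. The function names are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp line as it appears in the log excerpt in this thread (assumed US format).
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - Process\(")


def message_times(log_text, msg_id="AMQ9209"):
    """Return the timestamp of every log entry whose message id is msg_id."""
    hits, current = [], None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = STAMP.match(line)
        if m:
            # Remember the timestamp of the entry we are now inside.
            current = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
        elif line.startswith(msg_id + ":") and current is not None:
            hits.append(current)
    return hits


def by_weekday_and_hour(times):
    """Count occurrences per (weekday, hour); weekday 0 is Monday, 6 is Sunday."""
    return Counter((t.weekday(), t.hour) for t in times)
```

Feeding it the contents of AMQERR01.LOG through AMQERR03.LOG and printing `by_weekday_and_hour(...)` shows at a glance whether the failures cluster around particular hours or days, which would point toward something scheduled (a firewall idle timeout, a backup, a scan) rather than random network trouble.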

Cheers,
Paul

Paul Clarke
www.mqgem.com<http://www.mqgem.com>

From: Pasha, Mir Musthafa Ali - Contractor {BIS}<mailto:***@PEPSICO.COM>
Sent: Wednesday, October 02, 2013 5:40 PM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT>
Subject: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Dear Folks,
I am seeing some weird issue here and would appreciate some advice on what could be causing it.
My MQ server(7.0.1.5) running on Linux SuSE infrequently writes the following error in its error logs(AMQERR01.LOG shown below) for one of the qmgrs. When this occurs, the applications accessing the particular qmgr on the MQ server fail and throw a connection broken 2009 error message (App log shown below). The timestamps in the Mq error log and the Application log match as highlighted shown below.
However, when the app tries to reconnect, the connection goes thru and everything is back to normal.

This issue occurs about once a week, and we couldn’t figure out a pattern or reason for the occurrence of this error. There are about 150 max channels allocated to the SVRCONN, and the application reaches a peak of 85-90 channel connection instances. The application runs atop a WebLogic server that connects to MQ via a JMS MQ bindings file. There is no MQ client s/w on the WebLogic server.

I would appreciate if gurus can throw some insight into what could be the potential problem? Thanks.

MQ Log from AMQERR01.LOG file
-------------------------------------------------------------------------------
09/29/2013 08:35:24 AM - Process(15545.83) User(mqm) Program(amqrmppa)
Host(MQSERVER)
AMQ9209: Connection to host 'APPSERVER (XXX.XX.XX.XX)' closed.

EXPLANATION:
An error occurred receiving data from 'APPSERVER (XXX.XX.XX.XX)' over
TCP/IP. The connection to the remote host has unexpectedly terminated.
ACTION:
Tell the systems administrator.
----- amqccita.c : 3472 -------------------------------------------------------
09/29/2013 08:35:24 AM - Process(15545.83) User(mqm) Program(amqrmppa)
Host(MQSERVER)
AMQ9999: Channel program ended abnormally.

EXPLANATION:
Channel program 'ORBP2P.ORBPRD00' ended abnormally.
ACTION:
Look at previous error messages for channel program 'ORBP2P.ORBPRD00' in the
error files to determine the cause of the failure.


Application log from Weblogic Server:

2013-09-29 08:35:24,610 [ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)' ERROR com.fritolay.suppliernet.common.security.SupplierNetUserDetailsService - JMSException Occured While Pushing Message to MQ
com.ibm.msg.client.jms.DetailedJMSException: JMSWMQ0018: Failed to connect to queue manager 'ORBPRD00' with connection mode 'Client' and host name 'MQSERVER (1414)'. Check the queue manager is started and if running in client mode, check there is a listener running. Please see the linked exception for more information.


.
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
Caused by: com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED') reason '2009' ('MQRC_CONNECTION_BROKEN').
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:223)
... 80 more
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2009;AMQ9204: Connection to host ‘MQSERVER (1414)' rejected. [1=com.ibm.mq.jmqi.JmqiException[CC=2;RC=2009;AMQ9213: A communications error for occurred.],3= MQSERVER (1414),5=RemoteConnection.receiveAsyncTsh]
at com.ibm.mq.jmqi.remote.internal.RemoteFAP.jmqiConnect(RemoteFAP.java:2019)
at com.ibm.mq.jmqi.remote.internal.RemoteFAP.jmqiConnect(RemoteFAP.java:1233)
at com.ibm.msg.client.wmq.internal.WMQConnection.<init>(WMQConnection.java:366)
... 79 more
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2009;AMQ9213: A communications error for occurred.
at com.ibm.mq.jmqi.remote.internal.system.RemoteConnection.receiveAsyncTsh(RemoteConnection.java:3199)



Sincerely,
Mir Musthafa Ali Pasha
PepsiCo, Inc. | BIS


________________________________
List Archive<http://listserv.meduniwien.ac.at/archives/mqser-l.html> - Manage Your List Settings<http://listserv.meduniwien.ac.at/cgi-bin/wa?SUBED1=mqser-l&A=1> - Unsubscribe<mailto:***@LISTSERV.MEDUNIWIEN.AC.AT?subject=Unsubscribe&BODY=signoff%20mqseries>

Instructions for managing your mailing list subscription are provided in the Listserv General Users Guide available at http://www.lsoft.com<http://www.lsoft.com/resources/manuals.asp>

Pasha, Mir Musthafa Ali - Contractor {BIS}
2013-10-25 23:42:56 UTC
Permalink
Thanks for that info, Tim. The issue is still there and we haven’t been able to make much progress. I have opened a PMR and provided the log files to IBM.
Will update y’all on what they find. Thanks.

Sincerely,
Mir Musthafa Ali Pasha


From: MQSeries List [mailto:***@LISTSERV.MEDUNIWIEN.AC.AT] On Behalf Of Tim Zielke
Sent: Thursday, October 24, 2013 10:54 AM
To: ***@LISTSERV.MEDUNIWIEN.AC.AT
Subject: Re: AMQ9209 error in MQ server causing 'MQRC_CONNECTION_BROKEN' error in weblogic server

Hello Mir Musthafa Ali,

I don't know if you are still dealing with this issue, but I ran across something in the MQ manual that might shed a little more light on it. This is completely speculative on my part, but for that receiveAsyncTsh Java method that you were getting an error on, the Tsh may stand for TSH, or Transmission Segment Header. This is the first data structure I see when looking at an example of the channel data that MQ sends. Below is an example from a trace of a sender channel. There is a TSH (Transmission Segment Header), followed by an MSH (undocumented in the manual), followed by an XQH (transmission queue header), followed by an MD (message descriptor), followed by the message contents themselves. This makes me think that the SVRCONN channel end received some data that it tried to process and then abended, almost as if it wasn't really MQ sending the data. Again, I am being completely speculative here. But I thought this might help, if you are still dealing with the issue.

10:54:38.784876 11169.1 RSESS:000001 Channel Name:XXXXXXXX.TO.YYYYYYYY
10:54:38.784880 11169.1 RSESS:000001 Sending 482 bytes
10:54:38.784880 11169.1 RSESS:000001 0x0000: 54534820 000001e2 02043000 ed595f52 |TSH ......0..Y_R|
10:54:38.784880 11169.1 RSESS:000001 0x0010: 02010010 22020000 b8040000 4d534820 |....".......MSH |
10:54:38.784880 11169.1 RSESS:000001 0x0020: 02000000 06000000 00000000 b2010000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x0030: 58514820 01000000 54435a2e 54455354 |XQH ....TCZ.TEST|
10:54:38.784880 11169.1 RSESS:000001 0x0040: 2e333030 20202020 20202020 20202020 |.300 |
10:54:38.784880 11169.1 RSESS:000001 0x0050: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0060: 20202020 20202020 xxxxxxxx xxxxxxxx | xxxxxxxx|
10:54:38.784880 11169.1 RSESS:000001 0x0070: xxxxxxxx xxxxxxxx 20202020 20202020 |xxxxxxxx |
10:54:38.784880 11169.1 RSESS:000001 0x0080: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0090: 20202020 20202020 4d442020 01000000 | MD ....|
10:54:38.784880 11169.1 RSESS:000001 0x00a0: 00000000 08000000 ffffffff 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x00b0: 22020000 b8040000 4d515354 52202020 |".......MQSTR |
10:54:38.784880 11169.1 RSESS:000001 0x00c0: 00000000 00000000 414d5120 xxxxxxxx |........AMQ xxxx|
10:54:38.784880 11169.1 RSESS:000001 0x00d0: xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx |xxxxxxxxxxxxxxxx|
10:54:38.784880 11169.1 RSESS:000001 0x00e0: 00000000 00000000 00000000 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x00f0: 00000000 00000000 00000000 20202020 |............ |
10:54:38.784880 11169.1 RSESS:000001 0x0100: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0110: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0120: 20202020 20202020 20202020 xxxxxxxx | xxxx|
10:54:38.784880 11169.1 RSESS:000001 0x0130: xxxxxxx xxxxxxxx xxxxxx20 20202020 |xxxxxxxxx |
10:54:38.784880 11169.1 RSESS:000001 0x0140: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x0150: 20202020 20202020 20202020 4d514d20 | MQM |
10:54:38.784880 11169.1 RSESS:000001 0x0160: 20202020 20202020 03323434 00000000 | .244....|
10:54:38.784880 11169.1 RSESS:000001 0x0170: 00000000 00000000 00000000 00000000 |................|
10:54:38.784880 11169.1 RSESS:000001 0x0180: 00000000 00000006 20202020 20202020 |........ |
10:54:38.784880 11169.1 RSESS:000001 0x0190: 20202020 20202020 20202020 20202020 | |
10:54:38.784880 11169.1 RSESS:000001 0x01a0: 20202020 20202020 06000000 616d7173 | ....amqs|
10:54:38.784880 11169.1 RSESS:000001 0x01b0: 70757420 20202020 20202020 20202020 |put |
10:54:38.784880 11169.1 RSESS:000001 0x01c0: 20202020 20202020 32303133 31303234 | 20131024|
10:54:38.784880 11169.1 RSESS:000001 0x01d0: 31353534 33383738 20202020 74657374 |15543878 test|
10:54:38.784880 11169.1 RSESS:000001 0x01e0: 3232 |22 |
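
The structure boundaries in a dump like the one above can be located mechanically by scanning the hex words for the four-character eyecatchers. This is only a sketch: it assumes the trace line layout shown here (`0xNNNN:` offset, 4-byte hex words, then a `|`), assumes the eyecatchers fall on 4-byte boundaries as they do in this sample, and skips redacted `xxxxxxxx` words. It says nothing about the internal layout of the structures, and `find_eyecatchers` is a made-up name.

```python
import re

# "0xNNNN:" offset, then the hex words, up to the first "|" of the ASCII column.
LINE = re.compile(r"0x([0-9a-fA-F]{4}):\s+(.*?)\s*\|")
HEX_WORD = re.compile(r"[0-9a-fA-F]{8}")

# Eyecatchers visible in the ASCII column of the trace above.
EYECATCHERS = {b"TSH ": "TSH", b"MSH ": "MSH", b"XQH ": "XQH", b"MD  ": "MD"}


def find_eyecatchers(trace_text):
    """Return (byte_offset, name) for each structure eyecatcher in the dump."""
    found = []
    for line in trace_text.splitlines():
        m = LINE.search(line)
        if not m:
            continue
        base = int(m.group(1), 16)
        for i, word in enumerate(m.group(2).split()):
            # Redacted or truncated words are not valid 8-digit hex; skip them,
            # but keep counting positions so the offsets stay correct.
            if HEX_WORD.fullmatch(word):
                name = EYECATCHERS.get(bytes.fromhex(word))
                if name:
                    found.append((base + 4 * i, name))
    return found
```

On the dump above this places TSH at offset 0x0000, MSH at 0x001c, XQH at 0x0030, and MD at 0x0098. As a sanity check on the reading of the dump, the word right after the TSH eyecatcher is 0x000001e2, which is 482 in decimal and matches the "Sending 482 bytes" line.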

Thanks,
Tim
