Jordi Saltiveri Jovells
2013-10-17 10:51:18 UTC
Hello,
We have an issue and would appreciate some advice on what could be causing it.
We are implementing a multi-instance queue manager solution.
The environment has these specs:
· Virtual server on a VMware ESX 4 host
· OS: Red Hat Enterprise Linux 5.9
· MQ Server 7.5.0.1
· NFS v4 file system on NetApp Release 8.0.2P4 7-Mode: Tue Nov 15 16:16:47 PST 2011
· NFS mount options: nfs4 bg,soft,nointr,timeo=300,retrans=5,rsize=65940,wsize=65940,noac 0 0
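As a side note, the mount options above can be read programmatically. The following is only a minimal Python sketch: the helper name parse_mount_options is hypothetical, and the conversion assumes the nfs(5) convention that timeo is expressed in tenths of a second.

```python
def parse_mount_options(options: str) -> dict:
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "soft") map to True.
    Hypothetical helper, for illustration only.
    """
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string copied from the mount entry above (without fstype and dump/pass fields).
OPTIONS = "bg,soft,nointr,timeo=300,retrans=5,rsize=65940,wsize=65940,noac"

opts = parse_mount_options(OPTIONS)

# Assumption: timeo is in deciseconds, as documented in nfs(5).
timeo_seconds = int(opts["timeo"]) / 10
retrans = int(opts["retrans"])

print(f"soft mount: {opts.get('soft', False)}")
print(f"timeo: {timeo_seconds} s, retrans: {retrans}")
```

The sketch only converts units; how the client actually retries after the first timeout depends on the transport and kernel, and is not modelled here.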
The issue occurs when the storage team moves the service to the other active NetApp node using the "takeover" or "giveback" option: our MQ server loses the file system and the queue manager switches to the standby instance.
The following FDCs were generated on the MQ server:
AMQ11708.0.FDC 2013/10/14 23:33:28.593536+2 Installation1 amqzxma0 11708 9 ZX155001 zxcFileLockMonitorThread lrcE_S_Q_MGR_UNRESPONSIVE OK
!AMQ11740.0.FDC 2013/10/14 23:33:31.073928+2 Installation1 PS006049 psiWaitPSModeChange
AMQ11571.0.FDC 2013/10/14 23:33:37.426462+2 Installation1 amqzxma0 11571 9 ZX155001 zxcFileLockMonitorThread lrcE_S_Q_MGR_UNRESPONSIVE OK
!AMQ11740.0.FDC 2013/10/14 23:33:42.944821+2 Installation1 PS058054 psiStopAllTasks
AMQ11740.0.FDC 2013/10/14 23:33:42.958556+2 Installation1 amqzmuf0 11740 9 PS058099 psiStopAllTasks frcW_QUIESCE_FAILED OK
!AMQ11740.0.FDC 2013/10/14 23:33:43.246717+2 Installation1 PS058054 psiStopAllTasks
AMQ11740.0.FDC 2013/10/14 23:33:43.261180+2 Installation1 amqzmuf0 11740 10 PS058099 psiStopAllTasks frcW_QUIESCE_FAILED OK
!AMQ11740.0.FDC 2013/10/14 23:33:43.382625+2 Installation1 PS058054 psiStopAllTasks
AMQ11740.0.FDC 2013/10/14 23:33:43.396977+2 Installation1 amqzmuf0 11740 11 PS058099 psiStopAllTasks frcW_QUIESCE_FAILED OK
AMQ11602.0.FDC 2013/10/14 23:35:01.528132+2 Installation1 amqzmuf0 11602 9 PS017094 psiProcessProxySubs MQRC_CONNECTION_BROKEN OK
!AMQ11602.0.FDC 2013/10/14 23:35:01.546072+2 Installation1 PS000035 psiReceivePublications
AMQ11708.0.FDC 2013/10/14 23:35:01.576255+2 Installation1 amqzxma0 11708 1 ZX005025 zxcProcessChildren zrcX_PROCESS_MISSING OK
AMQ11571.0.FDC 2013/10/14 23:35:27.463749+2 Installation1 amqzxma0 11571 1 ZX005025 zxcProcessChildren zrcX_PROCESS_MISSING OK
AMQ11735.0.FDC 2013/10/14 23:35:37.640100+2 Installation1 amqzmur0 11735 1 ZX086131 amqzmur0 zrcE_UNEXPECTED_ERROR OK
AMQ11735.0.FDC 2013/10/14 23:35:37.651677+2 Installation1 amqzmur0 11735 2 ZX086131 amqzmur0 OK OK
We know that the queue manager is stopped if the file lock monitor thread has not responded for 20 seconds:
http://www-01.ibm.com/support/docview.wss?uid=swg21592501
The MQ queue manager runs a file lock monitor thread, which, every 10 seconds, checks that it has access to the resources it needs over the network. Another health-check thread monitors this one to determine whether it has hung, which is a sign that there is a network problem. The FDC with probe id ZX155001 is from this health-check thread which detected that the file lock monitor thread had not responded for 20 seconds.
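The mechanism described in that technote can be pictured with a small Python sketch. This is not MQ's implementation: the class and method names are hypothetical, and only the 10-second check interval and the 20-second threshold come from the text above.

```python
import time


class LockMonitorWatchdog:
    """Illustrative model of a health check watching a periodic monitor.

    The monitor is expected to call heartbeat() after each successful
    check (every check_interval seconds). The health check calls
    is_unresponsive() to see whether the monitor has been silent for
    at least unresponsive_after seconds.
    """

    def __init__(self, check_interval=10.0, unresponsive_after=20.0,
                 clock=time.monotonic):
        self.check_interval = check_interval
        self.unresponsive_after = unresponsive_after
        self._clock = clock
        self._last_beat = clock()

    def heartbeat(self):
        # Called by the monitor when its file access check completes.
        self._last_beat = self._clock()

    def is_unresponsive(self):
        # Called by the health check; True once the monitor has been
        # silent for the configured threshold.
        return self._clock() - self._last_beat >= self.unresponsive_after


class FakeClock:
    """Manually advanced clock so the example runs without sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


if __name__ == "__main__":
    clock = FakeClock()
    watchdog = LockMonitorWatchdog(clock=clock)

    clock.now = 10.0
    watchdog.heartbeat()               # a check on shared storage succeeds
    clock.now = 25.0
    print(watchdog.is_unresponsive())  # 15 s of silence
    clock.now = 30.0
    print(watchdog.is_unresponsive())  # 20 s of silence
```

In this model, a monitor blocked on a file system call for longer than the threshold never reaches heartbeat(), so the health check reports it as unresponsive.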
I would appreciate it if anyone could help me with this.
Thanks!
Jordi Saltiveri
Avda.Diagonal, 605, Planta 4ª - 08028 Barcelona
E-mail: jsaltive-GH7bCjKZcdLQT0dZR+***@public.gmane.org
________________________________
CONFIDENTIALITY WARNING.
This message and the information contained in or attached to it are private and confidential and intended exclusively for the addressee. everis informs to whom it may receive it in error that it contains privileged information and its use, copy, reproduction or distribution is prohibited. If you are not an intended recipient of this E-mail, please notify the sender, delete it and do not read, act upon, print, disclose, copy, retain or redistribute any portion of this E-mail.
To unsubscribe, write to LISTSERV-0lvw86wZMd9k/bWDasg6f+***@public.gmane.org and,
in the message body (not the subject), write: SIGNOFF MQSERIES
Instructions for managing your mailing list subscription are provided in
the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://listserv.meduniwien.ac.at/archives/mqser-l.html