
All Things WebSphere

Concerns and issues relating to all versions of WebSphere Application Server

Thursday, September 6, 2012

 

Serviceability gem: WebSphere Application Server hung thread detection and recovery

Customers often ask me a question that goes like this:

Is there a way that hung threads could be manually or automatically killed to prevent the maxed-out condition? For instance, once an alarm state is reached, could a script or a person look at the threads, identify "hung" threads, and kill them, thus breaking the logjam? As I understand it, if a thread is hung then no data is being sent until the thread process completes. And if the thread dies then the request is simply re-sent by the originating application. So killing a thread will not actually hurt anything. Is this correct? And if so, how would we go about doing that? Trying to think outside the box here. If we can't fix this, can we figure out a creative way to prevent it?

My response is as follows: Yes, there is a way out of the logjam. Prior to WebSphere Application Server 8.5, there are two ways to achieve what you are asking for:

1. http://wasdynacache.blogspot.com/2012/03/websphere-application-server-jvm.html
Please read and comment. The disadvantage of this approach is that it aborts, i.e. kills, the JVM on the first hung thread, which is disruptive. When you abort a JVM, in-flight transactions and requests are terminated, so you may leave your system in an inconsistent state.

2. Use a Java client that listens for the Thread_Hung event notification, automatically monitors possibly hung threads, and restarts the server via the node agent MBean: http://www.ibm.com/developerworks/websphere/library/techarticles/0412_kochuba/0412_kochuba.html
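To give a feel for the second approach, here is a minimal sketch of the listener side using only the standard javax.management classes. It is not the program from the article linked above. The notification type strings, the RestartPolicy class, the attach helper, and the "restart after N outstanding hung threads" rule are all assumptions made for illustration. A real client would register the listener through the WebSphere admin client and ask the node agent MBean to restart the server where this sketch only sets a flag.

```java
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationFilterSupport;
import javax.management.NotificationListener;

final class HungThreadListenerSketch {
    // Assumed notification type strings; verify them against your WebSphere version.
    static final String THREAD_HUNG = "websphere.thread.hung";
    static final String THREAD_CLEAR = "websphere.thread.clear";

    /** Hypothetical policy: request a restart once enough threads are reported hung. */
    static final class RestartPolicy implements NotificationListener {
        private final int maxHungThreads;
        int outstanding = 0;            // hung reports not yet cleared
        boolean restartRequested = false;

        RestartPolicy(int maxHungThreads) {
            this.maxHungThreads = maxHungThreads;
        }

        @Override
        public void handleNotification(Notification n, Object handback) {
            if (THREAD_HUNG.equals(n.getType())) {
                outstanding++;
            } else if (THREAD_CLEAR.equals(n.getType()) && outstanding > 0) {
                outstanding--;          // the thread finished after all (false alarm)
            }
            if (outstanding >= maxHungThreads) {
                // A real client would invoke the restart via the node agent MBean here.
                restartRequested = true;
            }
        }
    }

    /** Registers a policy on the emitter, filtered to the two thread monitor types. */
    static RestartPolicy attach(NotificationBroadcasterSupport emitter, int maxHungThreads) {
        NotificationFilterSupport filter = new NotificationFilterSupport();
        filter.enableType(THREAD_HUNG);
        filter.enableType(THREAD_CLEAR);
        RestartPolicy policy = new RestartPolicy(maxHungThreads);
        emitter.addNotificationListener(policy, filter, null);
        return policy;
    }

    public static void main(String[] args) {
        // Stand-in for the server side; it only exists to drive the listener locally.
        NotificationBroadcasterSupport emitter = new NotificationBroadcasterSupport();
        RestartPolicy policy = attach(emitter, 2);
        emitter.sendNotification(new Notification(THREAD_HUNG, "demo", 1L, "thread A may be hung"));
        emitter.sendNotification(new Notification(THREAD_HUNG, "demo", 2L, "thread B may be hung"));
        System.out.println("restart requested: " + policy.restartRequested);
    }
}
```

Waiting for more than one outstanding hung thread before restarting is a design choice of this sketch. It avoids reacting to a single slow request, which is the main drawback of the first approach.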

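By default the hung thread detection threshold is 10 minutes (600 seconds). You may want to change that to 5 minutes if you want a super responsive early warning system. To do so, set the custom property com.ibm.websphere.threadmonitor.threshold to 300; the value is in seconds. Note that com.ibm.websphere.threadmonitor.false.alarm.threshold is a different setting: it is the number of false alarms after which the server automatically raises the threshold, not a time in seconds.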
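The effect of the threshold can be pictured with a small watchdog. The sketch below is only an illustration of the idea and not WebSphere's actual thread monitor. The HungThreadScanner class and its method names are hypothetical. A thread is reported as possibly hung once it has been busy with the same request for longer than the threshold.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative watchdog; not WebSphere's actual thread monitor implementation. */
final class HungThreadScanner {
    private final long thresholdSeconds;
    // Thread name -> time (in seconds) at which it started its current request.
    private final Map<String, Long> activeSince = new LinkedHashMap<>();

    HungThreadScanner(long thresholdSeconds) {
        this.thresholdSeconds = thresholdSeconds;
    }

    void requestStarted(String threadName, long nowSeconds) {
        activeSince.put(threadName, nowSeconds);
    }

    void requestFinished(String threadName) {
        activeSince.remove(threadName);
    }

    /** Returns the threads that have been active for longer than the threshold. */
    List<String> scan(long nowSeconds) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Long> entry : activeSince.entrySet()) {
            if (nowSeconds - entry.getValue() > thresholdSeconds) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        HungThreadScanner scanner = new HungThreadScanner(300); // 5 minutes
        scanner.requestStarted("WebContainer : 0", 0);
        scanner.requestStarted("WebContainer : 1", 200);
        scanner.requestFinished("WebContainer : 1");
        // Prints [WebContainer : 0]: only that thread has been busy for more than 300 seconds.
        System.out.println(scanner.scan(301));
    }
}
```

A lower threshold gives an earlier warning but also flags more requests that are merely slow, which is why the choice between 600 and 300 seconds depends on how long your legitimate requests can take.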

In WebSphere Application Server 8.5 and later, you can use the Intelligent Management health monitoring and management subsystem to monitor the application server environment and take action when certain criteria are met. See "Configuring Health Management" and "Custom health condition subexpression builder".
