Thursday, September 25, 2014

How to take & read the JVM Thread Dump for OIM Servers

Each Java Application has its own thread dump. A thread dump is a snapshot of the state of all threads that are part of the process. The state of each thread is presented with a so called stack trace, which shows the contents of a thread’s stack. Some of the threads belong to the Java application you are running, while others are JVM internal threads.

A thread dump reveals information about an application’s thread activity that can help you diagnose problems and better optimize application and JVM performance.

The latest versions of JRockit include a utility called "jrcmd" which can be used to grab information from running JVMs as well as control the behaviour of the JVM. For example, it can be used to create thread stack dumps, print the heap usage, get the commandline used to start the JVM, enable the management console and a few other things.

In the most basic incarnation, jrcmd lists the currently running JVMs on the system. For example,

$./jrcmd

7693 weblogic.Server

13369 weblogic.NodeManager -v

7551 weblogic.Server

9420

Note: Please set the JAVA_HOME & PATH environment variables before running this utility.

The first number is the process id of the process followed by the name of the main class and any command line parameters.

The first number is the process id of the process followed by the name of the main class and any command line parameters.

./jrcmd 19570 print_threads >>thread.txt

Note: Here 19570 is the PID for oim_server2 in PROD.

19570:

===== FULL THREAD DUMP ===============
Wed Aug 27 08:55:58 2014
Oracle JRockit(R) R28.2.5-20-152429-1.6.0_37-20120927-1915-linux-x86_64

"Main Thread" id=1 idx=0x4 tid=19571 prio=5 alive, waiting, native_blocked
    -- Waiting for notification on: weblogic/t3/srvr/T3Srvr@0x12fc20ff0[fat lock]

"[STUCK] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'" id=15 idx=0x68 tid=24579 prio=1 alive, blocked, native_blocked, daemon

A thread that has the name STUCK in it’s name has been identified as not being returned to the thread pool for a configured amount of time, which defaults to 10 minutes.

Print Memory Usage

./jrcmd 6259 print_memusage

6259:
Total mapped                  6847396KB           (reserved=1206632KB)
-              Java heap      4194304KB           (reserved=0KB)
-              GC tables       140300KB
-          Thread stacks        63828KB           (#threads=198)
-          Compiled code      1048576KB           (used=63385KB)
-               Internal         1480KB
-                     OS       367452KB
-                  Other       491296KB
-            Classblocks        28416KB           (malloced=28323KB #70508)
-        Java class data       510720KB           (malloced=509750KB #327854 in 70508 classes)
- Native memory tracking         1024KB           (malloced=240KB #10)

Print JVM State

./jrcmd 6259 print_vm_state

6259:
Uptime       : 0 days, 01:20:55 on Wed Aug 27 10:30:06 2014
Version      : Oracle JRockit(R) R28.2.5-20-152429-1.6.0_37-20120927-1915-linux-x86_64
CPU          : Intel Core i7 (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 Core Intel64
Number CPUs  : 16
Tot Phys Mem : 76155084800 (72627 MB)
OS version   : Red Hat Enterprise Linux Server release 5.7 (Tikanga)
Linux version 2.6.32-300.41.2.el5uek (mockbuild@ca-build56.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Tue May 7 21:19:44 PDT 2013 (x86_64)
Thread System: Linux NPTL
LibC release : 2.5-stable
Java locking : Lazy unlocking enabled (class banning) (transfer banning)
State        : JVM is running
LD_ASSUME_KERNEL:
LD_PRELOAD   :
StackOverFlow: 0 StackOverFlowErrors have occured
OutOfMemory  : 0 OutOfMemoryErrors have occured
C Heap       : Good; no memory allocations have failed
GC Strategy  : Mode: throughput, with strategy: genparpar (basic strategy: genparpar)
GC Status    : OC currently running, in phase: cleanup. This is OC#63.
             : YC is not running. Last finished YC was YC#729.
YC Promotion : Last YC successfully promoted all objects
YC History   : Ran 12 YCs before OC#59.
             : Ran 14 YCs before OC#60.
             : Ran 12 YCs before OC#61.
             : Ran 14 YCs before OC#62.
             : Ran 13 YCs before OC#63.
Heap         : 0x100000000 - 0x200000000  (Size: 4096 MB)
Compaction   : 0x1c0000000 - 0x1e0000000  (Current compaction type: external)
Allocation   : TLA-min: 2048, TLA-preferred: 65536 TLA-waste limit: 2048
NurseryList  : 0x118278e50 - 0x192d34590
KeepArea     : 0x178bed9a0 - 0x192d34590
KA Markers   : [ 0x1606298a0,  0x178bed9a0 , 0x192d34590 ]
Forbidden A  : (none)
Previous KA  : (none)
Previous FA  : (none)
CompRefs     : References are compressed, with heap base 0x0 and shift 3.

You can also use weblogic console to view the no. of threads which are in STUCK, HOGGING, STANDBY,ACTIVE state.

Known Limitations of jrcmd:

In order to issue diagnostic commands to a process, you must run jrcmd with the same user as the one running the Java process.

 How to Read Thread Dumps

29156:

// The thread dump starts with the date and time of the dump, and the version number of the JRockit JVM used

===== FULL THREAD DUMP ===============
Thu Sep  4 16:10:17 2014
Oracle JRockit(R) R28.2.5-20-152429-1.6.0_37-20120927-1915-linux-x86_64


//There is a thread information line, followed by information about locks and a trace of the thread’s stack at the moment of the thread dump.

// The main thread is running thread (alive), its either in executing JVM internal code and is currently waiting for an object to be released (waiting). Virtually, all the threads in thread dump will be alive.

"Main Thread" id=1 idx=0x4 tid=29157 prio=5 alive, waiting, native_blocked
    -- Waiting for notification on: weblogic/t3/srvr/T3Srvr@0x127aeec50[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/lang/Object.wait(Object.java:485)
    at weblogic/t3/srvr/T3Srvr.waitForDeath(T3Srvr.java:981)
    ^-- Lock released while waiting: weblogic/t3/srvr/T3Srvr@0x127aeec50[fat lock]
    at weblogic/t3/srvr/T3Srvr.run(T3Srvr.java:490)
    at weblogic/Server.main(Server.java:71)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

// JVM Internal threads. All these threads are daemon.          
           
"(Signal Handler)" id=2 idx=0x8 tid=29158 prio=5 alive, native_blocked, daemon

"(OC Main Thread)" id=3 idx=0xc tid=29159 prio=5 alive, native_waiting, daemon

"(GC Worker Thread 1)" id=? idx=0x10 tid=29160 prio=5 alive, daemon

"(GC Worker Thread 2)" id=? idx=0x14 tid=29161 prio=5 alive, daemon

"(GC Worker Thread 3)" id=? idx=0x18 tid=29162 prio=5 alive, daemon

Note: By default, the stack trace is not printed for JVM Internal threads.

Types of Lock

There are four types of Locks;

Fat Lock: A fat lock is a lock with history of contention (several threads try to take the lock simulatenously)

Thin Lock: A thin lock is a lock that doesn’t have any contention.

Recursive Lock: A recursive lock is when a thread takes several locks on an object without releasing it.

Lazy Lock: A Lazy Lock is a lock that’s not released when a critical section is exited.

Lock Chains:

Thread A and B form a chain when thread A holds lock on an object on which thread B is trying to take the lock.




No comments: