locked <地址> 目标:使用 synchronized 申请对象锁成功,监视器的拥有者。 waiting to lock <地址> 目标:使用 synchronized 申请对象锁未成功,在迚入区等待。 waiting on <地址> 目标:使用 synchronized 申请对象锁成功后,释放锁幵在等待区等待。 parking to wait for <地址> 目标
locked
1 2 3 4 5
at oracle.jdbc.driver.PhysicalConnection.prepareStatement - locked <0x00002aab63bf7f58> (a oracle.jdbc.driver.T4CConnection) at oracle.jdbc.driver.PhysicalConnection.prepareStatement - locked <0x00002aab63bf7f58> (a oracle.jdbc.driver.T4CConnection) at com.jiuqi.dna.core.internal.db.datasource.PooledConnection.prepareStatement
at com.jiuqi.dna.core.impl.CacheHolder.isVisibleIn(CacheHolder.java:165) - waiting to lock <0x0000000097ba9aa8> (a CacheHolder) at com.jiuqi.dna.core.impl.CacheGroup$Index.findHolder at com.jiuqi.dna.core.impl.ContextImpl.find at com.jiuqi.dna.bap.basedata.common.util.BaseDataCenter.findInfo
at java.lang.Object.wait(Native Method) - waiting on <0x00000000da2defb0> (a WorkingThread) at com.jiuqi.dna.core.impl.WorkingManager.getWorkToDo - locked <0x00000000da2defb0> (a WorkingThread) at com.jiuqi.dna.core.impl.WorkingThread.run
所以要结合系统的一些性能观察工具来综合分析,比如 netstat 统计单位时间的发送包的数目,如果很明显超过了所在网络带宽的限制;观察 cpu 的利用率,如果系统态的 CPU 时间,相对于用户态的 CPU 时间比例较高;如果程序运行在 Solaris 10 平台上,可以用 dtrace 工具看系统调用的情况,如果观察到 read/write 的系统调用的次数或者运行时间遥遥领先;这些都指向由于网络带宽所限导致的网络瓶颈。
线程 Dump 的分析
原则
结合代码阅读的推理。需要线程 Dump 和源码的相互推到和印证。
造成 Bug 的根源往往会在调用栈上直接体现,一定要格外注意线程当前调用之前的所有调用。
入手点
进入区等待
1 2 3 4 5
"d&a-3588" daemon waiting for monitor entry [0x000000006e5d5000] java.lang.Thread.State: BLOCKED (on object monitor) at com.jiuqi.dna.bap.authority.service.UserService$LoginHandler.handle() - waiting to lock <0x0000000602f38e90> (a java.lang.Object) at com.jiuqi.dna.bap.authority.service.UserService$LoginHandler.handle()
线程状态 BLOCKED,线程动作 wait on monitor entry,调用修饰 waiting to lock 总是一起出现。表示在代码级别已经存在冲突的调用。必然有问题的代码,需要尽可能减少其发生。
同步块阻塞
一个线程锁住某对象,大量其他线程在该对象上等待。
1 2 3 4 5 6 7 8 9 10 11 12
"blocker" runnable java.lang.Thread.State: RUNNABLE at com.jiuqi.hcl.javadump.Blocker$1.run(Blocker.java:23) - locked <0x00000000eb8eff68> (a java.lang.Object) "blockee-11" waiting for monitor entry java.lang.Thread.State: BLOCKED (on object monitor) at com.jiuqi.hcl.javadump.Blocker$2.run(Blocker.java:41) - waiting to lock <0x00000000eb8eff68> (a java.lang.Object) "blockee-86" waiting for monitor entry java.lang.Thread.State: BLOCKED (on object monitor) at com.jiuqi.hcl.javadump.Blocker$2.run(Blocker.java:41) - waiting to lock <0x00000000eb8eff68> (a java.lang.Object)
"d&a-614" daemon prio=6 tid=0x0000000022f1f000 nid=0x37c8 runnable [0x0000000027cbd000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(Unknown Source) at oracle.net.ns.Packet.receive(Packet.java:240) at oracle.net.ns.DataPacket.receive(DataPacket.java:92) at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:172) at oracle.net.ns.NetInputStream.read(NetInputStream.java:117) at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1034) at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:588)
分线程调度的休眠
正常的线程池等待
1 2 3 4 5 6
"d&a-131" in Object.wait() java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at com.jiuqi.dna.core.impl.WorkingManager.getWorkToDo(WorkingManager.java:322) - locked <0x0000000313f656f8> (a com.jiuqi.dna.core.impl.WorkingThread) at com.jiuqi.dna.core.impl.WorkingThread.run(WorkingThread.java:40)
可疑的线程等待
1 2 3 4 5 6 7
"d&a-121" in Object.wait() java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at com.jiuqi.dna.core.impl.AcquirableAccessor.exclusive() - locked <0x00000003011678d8> (a com.jiuqi.dna.core.impl.CacheGroup) at com.jiuqi.dna.core.impl.Transaction.lock()
入手点总结
wait on monitor entry: 被阻塞的,肯定有问题
runnable: 注意 IO 线程
**in Object.wait()**: 注意非线程池等待
使用
想要学习一个命令,先来看看帮助,使用 jstack -help 查看帮助:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
pudding@DESKTOP-1QCHCU4:~$ jstack -help Usage: jstack [-l] <pid> (to connect to running process) jstack -F [-m] [-l] <pid> (to connect to a hung process) jstack [-m] [-l] <executable> <core> (to connect to a core file) jstack [-m] [-l] [server_id@]<remote server IP or hostname> (to connect to a remote debug server)
Options: -F to force a thread dump. Use when jstack <pid> does not respond(process is hung) -m to print both java and nativeframes(mixed mode) -l long listing. Prints additional information about locks -h or -help to print this help message
"Reference Handler" daemon prio=10 tid=0x00007fbbcc06e000 nid=0x286c in Object.wait() [0x00007fbbc8dfc000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000783e066e0> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:503) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x0000000783e066e0> (a java.lang.ref.Reference$Lock)
Found one Java-level deadlock: ============================= "Thread-1": waiting to lock monitor 0x00007f6274003ac8 (object 0x000000077e004ec0, a java.lang.Object), which is held by "Thread-0" "Thread-0": waiting to lock monitor 0x00007f62740050c8 (object 0x000000077e004ed0, a java.lang.Object), which is held by "Thread-1"
Java stack information for the threads listed above: =================================================== "Thread-1": at DeadLockclass.run(JStackDemo.java:40) - waiting to lock <0x000000077e004ec0> (a java.lang.Object) - locked <0x000000077e004ed0> (a java.lang.Object) at java.lang.Thread.run(Thread.java:750) "Thread-0": at DeadLockclass.run(JStackDemo.java:27) - waiting to lock <0x000000077e004ed0> (a java.lang.Object) - locked <0x000000077e004ec0> (a java.lang.Object) at java.lang.Thread.run(Thread.java:750)
Found 1 deadlock.
哈哈,堆栈写的很明显,它告诉我们 Found one Java-level deadlock,然后指出造成死锁的两个线程的内容。然后,又通过 Java stack information for the threads listed above来显示更详细的死锁的信息。 他说