Today one of our most critical production databases reported a locking issue.
When I logged into the DB there were already more than 200 locks, and most of them were TM locks.
Below is the list:
SESS ID1 ID2 LMODE REQUEST TY CTIME
--------------- ---------- ---------- ---------- ---------- -- ----------
Holder: 4738 16737 0 2 0 TM 4021
Holder: 2080 16737 0 2 0 TM 4021
............
............
Holder: 3480 17347 0 3 0 TM 5958
Waiter: 4975 17347 0 0 2 TM 5792
Waiter: 1015 17347 0 0 2 TM 4090
Waiter: 1058 17347 0 0 2 TM 5832
............
Holder: 4729 983064 3155735 6 0 TX 3229
Waiter: 2301 983064 3155735 0 6 TX 2327
Waiter: 1194 2752522 2373725 0 6 TX 2931
Waiter: 3359 2752522 2373725 0 6 TX 2397
286 rows selected.
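A listing like the one above can be produced with a query along these lines (a sketch against V$LOCK; the Holder:/Waiter: labels are derived from the REQUEST column, and the exact formatting of the original output may differ):

```sql
-- Sketch: list holders and waiters for contended locks.
-- A session holding a lock has LMODE > 0; one waiting has REQUEST > 0.
SELECT DECODE(l.request, 0, 'Holder:', 'Waiter:') role,
       l.sid, l.id1, l.id2, l.lmode, l.request, l.type, l.ctime
FROM   v$lock l
WHERE  (l.id1, l.id2) IN (SELECT id1, id2 FROM v$lock WHERE request > 0)
ORDER  BY l.id1, l.id2, l.request;
```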
There are 286 locking sessions and dozens of blockers, and of course we can't kill them all.
So here is the problem: how do we identify the root session?
In 11g there is a powerful view, V$WAIT_CHAINS, but our DB is 10g.
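For reference, on 11g a query like this would have answered the question almost directly (a sketch; each row of V$WAIT_CHAINS is one link in a wait chain, with its blocker identified):

```sql
-- 11g and later only.
SELECT chain_id, sid, sess_serial#, blocker_sid, wait_event_text
FROM   v$wait_chains
ORDER  BY chain_id;
```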
The BLOCKING_SESSION column in V$SESSION is useful, but not with so many locks and so many blockers.
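In principle the blocker tree can still be walked with a hierarchical query over BLOCKING_SESSION (a sketch; with hundreds of locked sessions the resulting tree is very hard to read, which is exactly the problem here):

```sql
-- Sketch: start from sessions that block others but are not
-- themselves blocked, then walk down the BLOCKING_SESSION chain.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || sid AS blocker_tree
FROM   v$session
START WITH blocking_session IS NULL
       AND sid IN (SELECT blocking_session FROM v$session
                   WHERE blocking_session IS NOT NULL)
CONNECT BY PRIOR sid = blocking_session;
```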
A systemstate dump is not an option either, since I lost most of my scripts, including ass109.awk, when my computer was reinstalled.
The remaining choice is hanganalyze, so let's do it:
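For the record, a hanganalyze dump is taken from a SYSDBA session via oradebug (a sketch; level 3 is a common starting level, and higher levels dump more per-node detail):

```
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug hanganalyze 3
SQL> oradebug tracefile_name
```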
==============
HANG ANALYSIS:
==============
Found 106 objects waiting for
<0/4634/18424/0x600086e0/28223/enq: TM - contention>
Found 105 objects waiting for
<0/3317/59720/0x20005f58/27464/enq: TM - contention>
...............
...............
Found 20 objects waiting for
<0/4974/30755/0x60041418/29550/enq: TM - contention>
Open chains found:
Chain 1 :
<0/3480/26248/0x200229e8/190/db file sequential read>
-- <0/4634/18424/0x600086e0/28223/enq: TM - contention>
-- <0/679/55269/0xa004e990/13104/enq: TM - contention>
-- <0/793/864/0xa0054088/25644/enq: TM - contention>
Chain 2 :
<0/3797/1/0x20001048/24805/log file parallel write>
-- <0/707/60023/0xa001bb38/27013/log file sync>
Other chains found:
Chain 3 :
<0/689/49815/0xa0016c28/15330/enq: TM - contention>
-- <0/793/864/0xa0054088/25644/enq: TM - contention>
...............
...............
Chain 104 :
<0/5041/14667/0x6000b650/6616/PX Deq: Table Q Normal>
State of nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/start/finish/[adjlist]/predecessor):
[678]/0/679/55269/0xa05c3618/13104/NLEAF/1/8/[4633][3316]/792
[681]/0/682/5041/0xa05c7650/22742/IGN/9/10//none
[682]/0/683/33667/0xa05c8bb8/27585/IGN/11/12//none
[687]/0/688/35783/0xa05cf6c0/13708/IGN/13/14//none
...............
...............
[5063]/0/5064/1/0x608d2e40/24807/IGN/1657/1658//none
[5064]/0/5065/1/0x608d43a8/24799/IGN/1659/1660//none
====================
END OF HANG ANALYSIS
====================
From the hanganalyze output above it is clear that the root chains (the "Open chains found" part) are:
Open chains found:
Chain 1 :
<0/3480/26248/0x200229e8/190/db file sequential read>
-- <0/4634/18424/0x600086e0/28223/enq: TM - contention>
-- <0/679/55269/0xa004e990/13104/enq: TM - contention>
-- <0/793/864/0xa0054088/25644/enq: TM - contention>
Chain 2 :
<0/3797/1/0x20001048/24805/log file parallel write>
-- <0/707/60023/0xa001bb38/27013/log file sync>
And the LEAF nodes are:
server444[oracle]_hrpp1> cat supp_ora_17698.trc|grep -i single|grep -v none
[level 5] : 83 node dumps -- [SINGLE_NODE] [SINGLE_NODE_NW] [IGN_DMP]
server444[oracle]_hrpp1> cat supp_ora_17698.trc|grep -i leaf|grep -v NLEAF
[level 4] : 2 node dumps -- [REMOTE_WT] [LEAF] [LEAF_NW]
[3479]/0/3480/26248/0x20729710/190/LEAF/3/4//4633
[3796]/0/3797/1/0x208d18d8/24805/LEAF/28/29//706
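Since the SID is the third slash-delimited field of each node record (per the format line in the trace), it can also be pulled out directly (a small sketch using the same trace file name as above):

```shell
# Extract the SID (3rd '/'-separated field) from LEAF node records.
# Field layout per the trace header:
#   [nodenum]/cnode/sid/sess_srno/session/ospid/state/start/finish/[adjlist]/predecessor
# '/LEAF/' matches only the exact LEAF state, not NLEAF or LEAF_NW.
grep '/LEAF/' supp_ora_17698.trc | awk -F'/' '{print $3}'
```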
Now it is clear that sessions 3480 and 3797 are the prime suspects.
After checking these two sessions in the DB, we identified session 3480 as the root session, since it is also a PL/SQL Developer session:
SSID SERIAL# SPID OSUSER USERNAME MACHINE LOGON_TIME STATE STATUS SQL_ID PREV_SQL_ID WAIT_TIME SECONDS_IN_WAIT EVENT BLOCKING_INSTANCE BLOCKING_SESSION
----- ---------- ------------ ---------- -------------------- ---
PROGRAM
------------------------------------------------
SQL_TEXT
-----------------------------------------------------------------
3480 26248 190 324979 SUPXXXHOC MYHOMEGDC\MYH136092 07-JUN-12 16:17 WAITED SHORT TIME ACTIVE 5xzua1wzkfwpy fvgnun67x1bcg -1 1 db file sequential read
plsqldev.exe
DELETE FROM T_HAD_XXXX_AUDIT WHERE FORM_ID = :B1
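The root session was then killed; a sketch of the usual 10g command, with the SID and SERIAL# taken from the listing above (IMMEDIATE is optional):

```sql
-- Kill the root session (SID 3480, SERIAL# 26248 from the output above).
ALTER SYSTEM KILL SESSION '3480,26248' IMMEDIATE;
```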
After killing the session, all 280+ locks were released immediately.
GE, AGFA. Experienced in 9i/10g/11g/12c, Advanced Replication, Streams, GoldenGate, RAC, Data Guard, ODA, Exadata, Grid Control. In my career I have always been assigned the most complicated and critical Oracle issues with no solution available online, and I hope to record some of them in this blog.
June 8, 2012
identify root session among hundreds of locks using hanganalyze
Tags: hanganalyze, identify, Lock, root