A database suddenly hung during a load, and the user came to me.
When I logged in to the server, I could see the affected database consuming 100% CPU.
I tried "sqlplus / as sysdba", but that hung as well.
I generated a systemstate dump and noticed that PMON was blocked:
PROCESS 2: PMON
  O/S info: user: oracle, term: UNKNOWN, ospid: 1193
  OSD pid info: Unix process pid: 1193, image: oracle@alpcisddb484.corporate.ge.com (PMON)
  waiting for 60108b48 Child shared pool level=7 child#=3
    Location from where latch is held: kgh.h LINE:6387 ID:kghalo:
    Context saved from call: 0
    state=busy [holder orapid=89] wlstate=free [value=0]
    waiters [orapid (seconds since: put on list, posted, alive check)]:
    possible holder pid = 89 ospid=10478
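When a systemstate dump is large, it can help to pull the latch address and the suspected holder out of a process section mechanically. The following is only a minimal sketch: the function name latch_blocker is hypothetical, and the regular expressions assume the exact wording seen in the PMON excerpt above, which may differ between Oracle versions.

```python
import re

# Lines copied from the PMON section of the systemstate dump.
pmon_dump = """
waiting for 60108b48 Child shared pool level=7 child#=3
state=busy [holder orapid=89] wlstate=free [value=0]
possible holder pid = 89 ospid=10478
"""

def latch_blocker(text):
    """Return (latch_address, holder_orapid, holder_ospid), or None if absent.

    Hypothetical helper: it only understands the wording shown above.
    """
    waiting = re.search(r"waiting for ([0-9a-f]+) ", text)
    holder = re.search(r"possible holder pid = (\d+) ospid=(\d+)", text)
    if not (waiting and holder):
        return None
    return waiting.group(1), int(holder.group(1)), int(holder.group(2))

print(latch_blocker(pmon_dump))
```

The returned Oracle pid (89) is the PROCESS number to look up next in the dump.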
Let's check which session is holding latch 60108b48, and what that session is waiting for:
PROCESS 89:
  O/S info: user: oracle, term: UNKNOWN, ospid: 10478
  OSD pid info: Unix process pid: 10478, image: oracle@alpcisddb484.corporate.ge.com
  holding (efd=8) 60108b48 Child shared pool level=7 child#=3
  Current Wait Stack:
   0: waiting for 'library cache: mutex X'
      idn=0xfc57b2b7, value=0xd100000000, where=0x4f
      wait_id=57 seq_num=60 snap_id=1
      wait times: snap=45 min 11 sec, exc=45 min 11 sec, total=45 min 11 sec
      wait times: max=infinite, heur=45 min 11 sec
      wait counts: calls=0 os=0
      in_wait=1 iflags=0x5a2
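Let's see which session is holding mutex 0xfc57b2b7 and what that session is waiting for. On a 64-bit platform the upper four bytes of the wait's value field (0xd100000000) hold the SID of the mutex holder: 0xd1 = 209.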
PROCESS 42: M000
  SO: 0x19f40dda8, type: 4, owner: 0x19f2de3f8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x19f2de3f8, name=session, file=ksu.h LINE:12459, pg=0
  (session) sid: 209 ser: 521 trans: (nil), creator: 0x19f2de3f8
  Current Wait Stack:
    Not in wait; last wait ended 45 min 10 sec ago
OK, so far we have found that the root blocker is M000.
It is holding the mutex:
PROCESS 42: M000
  SO: 0x19f40dda8, type: 4, owner: 0x19f2de3f8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x19f2de3f8, name=session, file=ksu.h LINE:12459, pg=0
  (session) sid: 209 ser: 521 trans: (nil), creator: 0x19f2de3f8
  Current Wait Stack:
    Not in wait; last wait ended 45 min 10 sec ago
  KGL-UOL SO Cache(total=240, free=0)
  KGX Atomic Operation Log 0x17d569fc0
   Mutex 0x19eb2b4c8(209, 0) idn 1fc57b2b7 oper EXCL
   Library Cache uid 209 efd 7 whr 49 slp 0
   oper=0 pt1=(nil) pt2=(nil) pt3=(nil)
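The link between the waiter (process 89) and the holder (sid 209) can be double-checked with a little arithmetic. This is a sketch under two assumptions: that on this 64-bit platform the upper 32 bits of the mutex value carry the holder SID, and that the idn printed in the KGX Atomic Operation Log (1fc57b2b7) is the same identifier as the waiter's idn with an extra leading bit. The variable names are made up for illustration.

```python
# Values copied from the 'library cache: mutex X' wait of process 89.
wait_idn = 0xfc57b2b7
wait_value = 0xd100000000

# Assumption: 64-bit platform, so the holder SID sits in the upper 32 bits
# of the mutex value and the reference count in the lower 32 bits.
holder_sid = wait_value >> 32
ref_count = wait_value & 0xFFFFFFFF

# idn as printed in the KGX Atomic Operation Log of M000 (no 0x prefix there).
holder_idn = 0x1fc57b2b7

# Assumption: comparing only the low 32 bits is enough to match the two idn values.
same_mutex = (holder_idn & 0xFFFFFFFF) == wait_idn

print(holder_sid, ref_count, same_mutex)
```

The decoded SID agrees with "sid: 209" in the session state object, and with "Mutex 0x19eb2b4c8(209, 0)" in the KGX log.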
But the above also shows that M000 itself has not been in a wait for the last 45 minutes. What is M000 doing, and why has it not released the mutex in all that time?
That makes no sense.
After reviewing the process state of M000, I noticed that the process is already dead:
SO: 0x19f2de3f8, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
 proc=0x19f2de3f8, name=process, file=ksu.h LINE:12451, pg=0
(process) Oracle pid:42, ser:16, calls cur/top: 0x1908c1eb8/0x1906a28f8
  flags : (0x3) DEAD
  flags2: (0x8030), flags3: (0x0)
  intr error: 0, call error: 0, sess error: 0, txn error 0
  intr queue: empty
  ksudlp FALSE at location: 0
  Cleanup details:
    Found dead = 44 min 27 sec ago
    Total Cleanup attempts = 10, Cleanup time = 15 min 49 sec, Cleanup timer = 15 min 45 sec
    Last attempt (full) ended 54 sec ago, Length = 2 min 30 sec, Cleanup timer = 2 min 30 sec
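A dead process that still owns resources is easy to overlook in a long dump, so a small scan for the DEAD flag and the cleanup counters can be useful. The sketch below assumes the exact field wording of the excerpt above; cleanup_summary is a hypothetical helper, not an Oracle tool.

```python
import re

# Lines copied from the process state object of Oracle pid 42 (M000).
process_state = """
(process) Oracle pid:42, ser:16, calls cur/top: 0x1908c1eb8/0x1906a28f8
flags : (0x3) DEAD
Found dead = 44 min 27 sec ago
Total Cleanup attempts = 10, Cleanup time = 15 min 49 sec, Cleanup timer = 15 min 45 sec
"""

def cleanup_summary(text):
    """Summarise whether a process is flagged DEAD and how cleanup is going.

    Hypothetical helper: it relies on the wording seen in this one dump.
    """
    dead = re.search(r"flags : \(0x[0-9a-f]+\) DEAD", text) is not None
    found = re.search(r"Found dead = (\d+) min (\d+) sec ago", text)
    attempts = re.search(r"Total Cleanup attempts = (\d+)", text)
    return {
        "dead": dead,
        # Seconds since the process was found dead, if reported.
        "dead_for_sec": int(found.group(1)) * 60 + int(found.group(2)) if found else None,
        "cleanup_attempts": int(attempts.group(1)) if attempts else None,
    }

print(cleanup_summary(process_state))
```

Here the process has been dead for about 44 minutes and ten cleanup attempts have not finished the job, which fits a blocked PMON.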
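OK. Now it is clear that M000's death is what led to the database hang: the dead process still holds the mutex, and PMON, which is responsible for cleaning up dead processes, is itself stuck behind the shared pool latch held by process 89.
But why did M000 suddenly die?
After checking alert.log, I found the following: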
Thu Dec 06 22:00:48 2012
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1060] [PC:0x90ED147, kglic0()+1075] [flags: 0x0, count: 1]
Errors in file /x484/dump01/oracle/hrxt/diag/rdbms/hrxdt/hrxdt_m000_10618.trc (incident=6337):
ORA-07445: exception encountered: core dump [kglic0()+1075] [SIGSEGV] [ADDR:0x1060] [PC:0x90ED147] [Address not mapped to object] []
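The failing function name is the most useful search term in an ORA-07445 message. As a rough sketch, the bracketed arguments can be split out as below; the helper name ora7445_args and the assumption that the first argument always has the form function()+offset are mine, not something Oracle guarantees.

```python
import re

# The ORA-07445 line copied from alert.log.
alert_line = (
    "ORA-07445: exception encountered: core dump [kglic0()+1075] [SIGSEGV] "
    "[ADDR:0x1060] [PC:0x90ED147] [Address not mapped to object] []"
)

def ora7445_args(line):
    """Return the failing function, offset and signal from an ORA-07445 line.

    Hypothetical helper: assumes the first bracketed argument looks like
    'function()+offset' and the second is the signal name.
    """
    if "ORA-07445" not in line:
        return None
    args = re.findall(r"\[([^\]]*)\]", line)
    if len(args) < 2:
        return None
    location = re.match(r"(\w+)\(\)\+(\d+)", args[0])
    if not location:
        return None
    return {
        "function": location.group(1),
        "offset": int(location.group(2)),
        "signal": args[1],
    }

print(ora7445_args(alert_line))
```

The extracted function name, kglic0, is the symbol to search for on the support site.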
Now everything is clear.
After searching Metalink for the symbol above, I found that our scenario exactly matches this bug:
MMON Slave Process Reports ORA-7445 [kglic0] Error, Plus Database Hangs [ID 1487720.1]