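1: Version information
Operating system version: AIX 61009
Oracle database version: 11.2.0.3.11 (RAC)
2: Error description
A check found that both instances of a RAC database using ASM report an ORA-32701 error roughly once an hour. The relevant error messages, excerpted from the alert log, are as follows: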
sat dec 06 09:44:00 2014
errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041128):
ora-32701: possible hangs up to hang id=0 detected
incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041128/egmmdb2_dia0_13500888_i1041128.trc
dia0 terminating blocker (ospid: 15335610 sid: 1299 ser#: 5849) of hang with id = 3
requested by master dia0 process on instance 1
hang resolution reason: although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating session sid: 1299 ospid: 15335610
sat dec 06 09:44:01 2014
sweep [inc][1041128]: completed
sweep [inc2][1041128]: completed
dia0 successfully terminated session sid:1299 ospid:15335610 with status 31.
sat dec 06 09:45:35 2014
errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041129):
ora-32701: possible hangs up to hang id=0 detected
incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041129/egmmdb2_dia0_13500888_i1041129.trc
dia0 terminating blocker (ospid: 15335610 sid: 1299 ser#: 5849) of hang with id = 3
requested by master dia0 process on instance 1
hang resolution reason: although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating the process
dia0 successfully terminated process ospid:15335610.
sat dec 06 09:45:37 2014
sweep [inc][1041129]: completed
sweep [inc2][1041129]: completed
sat dec 06 10:45:12 2014
errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041130):
ora-32701: possible hangs up to hang id=0 detected
incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041130/egmmdb2_dia0_13500888_i1041130.trc
sat dec 06 10:45:13 2014
sweep [inc][1041130]: completed
sweep [inc2][1041130]: completed
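To check the "roughly once an hour" cadence against a longer alert log, the timestamps that precede each ORA-32701 line can be extracted and the gaps between them computed. The following is a minimal sketch, not an Oracle tool: the function names `ora32701_times` and `gaps` are hypothetical, `SAMPLE` is a trimmed copy of the excerpt above, and the sketch assumes every ORA-32701 line is preceded by a timestamp line in the format shown in this log.

```python
import re
from datetime import datetime

# Assumption: timestamp lines look like "sat dec 06 09:44:00 2014", as in the excerpt.
TS_RE = re.compile(
    r"^[a-z]{3}\s+([a-z]{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\d{4})$",
    re.IGNORECASE,
)
MONTHS = {name: num for num, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def ora32701_times(lines):
    """Return the most recent timestamp seen before each ORA-32701 line."""
    current = None
    hits = []
    for raw in lines:
        line = raw.strip()
        m = TS_RE.match(line)
        if m:
            mon, day, hh, mm, ss, year = m.groups()
            current = datetime(int(year), MONTHS[mon.lower()], int(day),
                               int(hh), int(mm), int(ss))
        elif line.lower().startswith("ora-32701") and current is not None:
            hits.append(current)
    return hits


def gaps(times):
    """Return the time difference between consecutive occurrences."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Trimmed copy of the alert-log excerpt (only timestamp and ORA-32701 lines).
SAMPLE = """\
sat dec 06 09:44:00 2014
ora-32701: possible hangs up to hang id=0 detected
sat dec 06 09:45:35 2014
ora-32701: possible hangs up to hang id=0 detected
sat dec 06 10:45:12 2014
ora-32701: possible hangs up to hang id=0 detected
""".splitlines()

if __name__ == "__main__":
    for gap in gaps(ora32701_times(SAMPLE)):
        print(gap)
```

On a real system, `SAMPLE` would be replaced by the lines of the instance's alert log. Two reports a minute or two apart (session kill followed by process kill) belong to the same hang, so only the larger gaps reflect the recurrence interval.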
The following information is excerpted from egmmdb2_dia0_13500888_i1041129.trc:
*** 2014-12-06 09:45:35.770
resolvable hangs in the system
                      root       chain total               hang
   hang hang          inst  root #hung #hung   hang   hang resolution
     id type status    num  sess  sess  sess   conf   span action
  ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
      3 hang rslnpend    2  1299     2     2   high global terminate process
hang resolution reason: although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
  inst# sessid  ser#     ospid prcnm event
  ----- ------ ----- --------- ----- -----
      1   1444  7855  10420452  m000 enq: fu - contention
      2   1299  5849  15335610  m000 not in wait
dumping process info of pid[155.15335610] (sid:1299, ser#:5849)
requested by master dia0 process on instance 1.
*** 2014-12-06 09:45:35.770
process diagnostic dump for oracle@egmmdb2 (m000), os id=15335610,
pid: 155, proc_ser: 153, sid: 1299, sess_ser: 5849
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 09:45:35 ]
note: scheduling delay has not been sampled for 0.376554 secs
0.000000 secs from [ 09:45:31 - 09:45:36 ], 5 sec avg
0.000000 secs from [ 09:44:36 - 09:45:36 ], 1 min avg
0.000000 secs from [ 09:40:36 - 09:45:36 ], 5 min avg
loadavg : 2.68 2.42 2.41
swap info: free_mem = 19881.13m rsv = 256.00m
alloc = 138.07m avail = 65536.00m swap_free = 65397.93m
     f s    uid      pid ppid c pri ni      addr     sz            wchan    stime tty time cmd
240001 a oracle 15335610    1 0  60 20 948d16590 209136 f1000a01500d48b0 08:37:22   - 0:01 ora_m000_egmmdb2
short stack dump:
ksedsts()+360
-------------------------------------------------------------------------------
process diagnostic dump actual duration=0.084000 sec
(max dump time=15.000000 sec)
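When many incident traces have to be compared, the session listing (inst#, sessid, ser#, ospid, prcnm, event) can be parsed and split into the root blocker and its waiters, using the root instance and root session from the hang table (instance 2, session 1299 in this trace). This is only an illustrative sketch: `HungSession`, `parse_sessions` and `split_by_root` are hypothetical names, and the regular expression assumes the row layout seen in this excerpt.

```python
import re
from typing import NamedTuple


class HungSession(NamedTuple):
    inst: int
    sid: int
    serial: int
    ospid: int
    process: str
    event: str


# Assumption: each data row is four integers, a process name, then the wait event.
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s*$")


def parse_sessions(lines):
    """Parse session rows; header and separator lines do not match and are skipped."""
    rows = []
    for line in lines:
        m = ROW_RE.match(line)
        if m:
            inst, sid, serial, ospid, process, event = m.groups()
            rows.append(HungSession(int(inst), int(sid), int(serial),
                                    int(ospid), process, event))
    return rows


def split_by_root(rows, root_inst, root_sid):
    """Separate the root blocker from the sessions waiting behind it."""
    blocker = [r for r in rows if (r.inst, r.sid) == (root_inst, root_sid)]
    waiters = [r for r in rows if (r.inst, r.sid) != (root_inst, root_sid)]
    return blocker, waiters


# Copy of the session listing from the trace excerpt.
TRACE = """\
inst# sessid ser# ospid prcnm event
----- ------ ----- --------- ----- -----
1 1444 7855 10420452 m000 enq: fu - contention
2 1299 5849 15335610 m000 not in wait
""".splitlines()

if __name__ == "__main__":
    blocker, waiters = split_by_root(parse_sessions(TRACE), root_inst=2, root_sid=1299)
    print("blocker:", blocker)
    print("waiters:", waiters)
```

For this trace the split shows the m000 process on instance 2 (ospid 15335610, not in wait) as the blocker and the m000 process on instance 1 waiting on enq: fu - contention, which matches the session that dia0 terminated in the alert log.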