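Today an application engineer called to report that a backup database had started misbehaving after a datafile was added to a tablespace. The symptom: queries still worked, but UPDATE statements were very slow.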
Troubleshooting details:
1. The alert log shows the following:
thread 1 advanced to log sequence 16541 (lgwr switch)
current log# 1 seq# 16541 mem# 0: +data/racdb/onlinelog/group_1.262.792077131
current log# 1 seq# 16541 mem# 1: +data/racdb/onlinelog/group_1.263.792077153
fri sep 12 09:52:51 2014
archived log entry 3480 added for thread 1 sequence 16540 id 0x10fd7185 dest 1:
fri sep 12 09:59:08 2014
alter tablespace alarm_data_tbs add datafile '+data' size 10g autoextend off
fri sep 12 10:01:11 2014
completed: alter tablespace alarm_data_tbs add datafile '+data' size 10g autoextend off
fri sep 12 10:06:17 2014
alter tablespace alarm_data_tbs add datafile '+data' size 10g autoextend off
fri sep 12 10:13:44 2014
minact-scn: useg scan erroring out with error e:12751
fri sep 12 10:17:50 2014
errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_m001_18487.trc:
ora-12751: cpu time or run time policy violation
fri sep 12 10:22:46 2014
errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_ora_17431.trc (incident=362566):
ora-00494: enqueue [cf] held for too long (more than 900 seconds) by 'inst 1, osid 18462'
incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_362566/racdb_ora_17431_i362566.trc
fri sep 12 10:23:16 2014
killing enqueue blocker (pid=18462) on resource cf-00000000-00000000 by (pid=17431)
by killing session 2140.29719
attempt to get control file enqueue by user pid=17431 (mode=x, type=0, timeout=900) is being blocked by inst=1, pid=18462
please check inst 1's alert log for more information on the blocker including a possible ora-00494 and related incident logs
fri sep 12 10:23:57 2014
minact-scn: useg scan erroring out with error e:12751
fri sep 12 10:23:57 2014
sweep [inc][362566]: completed
sweep [inc2][362566]: completed
fri sep 12 10:25:10 2014
errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_lgwr_9336.trc (incident=360259):
ora-00494: enqueue [cf] held for too long (more than 900 seconds) by 'inst 1, osid 18462'
incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_360259/racdb_lgwr_9336_i360259.trc
fri sep 12 10:25:40 2014
killing enqueue blocker (pid=18462) on resource cf-00000000-00000000 by (pid=9336)
by killing session 2140.29719
attempt to get control file enqueue by lgwr pid=9336 (mode=x, type=0, timeout=900) is being blocked by inst=1, pid=18462
please check inst 1's alert log for more information on the blocker including a possible ora-00494 and related incident logs
fri sep 12 10:28:16 2014
errors in file /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_ora_17431.trc (incident=362567):
ora-00494: enqueue [cf] held for too long (more than 900 seconds) by 'inst 1, osid 18462'
incident details in: /oracle/app/oracle/diag/rdbms/racdb/racdb/incident/incdir_362567/racdb_ora_17431_i362567.trc
fri sep 12 10:28:46 2014
killing enqueue blocker (pid=18462) on resource cf-00000000-00000000 by (pid=17431)
by terminating the process
attempt to get control file enqueue by user pid=17431 (mode=x, type=0, timeout=300) is being blocked by inst=1, pid=18462
please check inst 1's alert log for more information on the blocker including a possible ora-00494 and related incident logs
fri sep 12 10:28:46 2014
thread 1 advanced to log sequence 16542 (lgwr switch)
current log# 2 seq# 16542 mem# 0: +data/racdb/onlinelog/group_2.264.792077173
current log# 2 seq# 16542 mem# 1: +data/racdb/onlinelog/group_2.265.792077193
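When the alert log is long, it can help to pull out just the lines that name the CF enqueue holder and the kill attempts against it. The following is a minimal Python sketch, not an Oracle tool: the function name `summarize_cf_blockers` and the sample lines are illustrative assumptions, and the patterns are written only against the message formats visible in the excerpt above.

```python
import re
from collections import Counter

# Matches the holder named in an ORA-00494 line, e.g. "... by 'inst 1, osid 18462'".
# The pattern keys on the quoted holder rather than on the message wording.
HOLDER_RE = re.compile(r"ora-00494.*'inst (\d+), osid (\d+)'", re.IGNORECASE)

# Matches "killing enqueue blocker (pid=18462) on resource cf-... by (pid=17431)".
KILL_RE = re.compile(
    r"killing enqueue blocker \(pid=(\d+)\) on resource (\S+) by \(pid=(\d+)\)",
    re.IGNORECASE,
)


def summarize_cf_blockers(lines):
    """Hypothetical helper: count ORA-00494 holders and list kill attempts."""
    holders = Counter()  # (instance, os pid) -> number of ORA-00494 reports
    kills = []           # (blocker pid, pid that requested the kill)
    for line in lines:
        m = HOLDER_RE.search(line)
        if m:
            holders[(int(m.group(1)), int(m.group(2)))] += 1
            continue
        m = KILL_RE.search(line)
        if m:
            kills.append((int(m.group(1)), int(m.group(3))))
    return holders, kills


# Sample lines copied from the alert log excerpt above.
sample = [
    "ora-00494: enqueue [cf] held for too long (more than 900 seconds) by 'inst 1, osid 18462'",
    "killing enqueue blocker (pid=18462) on resource cf-00000000-00000000 by (pid=9336)",
    "thread 1 advanced to log sequence 16542 (lgwr switch)",
]
holders, kills = summarize_cf_blockers(sample)
print(holders)
print(kills)
```

Run against the full alert log, a summary like this would make it easy to see that every ORA-00494 report in the excerpt points at the same holder, osid 18462.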
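2. Manually switching the log from inside the database (alter system switch logfile) was extremely slow and very nearly hung.
3. Analyzing the trace files shows that the LGWR process had been waiting on the CF enqueue the whole time: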
dump continued from file: /oracle/app/oracle/diag/rdbms/racdb/racdb/trace/racdb_lgwr_5854.trc
ora-00494: enqueue [cf] held for too long (more than 900 seconds) by 'inst 1, osid 7955'
========= dump for incident 380258 (ora 494) ========
----- beginning of customized incident dump(s) -----
-------------------------------------------------------------------------------
enqueue [cf] held for too long
enqueue holder: 'inst 1, osid 7955'
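4. Further analysis shows many 'control file sequential read' waits, meaning the session was in an I/O wait state.
Why would control file sequential read show up here? Possibly the control file is no longer there, or the directory holding the control file or its snapshot has become unavailable.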
current wait stack:
not in wait; last wait ended 17 min 35 sec ago
there are 40 sessions blocked by this session.
dumping one waiter:
inst: 1, sid: 1653, ser: 1
wait event: 'enq: cf - contention'
p1: 'name|mode'=0x43460005
p2: '0'=0x0
p3: 'operation'=0x0
row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
min_blocked_time: 898 secs, waiter_cache_ver: 407
wait state:
fixed_waits=0 flags=0x21 boundary=0x0/-1
session wait history:
elapsed time of 17 min 35 sec since last wait
0: waited for 'control file sequential read'
file#=0x0, block#=0x12, blocks=0x1
wait_id=10168 seq_num=10169 snap_id=1
wait times: snap=0.000352 sec, exc=0.000352 sec, total=0.000352 sec
wait times: max=infinite
wait counts: calls=0 os=0
occurred after 0.000018 sec of elapsed time
1: waited for 'control file sequential read'
file#=0x0, block#=0x10, blocks=0x1
wait_id=10167 seq_num=10168 snap_id=1
wait times: snap=0.000320 sec, exc=0.000320 sec, total=0.000320 sec
wait times: max=infinite
wait counts: calls=0 os=0
occurred after 0.000018 sec of elapsed time
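The `p1: 'name|mode'=0x43460005` value in the waiter dump packs the enqueue name and the requested lock mode into a single number: the top two bytes are the ASCII codes of the two-letter enqueue type, and the low 16 bits are the mode. A minimal Python sketch of the decoding follows; the function name `decode_enqueue_p1` is an illustrative assumption, and the mode labels follow the usual Oracle lock mode numbering.

```python
# Usual Oracle lock mode numbering (0-6).
LOCK_MODES = {
    0: "none",
    1: "null (N)",
    2: "row share (SS)",
    3: "row exclusive (SX)",
    4: "share (S)",
    5: "share row exclusive (SSX)",
    6: "exclusive (X)",
}


def decode_enqueue_p1(p1):
    """Hypothetical helper: split an enqueue wait's p1 into (name, mode)."""
    # Top two bytes hold the ASCII codes of the enqueue type, e.g. 0x43 0x46.
    name = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    # Low 16 bits hold the requested lock mode.
    mode = p1 & 0xFFFF
    return name, mode


name, mode = decode_enqueue_p1(0x43460005)
print(name, mode, LOCK_MODES.get(mode, "unknown"))
```

For the value in this trace, the name decodes to CF (0x43 is 'C', 0x46 is 'F') and the mode to 5, which confirms that the dumped waiter is queued on the control file enqueue.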
5. In RMAN, the control file backup paths all point to the /oradata/racdbdb_rman_bak directory. Details are as follows:
rman> show all;