听不止一个人说起,学技术就要不断地折腾、搞破坏,比如说备份恢复,你就可以尝试删文件,不管是日志文件、临时文件、数据文件还
听不止一个人说起,学技术就要不断地折腾、搞破坏,比如说备份恢复,你就可以尝试删文件,不管是日志文件、临时文件、数据文件还是system文件。删了之后,重启数据库肯定报错,有的甚至当时数据库就挂掉,这样你就可以学着恢复,一破一立之间,很多常规的备份恢复手段也就算是领教了。我今天就尝试着把虚拟机上的一个重做日志文件组删除。
1.环境准备
我们在oracle11g中进行测试,数据库处于非归档状态。
sql>
sql> select * from v$version;
banner
--------------------------------------------------------------------------------
oracle database 11g enterprise edition release 11.2.0.3.0 - production
pl/sql release 11.2.0.3.0 - production
core 11.2.0.3.0 production
tns for linux: version 11.2.0.3.0 - production
nlsrtl version 11.2.0.3.0 - production
sql>
sql> archive log list;
database log mode no archive mode
automatic archival disabled
archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
oldest online log sequence 9
current log sequence 11
sql>
2.删除一个重做日志文件组,重启数据库报错
首先,通过查询v$log视图来获取数据库重做日志文件组的状态。
sql> select group#,members,archived,status from v$log;
group# members arc status
---------- ---------- --- ----------------
1 1 no inactive
2 1 no current
3 1 no inactive
sql>
然后,通过ls命令查看数据文件,,删除第一个重做日志文件组(该文件组只有一个日志成员)。
[oracle@ hoegh hoegh]$ ls
control01.ctl redo01.log sysaux01.dbf undotbs01.dbf
control02.ctl redo02.log system01.dbf users01.dbf
example01.dbf redo03.log temp01.dbf
[oracle@hoegh hoegh]$
[oracle@hoegh hoegh]$
[oracle@hoegh hoegh]$ rm redo01.log
[oracle@hoegh hoegh]$ ls
control01.ctl control02.ctl example01.dbf redo02.log redo03.log sysaux01.dbf system01.dbf temp01.dbf undotbs01.dbf users01.dbf
重启数据库,数据库报错。
sql>
sql> shu immediate
database closed.
database dismounted.
oracle instance shut down.
sql> startup
oracle instance started.
total system global area 941600768 bytes
fixed size 1348860 bytes
variable size 515902212 bytes
database buffers 419430400 bytes
redo buffers 4919296 bytes
database mounted.
ora-03113: end-of-file on communication channel
process id: 5196
session id: 125 serial number: 5
sql>
sql> select status from v$instance;
error:
ora-03114: not connected to oracle
sql>
3.查看报警日志文件,定位问题
ora-03113报错是一个非常经典的报错,报错原因多种多样,从报错信息中并看不出是什么原因导致的报错,我们可以到报警日志文件中查看有价值的线索。
[oracle@enmoedu1 trace]$ tail -40 alert_hoegh.log
wed jul 08 21:59:30 2015
mmon started with pid=15, os id=5443
wed jul 08 21:59:30 2015
mmnl started with pid=16, os id=5445
starting up 1 dispatcher(s) for network address \'(address=(partial=yes)(protocol=tcp))\'...
starting up 1 shared server(s) ...
oracle_base from environment = /u01/app/oracle
wed jul 08 21:59:39 2015
alter database mount
wed jul 08 21:59:43 2015
successful mount of redo thread 1, with mount id 2105928075
database mounted in exclusive mode
lost write protection disabled
completed: alter database mount
wed jul 08 22:11:45 2015
time drift detected. please check vktm trace file for more details.
wed jul 08 22:11:59 2015
alter database open
wed jul 08 22:11:59 2015
errors in file /u01/app/oracle/diag/rdbms/hoegh/hoegh/trace/hoegh_lgwr_5435.trc:
ora-00313: open failed for members of log group 1 of thread 1
ora-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/hoegh/redo01.log\'
ora-27037: unable to obtain file status
linux error: 2: no such file or directory
additional information: 3
errors in file /u01/app/oracle/diag/rdbms/hoegh/hoegh/trace/hoegh_lgwr_5435.trc:
ora-00313: open failed for members of log group 1 of thread 1
ora-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/hoegh/redo01.log\'
ora-27037: unable to obtain file status
linux error: 2: no such file or directory
additional information: 3
errors in file /u01/app/oracle/diag/rdbms/hoegh/hoegh/trace/hoegh_ora_5451.trc:
ora-00313: open failed for members of log group 1 of thread
ora-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/hoegh/redo01.log\'
user (ospid: 5451): terminating the instance due to error 313
wed jul 08 22:12:00 2015
system state dump requested by (instance=1, osid=5451), summary=[abnormal instance termination].
system state dumped to trace file /u01/app/oracle/diag/rdbms/hoegh/hoegh/trace/hoegh_diag_5425.trc
dumping diagnostic data in directory=[cdmp_20150708221200], requested by (instance=1, osid=5451), summary=[abnormal instance termination].
instance terminated by user, pid = 5451
其中,黄色标注部分为关键信息,我们知道“/u01/app/oracle/oradata/hoegh/redo01.log”这个文件找不到了。
4.启动数据库到mount状态,重建重做日志文件组
从报警日志可以看出,第一组重做日志文件组丢了,我们可以通过sql语句“alter database clear logfile group 1;”重建日志文件组;确认日志文件创建成功后,将数据库切换到open状态。
sql> startup nomount
oracle instance started.
total system global area 941600768 bytes
fixed size 1348860 bytes
variable size 515902212 bytes
database buffers 419430400 bytes
redo buffers 4919296 bytes
sql> alter database mount;
database altered.
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 7 no inactive
3 6 no inactive
2 8 no current
sql>
sql>
sql>
sql>
sql> alter database clear logfile group 1;
database altered.
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 0 no unused
3 6 no inactive
2 8 no current
启动数据库到open状态
sql>
sql> alter database open;
database altered.
sql>
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 0 no unused
2 8 no current
3 6 no inactive
此时我们再次查看文件列表,结果如下。
[oracle@hoegh hoegh]$ ls
control01.ctl control02.ctl example01.dbf redo01.log redo02.log redo03.log sysaux01.dbf system01.dbf temp01.dbf undotbs01.dbf users01.dbf
[oracle@hoegh hoegh]$
5.手动切换重做日志文件组
为了确保新建的日志文件组可用,我们可以手动切换日志文件组,改变新建日志文件组的状态(由unused改为其他)。
sql>
sql> alter system switch logfile;
system altered.
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 9 no current
2 8 no active
3 6 no inactive
sql> alter system switch logfile;
system altered.
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 9 no active
2 8 no active
3 10 no current
sql> alter system switch logfile;
system altered.
sql> select group#,sequence#,archived,status from v$log;
group# sequence# arc status
---------- ---------- --- ----------------
1 9 no inactive
2 11 no current
3 10 no inactive
sql>
其中,
current:表示该日志组为当前日志组,oracle正在使用该日志组;
active:当current redo组发生日志切换时,状态会改变为active,在这个状态下,如果数据库为归档模式,archive进程会归档active日志组;如果发生数据库crash,该日志组也是实例恢复必需的日志组;
inactive:当active日志组归档完毕并且oracle判断不需要进行实例恢复时,会将其状态修改为inactive,等待下一轮的使用;所以当日志组为inactive的时候,如果数据库为归档模式.那么日志肯定是归档完成了。
本文永久更新链接地址: