ORA-00600: internal error code, arguments: [15709]

A customer's 10.2.0.4 database had one instance crash suddenly, and the customer asked us to help analyze the cause. For a sudden crash like this, the first thing we check is the database alert log. It shows that just before the crash the SMON process raised ORA-00600 [15709], and immediately afterward the database logged the message "fatal internal error happened while smon was doing active transaction recovery." In other words, SMON hit an exception while recovering active transactions, which ultimately brought the instance down. The log output is shown below:
fri sep 26 10:53:35 2014
errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ora-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ora-30319: message 30319 not found; product=rdbms; facility=ora
fri sep 26 10:53:55 2014
fatal internal error happened while smon was doing active transaction recovery.
fri sep 26 10:53:55 2014
errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ora-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ora-30319: message 30319 not found; product=rdbms; facility=ora
smon: terminating instance due to error 474
termination issued to instance processes. waiting for the processes to exit
fri sep 26 10:54:05 2014
instance termination failed to kill one or more processes
instance terminated by smon, pid = 28997
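When an alert log is long, it can help to pull out the ORA-00600 arguments and the process that terminated the instance automatically. The following is a minimal sketch, assuming the log has one entry per line; the function name `summarize_alert_log` and the regular expressions are illustrative choices, not part of any Oracle tool.

```python
import re

# Abridged excerpt of the alert log shown above, one entry per line.
ALERT_LOG = """\
fri sep 26 10:53:35 2014
errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ora-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ora-30319: message 30319 not found; product=rdbms; facility=ora
fri sep 26 10:53:55 2014
fatal internal error happened while smon was doing active transaction recovery.
smon: terminating instance due to error 474
instance terminated by smon, pid = 28997
"""

# Hypothetical patterns written for this excerpt; adjust them for other log layouts.
ORA600_RE = re.compile(r"ora-00600: internal error code, arguments: (.*)", re.IGNORECASE)
TERMINATE_RE = re.compile(r"instance terminated by (\w+), pid = (\d+)", re.IGNORECASE)


def summarize_alert_log(text):
    """Return (distinct ORA-00600 argument lists, (process, pid) or None)."""
    ora600_args = []
    terminated_by = None
    for line in text.splitlines():
        match = ORA600_RE.search(line)
        if match:
            # Keep only the non-empty bracketed arguments.
            args = [a for a in re.findall(r"\[([^\]]*)\]", match.group(1)) if a]
            if args not in ora600_args:
                ora600_args.append(args)
        match = TERMINATE_RE.search(line)
        if match:
            terminated_by = (match.group(1), int(match.group(2)))
    return ora600_args, terminated_by


print(summarize_alert_log(ALERT_LOG))
```

The first ORA-00600 argument, here 15709, is the value to search for on Oracle Support.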
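Next we look at the wxyydb_smon_28997.trc file. It shows that the SMON process kept attempting parallel transaction recovery. During the recovery it hit the ORA-00600 error, and the exception in the underlying code eventually brought the database down.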
*** 2014-09-26 10:10:36.236
parallel transaction recovery caught error 30319
*** 2014-09-26 10:15:10.643
parallel transaction recovery caught exception 30319
*** 2014-09-26 10:15:21.816
parallel transaction recovery caught error 30319
*** 2014-09-26 10:19:51.707
parallel transaction recovery caught exception 30319
*** 2014-09-26 10:53:35.830
ksedmp: internal or fatal error
ora-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ora-30319: message 30319 not found; product=rdbms; facility=ora
----- call stack trace -----
calling location / call type / entry point / argument values in hex (? means dubious value)
ksedst()+64 call ksedst1() 000000000 ? 000000001 ?
ksedmp()+2176 call ksedst() 000000000 ? c000000000000c9f ? 4000000004057f40 ? 000000000 ? 000000000 ? 000000000 ?
ksfdmp()+48 call ksedmp() 000000003 ?
kgeriv()+336 call ksfdmp() c000000000000695 ? 000000003 ? 40000000095185e0 ? 00000ec33 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ?
kgeasi()+416 call kgeriv() 6000000000031770 ? 6000000000032828 ? 4000000001a504e0 ? 000000002 ? 9fffffffffffa138 ?
$cold_kxfpqsrls()+1168 call kgeasi() 6000000000031770 ? 9ffffffffd3d2290 ? 000003d5d ? 000000002 ? 000000002 ? 0000003e7 ? 000003d5d ? 9ffffffffd3d22a0 ?
kxfpqrsod()+1104 call $cold_kxfpqsrls() c0000004fdf7a838 ? c0000004fdf74430 ? 000000004 ? 9fffffffffffa200 ? c0000000000011ab ? 4000000003aa1250 ? 00000edf5 ? 000000001 ?
kxfpdelqrefs()+640 call kxfpqrsod() c0000004fdf74430 ? 000000001 ? 60000000000b6300 ? c000000000000694 ? 4000000003dd14f0 ? 00000ee2d ? 60000000000c6708 ?
kxfpqsod_qc_sod()+2016 call kxfpdelqrefs() 00000003e ? 000000001 ? 60000000000b6300 ? c000000000001028 ? 40000000025de5a0 ? 4000000001b1a110 ? 60000000000c2d04 ? 60000000000c2e90 ?
kxfpqsod()+816 call kxfpqsod_qc_sod() 000000010 ? 000000001 ? 9fffffffffffa260 ? 60000000000b6300 ? 9fffffffffffa7f0 ? c000000000001028 ? 40000000025df810 ? 00000ee65 ?
ktprdestroy()+208 call kxfpqsod() c0000004fdf7a838 ? 000000001 ? 9fffffffffffa810 ? 60000000000b6300 ? 9fffffffffffad90 ?
ktprbeg()+8272 call ktprdestroy() c000000000001026 ? 40000000025615b0 ? 000006e61 ? 000000000 ? 4000000001052e40 ? 000000000 ?
ktmmon()+10096 call ktprbeg() 9fffffffffffbe70 ? 9fffffffffffada0 ? 60000000000b6300 ? 40000000028b75a0 ? 00000ef21 ? 9fffffffffffadd8 ? 9fffffffffffade0 ?
ktmsmonmain()+64 call ktmmon() 9fffffffffffd140 ?
ksbrdp()+2816 call ktmsmonmain() c000000100e1ca60 ? c000000000000fa5 ? 000007361 ? 4000000003b5ae10 ? c000000000000205 ? 400000000409dcd0 ?
opirip()+1136 call ksbrdp() 9fffffffffffd150 ? 60000000000b6300 ? 9fffffffffffdc90 ? 4000000002863ef0 ? 000004861 ? c000000000000b1d ? 60000000000318f0 ?
$cold_opidrv()+1408 call opirip() 9fffffffffffea70 ? 000000004 ? 9ffffffffffff090 ? 9fffffffffffdca0 ? 60000000000b6300 ? c000000000000da1 ?
sou2o()+336 call $cold_opidrv() 000000032 ? 9ffffffffffff090 ? 60000000000c2c78 ?
$cold_opimai_real()+640 call sou2o() 9ffffffffffff0b0 ? 000000032 ? 000000004 ? 9ffffffffffff090 ?
main()+368 call $cold_opimai_real() 000000003 ? 000000000 ?
main_opd_entry()+80 call main() 000000003 ? 9ffffffffffff598 ? 60000000000b6300 ? c000000000000004 ?
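To compare a stack like this with the one in a support note, it is usually enough to reduce it to the list of called functions. The sketch below does that for a few frames abridged from the trace above; `called_functions` and the regular expression are hypothetical helpers written for this excerpt, and stripping the `$cold_` prefix is an assumption made so that names line up with how they are quoted in the text.

```python
import re

# A few frames abridged from the SMON trace above (arguments shortened).
TRACE = """\
kxfpqrsod()+1104 call $cold_kxfpqsrls() c0000004fdf7a838 ?
ktprdestroy()+208 call kxfpqsod() c0000004fdf7a838 ?
ktprbeg()+8272 call ktprdestroy() c000000000001026 ?
ktmmon()+10096 call ktprbeg() 9fffffffffffbe70 ?
ktmsmonmain()+64 call ktmmon() 9fffffffffffd140 ?
"""

# Matches the "call <function>()" part of each frame.
CALLEE_RE = re.compile(r"call\s+(\$?\w+)\(\)")


def called_functions(trace_text):
    """Return the called function names in stack order, without the $cold_ prefix."""
    names = []
    for name in CALLEE_RE.findall(trace_text):
        if name.startswith("$cold_"):
            name = name[len("$cold_"):]
        names.append(name)
    return names


frames = called_functions(TRACE)
print(frames)
print("kxfpqsrls" in frames)
```

Because the pattern only looks at the `call <function>()` part, it also works on a trace that has been flattened onto a single line.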
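Searching Oracle Support for ORA-00600 [15709], we found the note "SMON may fail with ORA-00600 [15709] errors crashing the instance" (Doc ID 736348.1), whose error messages closely match ours. The note lists the call stack for the error, which includes kxfpqsrls, and attributes it to bug 6954722. If you have already installed that patch and still see a similar problem, you have most likely hit another, similar bug, 9233544. Oracle really does have a lot of bugs.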
Bug 6954722 affects versions 9.2.0.8 and 10.2.0.4, and is fixed in 10.2.0.4.2, 10.2.0.5, 11.1.0.7 and 11.2.0.1. The ways to resolve bug 6954722 are:
1. Use the following workaround:
   set fast_start_parallel_rollback=false and recovery_parallelism=0
or
2. Apply the one-off patch, if available for your platform/version.
or
3. Upgrade to a fixed release: 10.2.0.5, 11.1.0.7 or 11.2.0.1.
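The workaround in option 1 could be turned into statements along the following lines. This is only a sketch: the note gives the parameter names and values, while the `SCOPE` choices are assumptions (an spfile-based instance, `fast_start_parallel_rollback` changeable online, `recovery_parallelism` static and therefore only taking effect after a restart). Check them against the reference for your release before running anything.

```python
# Parameter -> (value, scope). Values come from the workaround above;
# the scopes are assumptions for an spfile-based instance.
WORKAROUND = {
    "fast_start_parallel_rollback": ("FALSE", "BOTH"),  # assumed to be dynamic
    "recovery_parallelism": ("0", "SPFILE"),            # assumed static, needs a restart
}


def workaround_statements(params=WORKAROUND):
    """Render one ALTER SYSTEM statement per parameter."""
    return [
        f"ALTER SYSTEM SET {name}={value} SCOPE={scope};"
        for name, (value, scope) in params.items()
    ]


for statement in workaround_statements():
    print(statement)
```

Disabling parallel rollback makes SMON recover dead transactions serially, so recovery of large transactions may take longer while the workaround is in place.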
Bug 9233544 affects versions 10.2.0.4, 11.1.0.7 and 11.2.0.1, and is fixed in 11.2.0.3 and 12.1. The ways to resolve bug 9233544 are:
1. Apply patchset 11.2.0.3, in which bug 9233544 is fixed.
or
2. Check whether one-off patch 9233544 is available for your release and platform.
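We carefully checked the patches on the system and found that patch 6954722 was already installed, which indicates that bug 9233544 is the cause. The options are either to upgrade to 11.2.0.3 or to install the one-off patch 9233544. Upgrading to 11.2.0.3 is too big a change, so we discussed it with the customer and suggested installing the small patch to fix the problem.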
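The elimination reasoning above can be written down as a small lookup. This is a sketch that only encodes the affected versions quoted in this article, with the patch number 6954722 as given in the paragraph above; the table `AFFECTED` and the function `suspect_bugs` are illustrative names, and the list of installed patches is assumed to have been collected separately.

```python
# Affected versions as quoted in this article, keyed by bug number.
AFFECTED = {
    "6954722": {"9.2.0.8", "10.2.0.4"},
    "9233544": {"10.2.0.4", "11.1.0.7", "11.2.0.1"},
}


def suspect_bugs(version, installed_patches):
    """Return bugs that affect `version` and whose patch is not installed."""
    installed = {str(patch) for patch in installed_patches}
    return [
        bug
        for bug, versions in AFFECTED.items()
        if version in versions and bug not in installed
    ]


# The customer's case: 10.2.0.4 with patch 6954722 already applied.
print(suspect_bugs("10.2.0.4", ["6954722"]))
```

For the customer's system this leaves 9233544 as the only remaining candidate, matching the conclusion reached by hand.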
Original source: ORA-00600: internal error code, arguments: [15709]. Thanks to the original author for sharing.