疑难问题|Oracle undo表空间损坏的修复

朋友的数据库坏了,非归档模式,没有备份,要我帮忙修复一下。

环境:
windows 2003
oracle 9.2.0.1
noarchivelog

故障行为:
数据库运行时,直接拔电导致无法启动。

我把他发来的数据库文件在本地建了个库,然后启动,检查故障信息。

C:/>sqlplus /nolog

SQL*Plus: Release 9.2.0.1.0 - Production on Tue Feb 14 13:49:00 2006

Copyright (c) 1982, 2002, Oracle Corporation.All rights reserved.

SQL> connect sys/manager as sysdba
Connected.
SQL> startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORA-01081: cannot start already-running ORACLE - shut it down first
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.
ORA-03113: end-of-file on communication channel

SQL>

查看alert log:

Beginning crash recovery of 1 threads
Tue Feb 14 13:50:53 2006
Started recovery at
Thread 1: logseq 368, block 1462, scn 0.0
Recovery of Online Redo Log: Thread 1 Group 1 Seq 368 Reading mem 0
Mem# 0 errs 0: D:/ORACLE92/ORADATA/TEST/REDO01.LOG
***
Corrupt block relative dba: 0x0080000e (file 2, block 14)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080000e
last change scn: 0x0000.0646b03b seq: 0x3 flg: 0x04
consistency value in tail: 0xec0b0203
check value in block header: 0x2790, computed block checksum: 0x7ca0
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080000e (file 2, block 14) found same corrupted data
***
Corrupt block relative dba: 0x0080078e (file 2, block 1934)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080078e
last change scn: 0x0000.064a39c9 seq: 0x1 flg: 0x04
consistency value in tail: 0xac2e0201
check value in block header: 0x23b8, computed block checksum: 0xf3e9
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080078e (file 2, block 1934) found same corrupted data
***
Corrupt block relative dba: 0x008005ee (file 2, block 1518)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x008005ee
last change scn: 0x0000.064a43e1 seq: 0x3 flg: 0x04
consistency value in tail: 0x49c30201
check value in block header: 0x96f7, computed block checksum: 0x1bab
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x008005ee (file 2, block 1518) found same corrupted data
***
Corrupt block relative dba: 0x0080056e (file 2, block 1390)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080056e
last change scn: 0x0000.064a3e68 seq: 0x1 flg: 0x04
consistency value in tail: 0x4c190203
check value in block header: 0x4470, computed block checksum: 0x6a36
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080056e (file 2, block 1390) found same corrupted data
***
Corrupt block relative dba: 0x0080066e (file 2, block 1646)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080066e
last change scn: 0x0000.064a4549 seq: 0x1 flg: 0x04
consistency value in tail: 0x4c2b0201
check value in block header: 0x8a18, computed block checksum: 0x195d
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080066e (file 2, block 1646) found same corrupted data
***
Corrupt block relative dba: 0x0080047e (file 2, block 1150)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080047e
last change scn: 0x0000.064a58ff seq: 0x1 flg: 0x04
consistency value in tail: 0x90e50201
check value in block header: 0xd69c, computed block checksum: 0x4bbd
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080047e (file 2, block 1150) found same corrupted data
***
Corrupt block relative dba: 0x008003fe (file 2, block 1022)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x008003fe
last change scn: 0x0000.064a56b3 seq: 0x1 flg: 0x04
consistency value in tail: 0x8ff30203
check value in block header: 0x9d2b, computed block checksum: 0x7280
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x008003fe (file 2, block 1022) found same corrupted data
***
Corrupt block relative dba: 0x0080027e (file 2, block 638)
Fractured block found during media/instance recovery
Data in bad block -
type: 2 format: 2 rdba: 0x0080027e
last change scn: 0x0000.064a5231 seq: 0x3 flg: 0x04
consistency value in tail: 0x90f90201
check value in block header: 0x2282, computed block checksum: 0x7a6c
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0080027e (file 2, block 638) found same corrupted data
Tue Feb 14 13:50:57 2006
Ended recovery at
Thread 1: logseq 368, block 55848, scn 0.105557616
817 data blocks read, 122 data blocks written, 54386 redo blocks read
Crash recovery completed successfully
Tue Feb 14 13:50:58 2006
Thread 1 advanced to log sequence 369
Thread 1 opened at log sequence 369
Current log# 2 seq# 369 mem# 0: D:/ORACLE92/ORADATA/TEST/REDO02.LOG
Successful open of redo thread 1.
Tue Feb 14 13:50:59 2006
SMON: enabling cache recovery
Tue Feb 14 13:51:00 2006
Errors in file d:/oracle92/admin/test/udump/db01gen_ora_1888.trc:
ORA-00600: internal error code, arguments: [2662], [0], [105557623], [0], [105590063], [8388633], [], []

Tue Feb 14 13:51:01 2006
Errors in file d:/oracle92/admin/test/udump/db01gen_ora_1888.trc:
ORA-00600: internal error code, arguments: [2662], [0], [105557623], [0], [105590063], [8388633], [], []

Tue Feb 14 13:51:01 2006
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 1888
ORA-1092 signalled during: ALTER DATABASE OPEN...
Tue Feb 14 13:56:02 2006
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 1888

可知是文件2发生错误,进而导致600错误。
因此,先查看文件2的名字,如下:

SQL> connect sys/manager as sysdba
Connected to an idle instance.
SQL> startup mount pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.

SQL> select file#,status,name from v$datafile;

FILE# STATUS
---------- -------
NAME
--------------------------------------------------------------------------------
1 SYSTEM
D:/ORACLE92/ORADATA/TEST/SYSTEM01.DBF

2 ONLINE
D:/ORACLE92/ORADATA/TEST/UNDOTBS01.DBF

3 ONLINE
D:/ORACLE92/ORADATA/TEST/CWMLITE01.DBF


FILE# STATUS
---------- -------
NAME
--------------------------------------------------------------------------------
4 ONLINE
D:/ORACLE92/ORADATA/TEST/DRSYS01.DBF

5 ONLINE
D:/ORACLE92/ORADATA/TEST/EXAMPLE01.DBF

6 ONLINE
D:/ORACLE92/ORADATA/TEST/INDX01.DBF


FILE# STATUS
---------- -------
NAME
--------------------------------------------------------------------------------
7 ONLINE
D:/ORACLE92/ORADATA/TEST/ODM01.DBF

8 ONLINE
D:/ORACLE92/ORADATA/TEST/TOOLS01.DBF

9 ONLINE
D:/ORACLE92/ORADATA/TEST/USERS01.DBF


FILE# STATUS
---------- -------
NAME
--------------------------------------------------------------------------------
10 ONLINE
D:/ORACLE92/ORADATA/TEST/XDB01.DBF

11 ONLINE
D:/ORACLE92/ORADATA/TEST/PMS.ORA

12 ONLINE
D:/ORACLE92/ORADATA/TEST/FYBX.ORA


12 rows selected.

可以看到,损坏的文件2是undotbs01.dbf,
查看资料,undotbs损坏或丢失时可以采用隐含参数临时启动数据库,然后进行修复。

修改init文件,加入
*._allow_resetlogs_corruption=true
(注:允许在数据库文件SCN不一致的情况下启动数据库)
*._corrupted_rollback_segments=(_SYSSMU1$,_SYSSMU2$,_SYSSMU3$,_SYSSMU4$,_SYSSMU5$,_SYSSMU6$,_SYSSMU7$,_SYSSMU8$,_SYSSMU9$,_SYSSMU10$)
(注:允许在rollback segments损坏的情况下启动数据库)

SQL> shutdown abort
ORACLE instance shut down.
SQL>startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.
Database opened.

启动成功,查看下当前的rollback segments
SQL> select segment_name,status from dba_rollback_segs;

SEGMENT_NAMESTATUS
------------------------------ ----------------
SYSTEMONLINE
_SYSSMU1$NEEDS RECOVERY
_SYSSMU2$NEEDS RECOVERY
_SYSSMU3$NEEDS RECOVERY
_SYSSMU4$NEEDS RECOVERY
_SYSSMU5$NEEDS RECOVERY
_SYSSMU6$NEEDS RECOVERY
_SYSSMU7$NEEDS RECOVERY
_SYSSMU8$NEEDS RECOVERY
_SYSSMU9$NEEDS RECOVERY
_SYSSMU10$NEEDS RECOVERY

SEGMENT_NAMESTATUS
------------------------------ ----------------
_SYSSMU11$ONLINE

12 rows selected.

新建一重做表空间undo
SQL> create undo tablespace undo datafile 'D:/oracle92/oradata/test/undo01.dbf' size 50M reuse autoe
xtend on;

Tablespace created.

SQL> select segment_name,status from dba_rollback_segs;

SEGMENT_NAMESTATUS
------------------------------ ----------------
SYSTEMONLINE
_SYSSMU1$OFFLINE
_SYSSMU2$NEEDS RECOVERY
_SYSSMU3$NEEDS RECOVERY
_SYSSMU4$NEEDS RECOVERY
_SYSSMU5$NEEDS RECOVERY
_SYSSMU6$NEEDS RECOVERY
_SYSSMU7$NEEDS RECOVERY
_SYSSMU8$NEEDS RECOVERY
_SYSSMU9$NEEDS RECOVERY
_SYSSMU10$NEEDS RECOVERY

SEGMENT_NAMESTATUS
------------------------------ ----------------
_SYSSMU11$ONLINE
_SYSSMU12$OFFLINE
_SYSSMU13$OFFLINE
_SYSSMU14$OFFLINE
_SYSSMU15$OFFLINE
_SYSSMU16$OFFLINE
_SYSSMU17$OFFLINE
_SYSSMU18$OFFLINE
_SYSSMU19$OFFLINE
_SYSSMU20$OFFLINE
_SYSSMU21$OFFLINE

22 rows selected.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

修改init文件
*.undo_tablespace=undo

SQL>startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.
Database opened.

删除损坏的undotbs1表空间:
SQL> alter tablespace undotbs1 offline normal;

Tablespace altered.

SQL> drop tablespace UNDOTBS1 including contents and datafiles;

Tablespace dropped.

SQL>select * from v$recover_file;

no rows selected

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

修改init文件,注释参数
#*._allow_resetlogs_corruption=true
#*._corrupted_rollback_segments=(_SYSSMU1$,_SYSSMU2$,_SYSSMU3$,_SYSSMU4$,_SYSSMU5$,_SYSSMU6$,_SYSSMU7$,_SYSSMU8$,_SYSSMU9$,_SYSSMU10$)

SQL>startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.
Database opened.
SQL>
至此数据库已经成功修复。

**********************************************************************************
需要提醒的是,在删除损坏的重做表空间时,一定要先offline,
否则注释掉隐含参数后就会出现下面的情况。

SQL>startup pfile='d:/oracle92/ora92/database/inittest.ora'
ORACLE instance started.

Total System Global Area93395628 bytes
Fixed Size453292 bytes
Variable Size75497472 bytes
Database Buffers16777216 bytes
Redo Buffers667648 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced

查看alert log,本例中会发现下面的信息,oracle标记刚才删除的
重做表空间需要恢复,这时就无法去掉隐含参数了。
......
drop tablespace UNDOTBS1 including contents and datafiles
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
......

    推荐阅读