一次ORA-00600[13011]处理过程

一次掉电后,DG主库启动alert中报如下错误

1
2
3
4
5
6
7
8
9
Errors in file /home/oracle/app/oracle/diag/rdbms/min/min/trace/min_m002_4099.trc  (incident=16328):
ORA-00600: internal error code, arguments: [13011], [6443], [8463760], [14], [8488694], [0], [], [], [], [], [], []
Incident details in: /home/oracle/app/oracle/diag/rdbms/min/min/incident/incdir_16328/min_m002_4099_i16328.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/oracle/app/oracle/diag/rdbms/min/min/trace/min_m002_4099.trc (incident=16329):
ORA-00600: internal error code, arguments: [kewrose_1], [600], [ORA-00600: internal error code, arguments: [13011], [6443], [8463760], [14], [8488694], [0], [], [], [], [], [], []
], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/app/oracle/diag/rdbms/min/min/incident/incdir_16329/min_m002_4099_i16329.trc

查看trace文件,部分内容如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

*** 2016-07-20 18:21:04.185
*** SESSION ID:(169.43) 2016-07-20 18:21:04.185
*** CLIENT ID:() 2016-07-20 18:21:04.185
*** SERVICE NAME:(SYS$BACKGROUND) 2016-07-20 18:21:04.185
*** MODULE NAME:(MMON_SLAVE) 2016-07-20 18:21:04.185
*** ACTION NAME:(Auto-Purge Slave Action) 2016-07-20 18:21:04.185

Dump continued from file: /home/oracle/app/oracle/diag/rdbms/min/min/trace/min_m000_2615.trc
ORA-00600: internal error code, arguments: [13011], [6665], [8395075], [14], [8462787], [0], [], [], [], [], [], []

========= Dump for incident 14804 (ORA 600 [13011]) ========

*** 2016-07-20 18:21:04.186
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=c8taax4bfzsjc) -----
delete from WRH$_RSRC_PLAN tab where (:beg_snap <= tab.snap_id and tab.snap_id <= :end_snap and dbid = :dbid) and not exists (select 1 from WRM$_BASELINE b wh
ere (tab.dbid = b.dbid) and (tab.snap_id >= b.start_snap_id) and (tab.snap_id <= b.end_snap_id))

MOS中关于ORA-600 [13013]描述

1
2
3
4
5
6
7
Format: ORA-600 [13013] [a] [b] {c} [d] [e] [f]
Arg [a] Passcount
Arg [b] Data Object number
Arg {c} Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated
Arg [d] Row Slot number
Arg [e] Decimal RDBA of block being updated (Typically same as {c})
Arg [f] Code

确定object

1
2
3
4
5
SQL> select owner,object_name,object_type  from dba_objects where object_id=6665;

OWNER OBJECT_NAME OBJECT_TYPE
------------------------------ ------------------------------------------------------------ -------------------
SYS WRH$_RSRC_PLAN TABLE

可见与trc文件中的sql中对象名吻合,都是WRH$_RSRC_PLAN
查看trc文件中的PLAN TABLE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
----- Plan Table -----

============
Plan Table
============
-----------------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------------+-----------------------------------+
| 0 | DELETE STATEMENT | | | | 626 | |
| 1 | DELETE | WRH$_RSRC_PLAN | | | | |
| 2 | FILTER | | | | | |
| 3 | INDEX RANGE SCAN | WRH$_RSRC_PLAN_PK| 94 | 1598 | 8 | 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | WRM$_BASELINE | 1 | 33 | 2 | 00:00:01 |
| 5 | INDEX RANGE SCAN | WRM$_BASELINE_PK | 1 | | 1 | 00:00:01 |
-----------------------------------------------------------+-----------------------------------+

1.The most common cause of this error is an index corruption. The first step is to check the indexes for corruption, i.e. run

1
2
3
SQL> ANALYZE INDEX WRH$_RSRC_PLAN_PK VALIDATE STRUCTURE;

Index analyzed.

2.If the indexes do not report corruption, further test for corruption the base tables referenced in the execution plan or the statement producing the error:

1
2
3
4
5
SQL> ANALYZE TABLE WRH$_RSRC_PLAN VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE WRH$_RSRC_PLAN VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

查看生成的trace文件内容

1
2
3
4
5
6
7
8
9
10
*** 2016-07-21 01:31:31.840
*** SESSION ID:(52.5) 2016-07-21 01:31:31.840
*** CLIENT ID:() 2016-07-21 01:31:31.840
*** SERVICE NAME:(SYS$USERS) 2016-07-21 01:31:31.840
*** MODULE NAME:(sqlplus@primary (TNS V1-V3)) 2016-07-21 01:31:31.840
*** ACTION NAME:() 2016-07-21 01:31:31.840

Table/Index row count mismatch
table 2901 : index 3203, 296
Index root = tsn: 1 rdba: 0x0080194a

对于文件内容,简单介绍如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
trace文件中包含:

<description>: tsn: <tablespace number> rdba: <relative dba>

description有以下值:

"row not found in index"
"Table/Index row count mismatch"
"row mismatch in index dba"
"Table row count/Bitmap index bit count mismatch"
"kdavls: kdcchk returns %d when checking cluster dba 0x%08lx objn %d\n"

tsn: Tablespace Number表示的是索引存储的表空间编号。
rdba: 是索引段头相对于数据块的存储地址。

运行以下sql查询索引的文件和块号

1
2
3
4
5
6
7
8
9
SELECT dbms_utility.data_block_address_file(
to_number(trim(leading '0' from
replace('&&rdba','0x','')),'XXXXXXXX')
) AS rfile#,
dbms_utility.data_block_address_block(
to_number(trim(leading '0' from
replace('&&rdba','0x','')),'XXXXXXXX')
) AS block#
FROM dual;

结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SQL> SELECT dbms_utility.data_block_address_file(
2 to_number(trim(leading '0' from
3 replace('&&rdba','0x','')),'XXXXXXXX')
4 ) AS rfile#,
5 dbms_utility.data_block_address_block(
6 to_number(trim(leading '0' from
7 replace('&&rdba','0x','')),'XXXXXXXX')
8 ) AS block#
9 FROM dual;
Enter value for rdba: 0x0080194a
old 3: replace('&&rdba','0x','')),'XXXXXXXX')
new 3: replace('0x0080194a','0x','')),'XXXXXXXX')
old 7: replace('&&rdba','0x','')),'XXXXXXXX')
new 7: replace('0x0080194a','0x','')),'XXXXXXXX')

RFILE# BLOCK#
---------- ----------
2 6474

接下来运行如下查询,定位具体的segment

1
2
3
4
select owner, segment_name, segment_type
from dba_segments
where header_file = <rfile#>
and header_block = <block#>

结果如下:

1
2
3
4
5
6
7
SQL> select owner, segment_name, segment_type
2 from dba_segments
3 where header_file = 2
4 and header_block = 6474;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------------------ --------------------------------------------------------------------------------- ------------------
SYS WRH$_RSRC_PLAN_PK INDEX

看来仍然是上面分析过的索引,这里是因为表和索引的数据不一致,那么重建索引

1
2
3
4
alter table WRH$_RSRC_PLAN drop constraint WRH$_RSRC_PLAN_PK cascade;

alter table WRH$_RSRC_PLAN
add constraint WRH$_RSRC_PLAN_PK primary key (DBID, SNAP_ID, INSTANCE_NUMBER, SEQUENCE#);

再次分析表,发现错误消失了

1
2
3
SQL> ANALYZE TABLE WRH$_RSRC_PLAN VALIDATE STRUCTURE CASCADE;

Table analyzed.

至此,问题解决

掉电仍是数据库的大敌,而此次并非只在这一张表上有讹误,还有其他一些WRH$表,这些表均属于sysaux表空间,这些表都是awr系列表,写入频繁,掉电容易引起数据的不一致性,但是,正因为是awr数据,并不是特别重要,修复起来也相对容易。

记录一下AWR Performance Tables介绍

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
The Oracle10g dynamic performance tables constitute the foundation of sophisticated automations such as Automatic Memory Management  (AMM ) as well as intelligent advisory tools such as ADDM and the SQL Tuning   Advisor.

Remember, the AWR is a core feature of the 10g database kernel and automatically collects and stores important run-time performance information for our historical analysis.

The tables that store this information are prefixed with wrh$ and are very similar in function to the STATSPACK tables. This could make STATSPACK appear somewhat obsolete, although it is still available in the $ORACLE_HOME/rdbms/admin directory.

Unlike the more cumbersome STATSPACK utility, which requires knowledge of the table structure and creation of complex query scripts, the 10g Enterprise Manager (OEM) automatically displays and interprets this valuable time-series performance data.

The wrh$ AWR tables store important historical statistical information about the database in the form of periodic snapshots. Each snapshot is a capture of the in–memory x$ fixed view and other control structures at a certain point in time. Each of the AWR table names is prefixed with wrm$ (Metadata tables), wrh$ (History tables), or wri$ (Advisory tables).

1. The wrm$ tables store metadata information for the Workload Repository.

2. The wrh$ tables store historical data or snapshots.

3. The wri$ tables: These 49 tables store data related to advisory functions.