11g的数据库因为BUG,导致与AWR相关的分区表不会自动split,这样一来,oracle自动清理策略不会生效,导致AWR数据越来越大,撑满SYSAUX表空间。
12c不存在这个问题。
检查快照实际保留时间与设定值的差异
14:51:28 SQL> col ash for a30
14:51:41 SQL> col SNAP for a30
14:51:46 SQL> col RETENTION for a30
14:51:51 SQL> select sysdate - a.sample_time ash,
14:51:53 2 sysdate - s.begin_interval_time snap,
14:51:53 3 c.RETENTION
14:51:53 4 from sys.wrm$_wr_control c,
14:51:53 5 (
14:51:53 6 select db.dbid,
14:51:53 7 min(w.sample_time) sample_time
14:51:53 8 from sys.v_$database db,
14:51:54 9 sys.Wrh$_active_session_history w
14:51:54 10 where w.dbid = db.dbid group by db.dbid
14:51:54 11 ) a,
14:51:54 12 (
14:51:54 13 select db.dbid,
14:51:54 14 min(r.begin_interval_time) begin_interval_time
14:51:54 15 from sys.v_$database db,
14:51:54 16 sys.wrm$_snapshot r
14:51:54 17 where r.dbid = db.dbid
14:51:54 18 group by db.dbid
14:51:54 19 ) s
14:51:54 20 where a.dbid = s.dbid
14:51:54 21 and c.dbid = a.dbid;
ASH SNAP RETENTION
------------------------------ ------------------------------ ------------------------------
+000000851 21:45:18.673 +000000008 17:01:35.550 +00008 00:00:00.0
1 row selected.
ASH保留了851天,但是设定值是8天。
SYSAUX表空间的使用情况
15:04:31 SQL> COL Item FOR A30
15:05:18 SQL> COL Schema FOR A30
15:05:49 SQL> SELECT occupant_name "Item",
15:05:50 2 space_usage_kbytes / 1048576 "Space Used (GB)",
15:05:50 3 schema_name "Schema",
15:05:50 4 move_procedure "Move Procedure"
15:05:51 5 FROM v$sysaux_occupants
15:05:51 6 ORDER BY 2 desc ;
Item Space Used (GB) Schema Move Procedure
------------------------------ --------------- ------------------------------ ----------------------------------------------------------------
SM/AWR 10.2494507 SYS
SM/OPTSTAT .566101074 SYS
XDB .123962402 XDB XDB.DBMS_XDB.MOVEXDB_TABLESPACE
LOGMNR .073669434 SYSTEM SYS.DBMS_LOGMNR_D.SET_TABLESPACE
SDO .072509766 MDSYS MDSYS.MOVE_SDO
SM/ADVISOR .067810059 SYS
EM .047058105 SYSMAN emd_maintenance.move_em_tblspc
AO .03729248 SYS DBMS_AW.MOVE_AWMETA
XSOQHIST .03729248 SYS DBMS_XSOQ.OlapiMoveProc
JOB_SCHEDULER .02557373 SYS
ORDIM/ORDDATA .013244629 ORDDATA ordsys.ord_admin.move_ordim_tblspc
SM/OTHER .00769043 SYS
XSAMD .005004883 OLAPSYS DBMS_AMD.Move_OLAP_Catalog
SMON_SCN_TIME .004150391 SYS
EXPRESSION_FILTER .003540039 EXFSYS
TEXT .003540039 CTXSYS DRI_MOVE_CTXSYS
WM .002380371 WMSYS DBMS_WM.move_proc
SQL_MANAGEMENT_BASE .001647949 SYS
PL/SCOPE .001525879 SYS
STREAMS .001342773 SYS
LOGSTDBY .001342773 SYSTEM SYS.DBMS_LOGSTDBY.SET_TABLESPACE
EM_MONITORING_USER .000854492 DBSNMP
ORDIM .000427246 ORDSYS ordsys.ord_admin.move_ordim_tblspc
AUTO_TASK .000305176 SYS
ORDIM/SI_INFORMTN_SCHEMA 0 SI_INFORMTN_SCHEMA ordsys.ord_admin.move_ordim_tblspc
TSM 0 TSMSYS
AUDIT_TABLES 0 SYS DBMS_AUDIT_MGMT.move_dbaudit_tables
ULTRASEARCH_DEMO_USER 0 WK_TEST MOVE_WK
ULTRASEARCH 0 WKSYS MOVE_WK
ORDIM/ORDPLUGINS 0 ORDPLUGINS ordsys.ord_admin.move_ordim_tblspc
STATSPACK 0 PERFSTAT
31 rows selected.
Elapsed: 00:00:00.84
SM/AWR的含义是Server Manageability - Automatic Workload Repository,如果它的值大那么表示AWR信息容量大。
SM/OPTSTAT的含义是Server Manageability - Optimizer Statistics History,如果它的值大那么表示优化器统计信息容量大。
查询具体哪些表占用量大
15:06:00 SQL> select * from (
15:06:41 2 select segment_name,SEGMENT_TYPE,sum(bytes)/1024/1024 total_mb from dba_segments where tablespace_name =
15:06:41 3 'SYSAUX' group by segment_name,SEGMENT_TYPE order by 3 desc)
15:06:41 4 where rownum <=20;
SEGMENT_NAME SEGMENT_TYPE TOTAL_MB
--------------------------------------------------------------------------------- ------------------ ----------
WRH$_ACTIVE_SESSION_HISTORY TABLE PARTITION 2951.0625
WRH$_EVENT_HISTOGRAM_PK INDEX PARTITION 768.0625
WRH$_EVENT_HISTOGRAM TABLE PARTITION 736.0625
WRH$_LATCH TABLE PARTITION 640.0625
WRH$_SQLSTAT TABLE PARTITION 528.0625
WRH$_SYSSTAT_PK INDEX PARTITION 432.0625
WRH$_SYSSTAT TABLE PARTITION 408.0625
WRH$_PARAMETER_PK INDEX PARTITION 384.0625
WRH$_LATCH_PK INDEX PARTITION 376.0625
WRH$_LATCH_MISSES_SUMMARY TABLE PARTITION 336.0625
WRH$_PARAMETER TABLE PARTITION 320.0625
WRH$_LATCH_MISSES_SUMMARY_PK INDEX PARTITION 320.0625
WRH$_ACTIVE_SESSION_HISTORY_PK INDEX PARTITION 304.0625
WRH$_SEG_STAT TABLE PARTITION 246.0625
WRH$_SERVICE_STAT_PK INDEX PARTITION 168.0625
WRH$_SYSTEM_EVENT TABLE PARTITION 168.0625
WRH$_SQLSTAT_PK INDEX PARTITION 160.0625
SYS_LOB0000006409C00004$$ LOBSEGMENT 144
I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST INDEX 136
WRH$_SEG_STAT_PK INDEX PARTITION 112.0625
20 rows selected.
Elapsed: 00:00:00.09
查询快照信息:
15:06:41 SQL> SELECT MIN(SNAP_ID),MAX(SNAP_ID) FROM DBA_HIST_SNAPSHOT;
MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
20147 20354
1 row selected.
Elapsed: 00:00:00.00
15:07:22 SQL> SELECT MIN(SNAP_ID),MAX(SNAP_ID) FROM DBA_HIST_ACTIVE_SESS_HISTORY;
MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
1 20354
1 row selected.
Elapsed: 00:00:03.83
可以用下面的方法清理指定的快照数据。但是一般情况下,表空间告警的时候,WRH$_ACTIVE_SESSION_HISTORY已经很大了,下面的方法其实就是delete,奇慢无比。
EXEC dbms_workload_repository.drop_snapshot_range(low_snap_id=>1, high_snap_id=>3);
上面的存储过程最主要的其实是下面的语句,可以通过v$session和v$sql抓取出来。
delete from WRH$_ACTIVE_SESSION_HISTORY tab where (:beg_snap <=
tab.snap_id and tab.snap_id <= :end_snap and dbi
d = :dbid) and not exists (select 1 from WRM$_BASELINE b
where (tab.dbid = b.dbid) and
(tab.snap_id >= b.start_snap_id) and
(tab.snap_id <= b.end_snap_id))
下面的命令可以展示每个分区最大最小的snap id
15:29:35 SQL> set serveroutput on
15:29:43 SQL> declare
15:29:43 2 CURSOR cur_part IS
15:29:43 3 SELECT partition_name from dba_tab_partitions
15:29:43 4 WHERE table_name = 'WRH$_ACTIVE_SESSION_HISTORY';
15:29:43 5
15:29:43 6 query1 varchar2(200);
15:29:43 7 query2 varchar2(200);
15:29:43 8
15:29:43 9 TYPE partrec IS RECORD (snapid number, dbid number);
15:29:43 10 TYPE partlist IS TABLE OF partrec;
15:29:43 11
15:29:43 12 Outlist partlist;
15:29:43 13 begin
15:29:43 14 dbms_output.put_line('PARTITION NAME SNAP_ID DBID');
15:29:43 15 dbms_output.put_line('--------------------------- ------- ----------');
15:29:43 16
15:29:43 17 for part in cur_part loop
15:29:43 18 query1 := 'select min(snap_id), dbid from sys.WRH$_ACTIVE_SESSION_HISTORY partition ('||part.partition_name||') group by dbid';
15:29:43 19 execute immediate query1 bulk collect into OutList;
15:29:44 20
15:29:44 21 if OutList.count > 0 then
15:29:44 22 for i in OutList.first..OutList.last loop
15:29:44 23 dbms_output.put_line(part.partition_name||' Min '||OutList(i).snapid||' '||OutList(i).dbid);
15:29:44 24 end loop;
15:29:44 25 end if;
15:29:44 26
15:29:44 27 query2 := 'select max(snap_id), dbid from sys.WRH$_ACTIVE_SESSION_HISTORY partition ('||part.partition_name||') group by dbid';
15:29:44 28 execute immediate query2 bulk collect into OutList;
15:29:44 29
15:29:44 30 if OutList.count > 0 then
15:29:44 31 for i in OutList.first..OutList.last loop
15:29:44 32 dbms_output.put_line(part.partition_name||' Max '||OutList(i).snapid||' '||OutList(i).dbid);
15:29:44 33 dbms_output.put_line('---');
15:29:44 34 end loop;
15:29:44 35 end if;
15:29:44 36
15:29:44 37 end loop;
15:29:44 38 end;
15:29:44 39 /
PARTITION NAME SNAP_ID DBID
--------------------------- ------- ----------
WRH$_ACTIVE_3245728642_0 Min 1 3245728642
WRH$_ACTIVE_3245728642_0 Max 20354 3245728642
PL/SQL procedure successfully completed.
Elapsed: 00:00:09.44
下面展示一种清理方法。总体来说,就是保留最近的数据,将表清空,再将数据插回去。
除此之外,前面提到过此问题最关键的问题是因为分区不能自动split,下面的命令可以将所有与AWR有关的分区表全部split出一个新的分区出来。
alter session set "_swrf_test_action" = 72;
这样就可以自动等待上一个分区数据过期,自动清理,而且还会清理所有的相关分区表。下面我展示的方法,只会清理最大的那张表,其他表中的过期数据还是存在,但是占用量不大,所以也可以接受。
1. 创建临时表,保存近8天的数据
create table mingshuo.ash_bak_20190610 as select * from sys.WRH$_ACTIVE_SESSION_HISTORY where SAMPLE_TIME>=sysdate-9;
SQL> select count(*) from mingshuo.ash_bak_20190610;
COUNT(*)
----------
12648
1 row selected.
Elapsed: 00:00:00.01
2. 禁用AWR
exec dbms_workload_repository.modify_snapshot_settings(interval => 0);
3. truncate WRH$_ACTIVE_SESSION_HISTORY
TRUNCATE TABLE sys.WRH$_ACTIVE_SESSION_HISTORY;
4. truncate 完成后,重建 WRH$_ACTIVE_SESSION_HISTORY表的索引,并从备份表恢复数据
insert into sys.WRH$_ACTIVE_SESSION_HISTORY select * from mingshuo.ash_bak_20190610;
commit;
5. 查看索引状态
set line 300 pages 200
col owner for a20
col index_name for a30
select index_owner,
index_name,
partition_name,
status,
tablespace_name,
last_analyzed
from dba_ind_partitions
where index_name in (select index_name
from dba_indexes
where table_name in ('WRH$_ACTIVE_SESSION_HISTORY')
and table_owner = 'SYS');
6. 如果索引失效重建索引(测试的时候是没有失效的,索引为本地索引)
alter index sys.WRH$_ACTIVE_SESSION_HISTORY_PK rebuild parallel 8 nologging;
alter index sys.WRH$_ACTIVE_SESSION_HISTORY_PK noparallel;
7. 启动AWR
exec dbms_workload_repository.modify_snapshot_settings(interval => 60);
8. 测试awr和ash可能正常生成
---手动生成snapshot
exec dbms_workload_repository.create_snapshot;
@?/rdbms/admin/awrrpt
@?/rdbms/admin/ashrpt
9. 删除备份表
drop table mingshuo.ash_bak_20190610 purge;