Workaround for ORA-600/ORA-7445 with SQL Repair Advisor

From time to time we hit errors like ORA-600 or ORA-7445 that are triggered by specific SQL statements. For these cases, Oracle provides the SQL Repair Advisor. This is a short demonstration of using the SQL Repair Advisor to work around an ORA-7445 caused by a SQL statement.

SQL and Error:

SELECT
    EXTRACT(YEAR FROM date_new) jahr,
    EXTRACT(MONTH FROM date_new) monat,
    COUNT(DISTINCT TRUNC(date_new)) arbeitstage,
    SUM(CASE
          WHEN STRING1 IN (SELECT STRING1
                           FROM MD.MD1
                           WHERE STRING1_GROUP = 'CUST')
          THEN DECODE(FLAG, 'J', 1, 0)
          ELSE 0
        END) cust_flag
FROM
    SCHEMA.TAB1
WHERE
    date_new BETWEEN TO_DATE(201101, 'yyyymm')
                 AND ADD_MONTHS(TO_DATE(201107, 'yyyymm'), 1)
GROUP BY
    EXTRACT(YEAR FROM date_new),
    EXTRACT(MONTH FROM date_new)
ORDER BY 1, 2;
 
ERROR at line 6:
ORA-03113: end-of-file on communication channel
Process ID: 7746
Session ID: 737 Serial number: 9823
 
Tue Jul 26 15:19:41 2011
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xA] [PC:0x888A3C9, xtyqbcb()+413] ...
ORA-07445: exception encountered: core dump [xtyqbcb()+413] [SIGSEGV] [ADDR:0xA] [PC:0x888A3C9]...

Execution of SQL Repair Advisor:

DECLARE
  rep_out CLOB;
  t_id    VARCHAR2(50);
BEGIN
  t_id := DBMS_SQLDIAG.CREATE_DIAGNOSIS_TASK(
    sql_text => q'#SELECT
    EXTRACT(YEAR FROM date_new) jahr,
    EXTRACT(MONTH FROM date_new) monat,
    COUNT(DISTINCT TRUNC(date_new)) arbeitstage,
    SUM(CASE
          WHEN STRING1 IN (SELECT STRING1
                           FROM MD.MD1
                           WHERE STRING1_GROUP = 'CUST')
          THEN DECODE(FLAG, 'J', 1, 0)
          ELSE 0
        END) cust_flag
FROM
    SCHEMA.TAB1
WHERE
    date_new BETWEEN TO_DATE(201101, 'yyyymm')
                 AND ADD_MONTHS(TO_DATE(201107, 'yyyymm'), 1)
GROUP BY
    EXTRACT(YEAR FROM date_new),
    EXTRACT(MONTH FROM date_new)
ORDER BY 1,2#',
    task_name    => 'ORA7445-xtyqbcb',
    problem_type => DBMS_SQLDIAG.PROBLEM_TYPE_COMPILATION_ERROR
  );
  DBMS_SQLDIAG.EXECUTE_DIAGNOSIS_TASK(t_id);
  rep_out := DBMS_SQLDIAG.REPORT_DIAGNOSIS_TASK(t_id, DBMS_SQLDIAG.TYPE_TEXT);
  DBMS_OUTPUT.PUT_LINE('Report : ' || rep_out);
END;
/
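 
Note: in SQL*Plus, server output has to be enabled beforehand, otherwise the report is not displayed:
 
SET SERVEROUTPUT ON SIZE UNLIMITED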
 
Output:
 
Report : GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name   : ORA7445-xtyqbcb
Tuning Task Owner  : SYS
Workload Type      : Single SQL Statement
Scope              : COMPREHENSIVE
Time Limit(seconds): 1800
Completion Status  : COMPLETED
Started at         : 07/26/2011 15:46:38
Completed at       : 07/26/2011 15:46:38
 
-------------------------------------------------------------------------------
Schema Name: SYS
SQL ID     : 39ac71hjcn0rt
...
 
-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------
 
1- SQL Patch Finding (see explain plans section below)
------------------------------------------------------
  A potentially better execution plan was found for this statement.
 
  Recommendation
  --------------
  - Consider accepting the recommended SQL patch.
    EXECUTE dbms_sqldiag.accept_sql_patch(task_name =>
            'ORA7445-xtyqbcb', task_owner => 'SYS', replace => TRUE);
 
  Rationale
  ---------
    Recommended plan with hash value 1503213169 has number of rows 7, check
    sum 13925076401, execution time 264 and 1972 buffer gets
 
-------------------------------------------------------------------------------
 
 
 
PL/SQL procedure successfully completed.

Activation:

BEGIN
  dbms_sqldiag.accept_sql_patch(
    task_name  => 'ORA7445-xtyqbcb',
    task_owner => 'SYS',
    replace    => TRUE);
END;
/

Test:

SELECT
    EXTRACT(YEAR FROM date_new) jahr,
    EXTRACT(MONTH FROM date_new) monat,
    COUNT(DISTINCT TRUNC(date_new)) arbeitstage,
    SUM(CASE
          WHEN STRING1 IN (SELECT STRING1
                           FROM MD.MD1
                           WHERE STRING1_GROUP = 'CUST')
          THEN DECODE(FLAG, 'J', 1, 0)
          ELSE 0
        END) cust_flag
FROM
    SCHEMA.TAB1
WHERE
    date_new BETWEEN TO_DATE(201101, 'yyyymm')
                 AND ADD_MONTHS(TO_DATE(201107, 'yyyymm'), 1)
GROUP BY
    EXTRACT(YEAR FROM date_new),
    EXTRACT(MONTH FROM date_new)
ORDER BY 1, 2;
 
      JAHR      MONAT ARBEITSTAGE CUST_FLAG
---------- ---------- ----------- ----------------
      2011          1          20                0
 
 
7 rows selected.
 
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(NULL,NULL,'ADVANCED'));
 
PLAN_TABLE_OUTPUT
---------------------
SQL_ID  5wdztx81x0r8f, child number 0
-------------------------------------
 
 
Plan hash value: 1503213169
 
-----------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |       |       |       |   591 (100)|          |
|*  1 |  TABLE ACCESS FULL | MD1              |     1 |    12 |       |     3   (0)| 00:00:01 |
|   2 |  SORT GROUP BY     |                  | 27402 |   294K|  2776K|   591  (12)| 00:00:02 |
|*  3 |   TABLE ACCESS FULL| TAB1             |   117K|  1260K|       |   369  (10)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
 
...
 
 
Note
-----
   - SQL patch "SYS_SQLPTCH_013166b4fb2a0001" used for this statement
 
 
66 rows selected.
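 
The generated patch can also be listed via the DBA_SQL_PATCHES view (a quick check, limited to columns I know exist in 11g):
 
SELECT name, status, created FROM dba_sql_patches;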

Removal of SQL Patch:

EXEC DBMS_SQLDIAG.DROP_SQL_PATCH ('SYS_SQLPTCH_013166b4fb2a0001');

Deletion of SQL Diag Task:

EXEC DBMS_SQLDIAG.DROP_DIAGNOSIS_TASK ('ORA7445-xtyqbcb');

Performance Degradation for Query on DBA_SEGMENTS.BYTES in 11gR2

I have been troubleshooting an interesting performance issue in a DWH environment: a query on DBA_SEGMENTS in 11gR2 with lots of partitions took almost 10 minutes, versus only a few seconds in 10gR2. The problem could be stripped down to this SQL query:

select bytes from dba_segments

The definition of the DBA_SEGMENTS view can be looked up in DBA_VIEWS:

SELECT ...
       decode(bitand(segment_flags, 131072), 131072, blocks,
           (decode(bitand(segment_flags,1),1,
            dbms_space_admin.segment_number_blocks(tablespace_id, relative_fno,
            header_block, segment_type_id, buffer_pool_id, segment_flags,
            segment_objd, blocks), blocks)))*blocksize,
      ...
FROM sys_dba_segs

The response time was dominated by waits for “db file sequential read”, which accounted for almost all of the elapsed time.
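 
For reference, a standard way to capture such waits is a 10046 level 8 trace (the tracefile identifier below is just an arbitrary label):
 
ALTER SESSION SET tracefile_identifier = 'dba_segments_test';
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
SELECT bytes FROM dba_segments;
ALTER SESSION SET EVENTS '10046 trace name context off';
-- then process the resulting trace file from user_dump_dest with tkprof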

11gR2:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    12563     23.17     546.74     157447     580414          1      188421
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    12565     23.17     546.74     157447     580414          1      188421

10gR2:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.06       0.06          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     5104      1.77       2.54          0      96688          0       76542
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     5106      1.83       2.61          0      96688          0       76542

The 10046 trace file contains more detail on those waits:

WAIT #4: nam='db file sequential read' ela= 134 file#=21 block#=2313482 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 120 file#=15 block#=2128019 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 128 file#=21 block#=2313490 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 372 file#=21 block#=2313498 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 108 file#=21 block#=2313506 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 130 file#=21 block#=2313514 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 132 file#=21 block#=2313522 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 121 file#=21 block#=2313530 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 158 file#=15 block#=2128003 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 146 file#=21 block#=2313538 blocks=1 obj#=-1 
WAIT #4: nam='db file sequential read' ela= 116 file#=21 block#=2313546 blocks=1 obj#=-1 
(Output modified for formatting)

The strange thing is that obj# is -1. I made block dumps of those blocks and found that all of them (file#/block#) have something in common: they were all segment header blocks. Another oddity: when the query was executed a second time, all those I/O requests were performed again – so no caching.

At that point, I opened a service request with Oracle Support. After a couple of weeks, Support suggested running this for each tablespace containing user segments:

EXEC dbms_space_admin.TABLESPACE_FIX_SEGMENT_EXTBLKS('<tablespace_name>');
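 
Since the procedure takes one tablespace at a time, a small PL/SQL loop can cover all permanent tablespaces (a sketch; whether SYSTEM/SYSAUX need it as well may differ in your environment):
 
BEGIN
  FOR ts IN (SELECT tablespace_name
             FROM   dba_tablespaces
             WHERE  contents = 'PERMANENT')
  LOOP
    dbms_space_admin.tablespace_fix_segment_extblks(ts.tablespace_name);
  END LOOP;
END;
/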

This finally fixed the problem. Now we are trying to find out why this was necessary for a database that was freshly created with dbca 11.2.0.2 from the seeded template and filled via Data Pump import.

Data Guard 11.2.0.2 – update

In March, Oracle MAA published a white paper on implementing Data Guard on Exadata platforms. (http://www.oracle.com/technetwork/database/features/availability/maa-wp-dr-dbm-130065.pdf)

Although the paper is focused on Exadata, some information can be applied to non-Exadata systems as well:

  • Corruption Protection:
    • Primary:
      • DB_BLOCK_CHECKSUM=FULL,
      • DB_BLOCK_CHECKING=FULL,
      • DB_LOST_WRITE_PROTECT=TYPICAL
    • (Physical) Standby:
      • DB_BLOCK_CHECKSUM=FULL,
      • DB_BLOCK_CHECKING=OFF,
      • DB_LOST_WRITE_PROTECT=TYPICAL
  • Network Tuning:
    TCP Send/Receive Buffers: 3 x Bandwidth Delay Product or 10 MB, whichever is greater (a worked example follows this list)
  • Redo Transport Modes:
    • SYNC is recommended if the round-trip time is less than 5 ms. The impact of SYNC mode on primary performance has been reduced because the local online redo log write and the redo shipping are no longer done sequentially but in parallel.
    • ASYNC: the transport lag is reduced because redo is read from the log buffer instead of from the online redo log on disk whenever possible. The init.ora parameter log_buffer might need to be increased for this to yield the maximum benefit.
  • LOGGING/NOLOGGING: normally, in Data Guard, you set “ALTER DATABASE FORCE LOGGING” database-wide. In data warehouse environments, it might be advisable to skip database-wide force logging and decide on a tablespace basis whether to allow or disallow NOLOGGING operations: e.g. a tablespace containing only transient, recreatable, or non-critical data might benefit from NOLOGGING for certain bulk operations (CTAS, direct-path inserts, index rebuilds, etc.) – see the sketch after this list.
  • STANDBY-FIRST patches: some patches (PSU, CPU, PSE) are flagged in the patch README as “Standby-First”, meaning they can be applied on a physical standby before being applied on the primary. For regular patches, this was not supported.
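 
As a worked example of the buffer sizing rule, assume a hypothetical 1 Gbit/s link with 20 ms round-trip time:
 
-- Bandwidth Delay Product = bandwidth (bytes/s) x round-trip time (s)
SELECT (1e9 / 8) * 0.020     AS bdp_bytes,    -- 2,500,000 bytes = 2.5 MB
       3 * (1e9 / 8) * 0.020 AS buffer_bytes  -- 7,500,000 bytes = 7.5 MB
FROM dual;
-- 3 x BDP (7.5 MB) is below 10 MB here, so the 10 MB floor applies.
 
And a sketch of the per-tablespace logging approach (the tablespace names are made up for illustration):
 
ALTER DATABASE NO FORCE LOGGING;              -- enable per-tablespace control
ALTER TABLESPACE critical_data FORCE LOGGING; -- disallow NOLOGGING operations here
ALTER TABLESPACE etl_stage NOLOGGING;         -- default NOLOGGING for transient data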

Using Grid Control Repository for RDBMS Patch Report

I was looking for a way to utilize the Grid Control repository, which contains information about installed Oracle Homes, databases, and patches, for a patch report.

With a little reverse engineering, I came up with these relevant tables:

  • Mgmt_Ecm_Snapshot: every time the inventory is refreshed, a new row is inserted into this table containing the host name and the snapshot_guid. The most current snapshot has the flag IS_CURRENT set to ‘Y’.
  • Mgmt_Inv_Container: every Oracle Home is a container. This table contains the snapshot_guid and the container_guid, along with container_location, which is basically the Oracle Home path.
  • Mgmt_Inv_Patch: this table contains the container_guid, the patch ID, and the patch installation timestamp.
  • Mgmt_Inv_Component: this table lists all components of the Oracle Homes along with their versions. There is one component per container with the flag Is_Top_Level set to ‘Y’. We use this component to get the base version of the installed product (e.g. 11.2.0.2).
  • Mgmt_Inv_Versioned_Patch: I am not sure whether this table is needed for version information, but one of the mgmt views joined these two tables, so I used it as a reference.
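 
A quick sanity check of these relationships is to list the current Oracle Homes per host, using only the columns described above:
 
SELECT s.target_name AS host_name,
       c.container_name,
       c.container_location
FROM   mgmt_ecm_snapshot  s,
       mgmt_inv_container c
WHERE  s.snapshot_guid  = c.snapshot_guid
AND    s.is_current     = 'Y'
AND    c.container_type = 'O';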

The complete statement now is:

CREATE OR REPLACE FORCE VIEW "SYSMAN"."RDBMS_PATCH_REPORT"
AS
SELECT
    N.Target_Name,
    S.Start_Timestamp AS Collected_Time,
    S.Target_Name     AS Host_Name,
    C.Container_Name  AS Oracle_Home_Name,
    Container_Location,
    P.Id AS Patch_Id,
    (
    CASE Id
      WHEN '10157506'
      THEN 'GI Bundle1'
      WHEN '10185523'
      THEN 'OWB Bundle'
      WHEN '10248523'
      THEN 'PSU Jan 2011'
      WHEN '11724916'
      THEN 'PSU Apr 2011'
      WHEN '12311357'
      THEN 'GI Psu Apr 2011'
      ELSE NULL
    END ) description,
    P.Timestamp AS Install_Time,
    CASE
      WHEN VP.version IS NULL
      THEN M.version
      ELSE VP.version
    END AS Version
  FROM Mgmt_Ecm_Snapshot S,
    Mgmt_Inv_Container C,
    Mgmt_Inv_Patch P,
    (SELECT T.Target_Guid,
      T.Host_Name,
      T.Target_Name,
      T.Target_Type,
      Mp.Property_Value AS Oh
    FROM Mgmt_Targets T,
      Mgmt_Target_Properties Mp
    WHERE T.Target_Guid  = Mp.Target_Guid
    AND Mp.Property_Name = 'OracleHome'
    AND Target_Type      = 'oracle_database'
    ) N,
    Mgmt_Inv_Component M ,
    MGMT_INV_VERSIONED_PATCH VP
  WHERE S.Snapshot_Guid = C.Snapshot_Guid
  AND S.Is_Current      = 'Y'
  AND C.Container_Type  = 'O'
  AND P.Container_Guid  = C.Container_Guid
  AND N.Host_Name       = S.Target_Name
  AND N.Oh              = C.Container_Location
  AND M.Component_Guid  = Vp.Component_Guid(+)
  AND M.Is_Top_Level    = 'Y'
  AND M.Container_Guid  = C.Container_Guid
  ORDER BY 1,2,3;
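 
The view can then be queried like any other, e.g. to list only the PSU installations:
 
SELECT target_name, host_name, patch_id, description, install_time, version
FROM   sysman.rdbms_patch_report
WHERE  description LIKE 'PSU%'
ORDER BY target_name, install_time;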

Is 11.2.0.2 ready for production? Judge yourself …

Murphy said: if problems arise, they all come at the same time. After several weeks of stability, this week brought a hell of a lot of Oracle problems with 11.2.0.2 and RAC. To list the worst:

  • Downtime because the ASM process limit was reached. Normally, the formula to calculate it depends on the number of concurrent datafile extensions. In our case, 100 ASM processes were not enough. We increased to 200 and found Note 1287496.1, which describes the issue. A merge patch for the bug is available for 11.2.0.2 GI Bundle 1, but not yet for GI Bundle 2. Oops.
  • CPU starvation because of Adaptive Cursor Sharing: one server process was consuming 100% of a CPU for several hours. It turned out to be an OEM monitoring query with several thousand child cursors (a diagnostic sketch follows this list).
    PID   USER  PR NI VIRT  RES SHR S %CPU %MEM TIME+     COMMAND
    8667  ora11 25  0 8423m 33m 28m R 99.7  0.1 358:27.78 ora_pz98_MDDB

    I then checked different databases and found that almost all of our 11.2.0.2 databases have several hundred or even thousands of child cursors.

    SELECT sql_id, is_shareable, MIN(child_number), MAX(child_number), COUNT(*)
    FROM gv$sql
    GROUP BY sql_id, is_shareable
    HAVING COUNT(*) > 100
    ORDER BY 5 DESC;
     
    SQL_ID        IS_SHAREABLE MIN(CHILD_NUMBER) MAX(CHILD_NUMBER)  COUNT(*)
    ------------- ------------ ----------------- ------------------ ---------
    c7kt3njhnmtkm Y            0                 5097               1397
    c7kt3njhnmtkm N            3                 3544               1836
    1vnhgmpc17vv0 Y            0                 3022               2697
    1vnhgmpc17vv0 N            6                 2185               444
    93qh89pxuxzuw Y            0                 1949               1522
    93qh89pxuxzuw N            2                 1625               428
    5fk0v8km2f811 Y            0                 1281               1763
    4f3ufvfcgfqsg Y            0                 792                794
    cjbwk0ww7j5rv Y            0                 627                1251
    dyqdzgxcv4d3t Y            0                 626                1252
    5fk0v8km2f811 N            3                 543                260
    f0jxh8d6b5af2 Y            0                 494                564
    f0jxh8d6b5af2 N            0                 290                130
    dbvkky621gqtr Y            0                 266                267
    32rqmpqpfv0gm Y            0                 255                257
    g9uwxs7pr8tjm Y            0                 254                257
    40k6jjt90n4fa Y            3                 199                129

    I suspect this to be Bug 10182051 (Extended cursor sharing generates many shareable child cursors), and there is a workaround:

    alter system set "_optimizer_extended_cursor_sharing_rel"=none;

  • Limited database availability because of failing queries on gv$ views:
    SQL> select count(*) from gv$session
      2  ;
    select count(*) from gv$session
    *
    ERROR at line 1:
    ORA-12850: Could not allocate slaves on all specified instances: 2 needed, 1
    allocated
    ORA-12801: error signaled in parallel query server P001, instance 3599

    Currently there is no known workaround other than bouncing all RAC instances.

  • Data Guard ASYNC redo transport not reliable: we have a RAC primary / single-instance physical standby setup and use ASYNC redo transport. During times of heavy ETL on the primary, the standby database stops at recovery of one archived log with “(in transit)”. The primary shows this error in the alert log:
    ARC7: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'HAMDB' (thread 2 SEQUENCE 4044) (MUCDB2)
    ARC7: Archive log rejected (thread 2 SEQUENCE 4044) at host 'HAMDB'
    FAL[server, ARC7]: FAL archive failed, see trace file.
    ARCH: FAL archive failed. Archiver continuing
    ORACLE Instance MUCDB2 - Archival Error. Archiver continuing.

    On the standby side, it says:
    Media Recovery Waiting for thread 2 sequence 4044 (in transit)

    The standby database never recovers from this problem unless it is bounced. The problem appears with and without a broker configuration. Currently there is no known workaround.
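 
Regarding the child-cursor issue above: the per-child reason for non-sharing can be inspected in v$sql_shared_cursor, e.g. for one of the SQL_IDs from the listing (each column containing ‘Y’ names a mismatch reason):
 
SELECT *
FROM   v$sql_shared_cursor
WHERE  sql_id = 'c7kt3njhnmtkm';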

Maybe some of these issues will be addressed in the upcoming April PSU, which will be released this week.

Recyclebin Bug – ORA-600 [ktcdso-1] on Oracle 11.2.0.2.1

Just a short note:

If you are using 11.2.0.2 or 11.2.0.2.1, chances are high that database stability is endangered by massive ORA-600 [ktcdso-1] errors, unless you have the recyclebin deactivated with init.ora recyclebin=off.

Oracle is currently working on this issue with Bug 10427260: ORA-00600 [KTCDSO-1], [], [], [] WHEN PURGE RECYCLE BIN.

Unfortunately, the instance has to be bounced in order to deactivate recyclebin.
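 
A minimal sketch of the workaround (purging first shrinks the recyclebin immediately; the parameter change takes effect after the restart mentioned above):
 
PURGE DBA_RECYCLEBIN;
ALTER SYSTEM SET recyclebin = OFF SCOPE = SPFILE;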

ACFS Filesystem Monitoring and Group Ownership

When you create an ACFS filesystem in Grid Infrastructure 11.2.0.1 or 11.2.0.2, the group ownership of the filesystem root directory is set to the ASM administrative group (installer variable SS_ASM_GRP), e.g. asmadm:

[grid@md1 ~]$ cd /opt/oracle/gridbase/acfsmounts
[grid@md1 acfsmounts]$ ls -al
total 12
drwxr-xr-x  3 grid oinstall 4096 Jan 10 09:44 .
drwxr-xr-x 10 grid oinstall 4096 Jan 10 09:43 ..
drwxrwx---  4 root   asmadm 4096 Jan 10 09:44 data_testvol
 
SQL> select * from v$asm_filesystem where fs_name = '/opt/oracle/gridbase/acfsmounts/data_testvol'
 
FS_NAME                                        AVAILABLE BLOCK_SIZE STATE         CORRU    NUM_VOL TOTAL_SIZE TOTAL_FREE TOTAL_SNAP_SPACE_USAGE
---------------------------------------------- --------- ---------- ------------- ----- ---------- ---------- ---------- ----------------------
/opt/oracle/gridbase/acfsmounts/data_testvol   10-JAN-11          4 AVAILABLE     FALSE          1        256 119.769531                      0

If – for whatever reason – you change the group ownership from asmadm to a different group, ASM cannot populate the views v$asm_filesystem and v$asm_acfsvolumes, which in turn means that you cannot monitor the filesystem with Oracle Enterprise Manager Grid Control, because it uses those two views for monitoring.

[root@md1 data_testvol]# chgrp myapp .
[root@md1 data_testvol]# ls -la
total 80
drwxrwx--- 4 root   myapp     4096 Jan 10 09:45 .
drwxr-xr-x 3 grid   oinstall  4096 Jan 10 09:44 ..
drwxr-xr-x 5 root   root      4096 Jan 10 09:44 .ACFS
-rw-r--r-- 1 root   asmadm     610 Jan 10 09:45 .fslimit
drwx------ 2 root   root     65536 Jan 10 09:44 lost+found
 
 
SQL> select * from v$asm_filesystem where fs_name = '/opt/oracle/gridbase/acfsmounts/data_testvol'
  2  ;
 
no rows selected
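 
After restoring the original group ownership (chgrp asmadm on the mount point), the views should be populated again; a quick verification using only the columns shown above:
 
SELECT fs_name, state, total_size, total_free
FROM   v$asm_filesystem;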

From my point of view, this is a severe limitation. ACFS filesystems should, like any other filesystem, allow any user/group ownership and still be monitorable. However, I could not convince my Oracle Support engineer to see it the same way …

Grid Control Agent 11.1: High Virtual Memory Consumption

There is a known issue with Grid Control 11.1 agents that causes huge virtual memory consumption. I have seen virtual memory consumption of up to 5.6 GB for the emagent process:

Resources PID: 14103, emagent PPID: 13773 euid: 172 User:agent
--------------------------------------------------------------------------------
CPU Usage (util): 2.5 Log Reads : 14 Wait Reason : OTHER
User/Nice/RT CPU: 2.5 Log Writes: 0 Total RSS/VSS :546.5mb/ 5.68gb
 
Regions PID: 14103, emagent PPID: 13773 euid: 172 User:agent
 
Type RefCt RSS VSS Locked File Name
--------------------------------------------------------------------------------
NULLDR/Shared 531 4kb 4kb 0kb <nulldref>
MEMMAP/Shared 2 8kb 32kb 0kb /var/.../14103
TEXT /Shared 2 12kb 12kb 0kb /opt/.../bin/emagent
DATA /Priv 1 131.4mb 144.0mb 0kb /opt/.../bin/emagent
MEMMAP/Priv 1 52kb 4.0mb 0kb <mmap>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
MEMMAP/Priv 1 52kb 4.0mb 0kb <mmap>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
UAREA /Priv 1 64kb 72kb 0kb <uarea>
MEMMAP/Priv 1 356.0mb 5.32gb 0kb <mmap>

The workaround for this issue is to set a heap size limit of 512 MB in $AGENT_HOME/sysman/config/emd.properties:
 
Append the string “-Xmx512m” to the agentJavaDefines line (the agent has to be restarted for the change to take effect).
 
So the line looks like this:

agentJavaDefines=-Djava.awt.headless=true -Dsun.lang.ClassLoader.allowArraySyntax=true -Dnetworkaddress.cache.ttl=1800 -DUrlTiming.UseJSSE=true -Doracle.dms.refresh.wait.time=1000 -Xmx512m

This is the bug, which is platform-independent:
Bug 9829732: AGENT 11.1.0.1 RUNNING ON OMS BOX IS CONSUMING HIGH MEMORY

Presentation “Minimal Downtime Oracle 11g Upgrade” at DOAG Conference 2010

I have uploaded the presentation material from my DOAG Conference talk on “Minimal Downtime Oracle 11g Upgrade”. The material contains a paper, the presentation, and an online demo which shows how Logical Standby SQL Apply can be used for a minimal-downtime 10g to 11g upgrade. Additionally, for convenience, I provide all three components in a ZIP package.

DOAG Conference 2010: Minimal Downtime Oracle 11g Upgrades

ADR packages and Oracle Support

Today I decided to take advantage of adrci and incident package generation for an upload to Oracle Support. Guess what the first request from Oracle Support was:

Please upload the text version of the ASM alert.log.


Very funny. 🙁