Data Guard

Data Guard 11.2.0.2 – update

By Martin | May 26th, 2011 | Category: Data Guard, Oracle Database | No Comments »

In March, Oracle MAA has published a white paper for implementing Data Guard on Exadata platforms. (http://www.oracle.com/technetwork/database/features/availability/maa-wp-dr-dbm-130065.pdf)

Although the paper is focused on Exadata, some information can be applied to non-Exadata systems as well:

Corruption Protection:
- Primary:
  - DB_BLOCK_CHECKSUM=FULL,
  - DB_BLOCK_CHECKING=FULL,
  - DB_LOST_WRITE_PROTECT=TYPICAL
- (Physical) Standby:
  - DB_BLOCK_CHECKSUM=FULL,
  - DB_BLOCK_CHECKING=OFF,
  - DB_LOST_WRITE_PROTECT=TYPICAL
Network Tuning:
TCP Send/Receive Buffers: 3 x Bandwith Delay Product or 10 MB, whichever is greater
Redo Transport Modes:
- SYNC is recommended if round-trip-time is less than 5 ms. Impact of SYNC Mode on primary performance has been improved because local online redo log write and redo shipping is not done sequentially anymore but in parallel.
- ASYNC: Transport Lag is reduced because Redo is not read from online redo log from disk but from log buffer if possible. init.ora log_buffer might need to be increased for this to yield the maximum benefit.
LOGGING/NOLOGGING: Normally, in Data Guard you set database-wide “ALTER DATABASE FORCE LOGGING”. In Data Warehouse environments, it might be advisable to set it to NOLOGGING and decide on a tablespace-basis whether the tablespace should allow or disallow NOLOGGING operations. E.g. a tablespace containing only transient, recreatable or non-critical data might benefit from being set to NOLOGGING for certain bulk operations. (CTAS, Direct Path Inserts, index rebuilds, etc.)
STANDBY-FIRST Patches: Some patches (PSU, CPU, PSE) will be flagged in the patch README with “Standby-First”. This means that they can be applied on a physical standby before being applied on the primary. For regular patches, this was not supported.

Is 11.2.0.2 ready for production? Judge yourself …

By Martin | April 18th, 2011 | Category: 11gR2, Bugs, Data Guard, MetaLink, Oracle Database | 5 comments

Murphy said: if problems arise, they all come at the same time. After several weeks of stability, this week brought hell of a lot of Oracle problems with 11.2.0.2 and RAC. To list the worst:

Downtime because of ASM process limit reached. Normally, the formula to calculate depends on the number of concurrent datafile extensions. In our case 100 ASM processes were not enough. Increased to 200 and found Note 1287496.1 which describes the issue. Merge Patch for bug is available for 11.2.0.2 GI Bundle 1 but not yet for GI Bundle 2. Ups.

CPU Starvation because of Adaptive Cursor Sharing: One server process was taking up 100% of cpu time for the last several hours. It turned out to be an OEM monitoring query having several thousands of child cursors.

PID 	USER 	PR NI VIRT 	RES SHR S %CPU %MEM TIME+ COMMAND
8667 	ora11 25 0 	8423m 33m 28m R 99.7 	0.1 358:27.78 ora_pz98_MDDB

I then checked different databases and found that almost all of our 11.2.0.2 databases have several hundreds or even thousands of child cursors.

SELECT sql_id, is_shareable, MIN(child_number), MAX(child_number), COUNT(*)
FROM gv$sql
GROUP BY sql_id, is_shareable HAVING COUNT(*) &gt;100 ORDER BY 6 DESC
 
SQL_ID        IS_SHAREABLE MIN(CHILD_NUMBER) MAX(CHILD_NUMBER)  COUNT(*)
------------- ------------ ----------------- ------------------ ---------
c7kt3njhnmtkm Y            0                 5097               1397
c7kt3njhnmtkm N            3                 3544               1836
1vnhgmpc17vv0 Y            0                 3022               2697
1vnhgmpc17vv0 N            6                 2185               444
93qh89pxuxzuw Y            0                 1949               1522
93qh89pxuxzuw N            2                 1625               428
5fk0v8km2f811 Y            0                 1281               1763
4f3ufvfcgfqsg Y            0                 792                794
cjbwk0ww7j5rv Y            0                 627                1251
dyqdzgxcv4d3t Y            0                 626                1252
5fk0v8km2f811 N            3                 543                260
f0jxh8d6b5af2 Y            0                 494                564
f0jxh8d6b5af2 N            0                 290                130
dbvkky621gqtr Y            0                 266                267
32rqmpqpfv0gm Y            0                 255                257
g9uwxs7pr8tjm Y            0                 254                257
40k6jjt90n4fa Y            3                 199                129

I suspect this to be Bug 10182051 Extended cursor sharing generates many shareable child cursors and there is a workaround:

alter system set “_optimizer_extended_cursor_sharing_rel”=none;

Limited database availability because of failing queries on gv$ tables:

SQL> select count(*) from gv$session
2 ;
select count(*) from  gv$session
*
ERROR at line 1:
ORA-12850: Could not allocate slaves on  all specified instances: 2 needed, 1
allocated
ORA-12801: error signaled  in parallel query server P001, instance 3599

Currently no other known workaround than bouncing all the RAC instances.

DataGuard ASYNC Redo Transport not reliable: We have a RAC primary / single instance physical standby setup and use async redo transport. During times of heavy ETL on the primary, the standby databases stops at recovery of one archivelog with “(in transit)”. Primary is showing this error in alert log:

ARC7: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'HAMDB' (thread 2 SEQUENCE 4044) (MUCDB2)
ARC7: Archive log rejected (thread 2 SEQUENCE 4044) at host 'HAMDB'
FAL[server, ARC7]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance MUCDB2 - Archival Error. Archiver continuing.

On standby side, it says:
Media Recovery Waiting for thread 2 sequence 4044 (in transit)

The standby database never recovers from this problem, except when standby database is bounced. The problem appears with and without broker configuration. Currently there is no known workaround.

Maybe some of this issues will be addressed in upcoming PSU April, which will be released this week.

DOAG Presentation on Oracle Data Guard 11g (R1/R2) What´s new? (german)

By Martin | July 14th, 2010 | Category: 11g, 11gR2, Data Guard, Oracle Database, Oracle Enterprise Manager | No Comments »

I have uploaded the material of the german DOAG presentation on Oracle Data Guard 11gr2 Whats New to the “presentation” section. It also contains 6 recorded demos using Enterprise Manager Grid Control.

DataGuard Broker Setup fails because of Cisco Firewall ASA Feature “fixup/inspect”

By Martin | March 16th, 2009 | Category: 10g, Data Guard | No Comments »

I ran into this problem a couple of years ago but I still think that it is worth blogging about. Back then, I tried to setup a data guard broker enviroment but when trying to enable the configuration on the dgmgrl command line, the dataguard configuration files could not be synced to the standby host. After a couple of days of troubleshooting, it turned out that the Cisco ASA Firewall hat the SQL*Net Inspection Feature enabled. But because the data guard broker data packets did not exactly conform to SQL*Net, the Firewall corrupted the packet and it was not able to create the data guard configuration. After disabling the SQL*Net inspection, it worked without any further problems.

ORA-00322, ORA-00312 at DataGuard Standby

By Martin | October 25th, 2008 | Category: 10g, 11g, Bugs, Data Guard, Oracle Database | No Comments »

From time to time I have encountered the following errors on the physical dataguard standby database while in recovery mode:

Errors in file /oracle/STBDB1/oratrace/bdump/stbdb1_mrp0_26719.trc: ORA-00322: log 5 of thread 1 is not current copy ORA-00312: online log 5 thread 1: '/oracle/STBDB1/origlogB/standby_g5_m1.log' Sun Oct 16 13:04:08 2008 Errors in file /oracle/STBDB1/oratrace/bdump/stbdb1_mrp0_26719.trc: ORA-00322: log 5 of thread 1 is not current copy ORA-00312: online log 5 thread 1: '/oracle/STBDB1/origlogB/standby_g5_m1.log'

The problem corrects itself after a couple of minutes, so there is no real problem, but if you have alert log monitoring active, this will trigger a call for investigation. Oracle is tracking this error with Bug 5238386 – ORA-322 possible reading standby redo log header. There is a one-off patch available for 10.2.0.3 and the patch is included in patchset 10.2.0.4.

There is a workaround to clear the referenced redo log but I don´t see any point in doing that, because the error can occurr again as long as the patch is not installed.

Data Guard 11.2.0.2 – update

Is 11.2.0.2 ready for production? Judge yourself …

DOAG Presentation on Oracle Data Guard 11g (R1/R2) What´s new? (german)

DataGuard Broker Setup fails because of Cisco Firewall ASA Feature “fixup/inspect”

ORA-00322, ORA-00312 at DataGuard Standby

Browse Categories

Browse Archives

Stay informed

Who writes