Real Application Clusters

Add Dependency of Database Instance on ASM to OCR

Today, at a presentation by Grégory Guillou and Alex Gorbachev of The Pythian Group at UKOUG 2008, I learned that the dependency of RAC database instances on ASM is not set up when the database instances are added to OCR via srvctl add instance.

In the past I had wondered why the database instances would sometimes not start up correctly at boot time and fail while loading the spfile from ASM.

So, if you did not use DBCA to create and register your RAC database with OCR but used srvctl instead, as shown here:

$ srvctl add database -d BOSTON -o /opt/oracle/product/10g_db_rac
$ srvctl add instance -d BOSTON -i BOSTON1 -n boston_host1
$ srvctl add instance -d BOSTON -i BOSTON2 -n boston_host2

then you have to create this dependency manually, so that ASM is started before the database instance that depends on it:

$ srvctl modify instance -d BOSTON -i BOSTON1 -s +ASM1
$ srvctl modify instance -d BOSTON -i BOSTON2 -s +ASM2
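For clusters with more than two nodes, typing one command per instance gets tedious. The following is a minimal sketch that only prints the srvctl commands so they can be reviewed before being run. The function name print_asm_dependency_cmds is made up for this example, and it assumes the naming convention used above (instance names DBNAME1, DBNAME2, ... paired with +ASM1, +ASM2, ...), which may not match your installation.

```shell
#!/bin/sh
# Hypothetical helper: print (do NOT execute) the srvctl commands that
# register the ASM dependency for instances 1..N of a database.
# Assumption: instance <DB><n> runs on the same node as ASM instance +ASM<n>.
print_asm_dependency_cmds() {
    db=$1       # database name, e.g. BOSTON
    count=$2    # number of instances
    i=1
    while [ "$i" -le "$count" ]; do
        echo "srvctl modify instance -d $db -i $db$i -s +ASM$i"
        i=$((i + 1))
    done
}

# Prints the two commands from the BOSTON example.
print_asm_dependency_cmds BOSTON 2
```

Once the printed commands look right, they can be pasted into a shell running as the Oracle software owner.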

Excellent Presentations on

The database specialist Riyaj Shamsudeen from The Pythian Group has published some excellent presentations on his blog. Don't miss them!

Is your RAC a ticking bomb?

I have come across a very nasty bug on Oracle RAC on HP-UX Itanium after upgrading to The problem might also occur on Solaris and Linux x86-64. On one of the RAC instances of a two-node cluster, the number of open file descriptors of the Oracle racgimon process increases by 1 every 60 seconds. This means that if the ulimit for open files per process is set high and the HP-UX kernel parameter nfiles is also set high, it might take weeks or even months until the racgimon process finally hits the limit. When that happens, it can make the node unstable, because no more file descriptors can be opened system-wide.

How do I check whether my installation suffers from this bug?

It is very easy: run lsof -p against the racgimon process and look for dozens of open file descriptors on the file hc_SID.dat.
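Since the descriptor count grows by one per minute, comparing two counts taken some minutes apart makes the leak obvious. Here is a small sketch of such a counter; the function name count_hc_fds and the variable RACGIMON_PID are made up for this example, and the pattern simply matches any file name of the form hc_*.dat in the lsof output.

```shell
#!/bin/sh
# Hypothetical helper: count the lines of lsof output (read from stdin)
# that refer to a health-check file named hc_<SID>.dat.
count_hc_fds() {
    # grep -c exits non-zero when nothing matches but still prints 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -c 'hc_[^/ ]*\.dat' || true
}

# Usage (RACGIMON_PID is a placeholder for the PID of the racgimon process):
#   lsof -p "$RACGIMON_PID" | count_hc_fds
```

A healthy process should show a small, stable number; a count that keeps climbing between runs points to the leak described above.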

– or –

Check the log file $ORACLE_HOME/log//racg/imon_.log:

2008-10-07 13:05:50.879: [ RACG][82] [29827][82][ora.DBNAME.DBNAME1.inst]: GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:mmap failed with status: 12
GIM-00091: OS failure message: Not enough space
GIM-00092: OS failure occurred at: sskgmsmr_13
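The log check can be scripted as well. The sketch below counts occurrences of the GIM-00090 mmap failure line from the excerpt above in a given log file. The function name check_imon_log is made up for this example, and the path of the log file has to be supplied by the caller, since it depends on the host and database name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a racgimon log file contains the
# mmap failure message shown in the excerpt above.
check_imon_log() {
    logfile=$1
    # grep -c prints the number of matching lines; if the file cannot be
    # read, nothing is printed and the count falls back to 0.
    hits=$(grep -c 'GIM-00090: OS-dependent operation:mmap failed' "$logfile" 2>/dev/null)
    hits=${hits:-0}
    if [ "$hits" -gt 0 ]; then
        echo "possible leak: $hits mmap failure(s) in $logfile"
    else
        echo "no mmap failures found in $logfile"
    fi
}

# Usage: check_imon_log /path/to/your/imon/logfile
```

Note that an mmap failure only shows up once resources are already exhausted, so the lsof check is the better early warning.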

Is there a patch?
The good news is that there is. The bug is tracked as Bug ID 6931689, and patch #7298531 is available to fix the problem; see Metalink note 739557.1.