9iR2

Out-of-Memory killer on 32bit Linux with big RAM

By Martin | August 25th, 2009 | Category: 10g, 11g, 9iR2, Linux, Oracle Database | 1 Comment »

It is not widely known that you can run into serious problems when running 32-bit Linux (x86) with a large amount of RAM on RHEL releases below 5. The official name for the issue is “Low Memory Starvation”. The best solution is to use x86-64, which can address the whole amount of RAM efficiently.

However, if that is not feasible, make sure that you at least run the hugemem kernel if you use RHEL < 5. In RHEL5-32bit, the hugemem kernel is default. Here is a quick demonstration of what can happen if you don't use the hugemem kernel: we noticed that an RMAN backup was taking more than 24 hours. Querying v$session, we found that one session was in ACTION "STARTED", whereas the other sessions were FINISHED.

SQL> select program, module,action 
      from v$session 
      where username = 'SYS' and program like 'rman%'
/      

PROGRAM                    MODULE                       ACTION             
-------------------------- ---------------------------  -------------------
rman@ora-vm1 (TNS V1-V3)    backup full datafile        0000078 FINISHED129
rman@ora-vm1 (TNS V1-V3)    backup full datafile        0000272 STARTED16  
rman@ora-vm1 (TNS V1-V3)    backup full datafile        0000084 FINISHED129
rman@ora-vm1 (TNS V1-V3)    rman@ora-vm1 (TNS V1-V3)                       
rman@ora-vm1 (TNS V1-V3)    rman@ora-vm1 (TNS V1-V3)    0000004 FINISHED131
rman@ora-vm1 (TNS V1-V3)    backup full datafile        0000092 FINISHED129

Then we look up the SID and SERIAL# of this session in v$session; the UNIX PID comes from v$process.spid.

SQL> select sid,serial# from v$session where event like 'RMAN%';

       SID    SERIAL#
---------- ----------
      4343       5837

We fetch the UNIX PID and activate SQL tracing for this session to determine its activity:

SQL> select spid from v$process where addr = 
   (select paddr from v$session where sid = 4343);

SPID
------------
1681

SQL> begin dbms_monitor.session_trace_enable(4343,5837,true,true);
  2  end;
  3  /

However, no trace file gets created. Then we start tracing system calls with strace:

ora-vm1:# strace -fp 1681
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted

“Not permitted”? – Although I am connected as root?

ps -ef|grep 1681
oracle    1681 1582  0 Aug24 ?        00:00:09 [oracle] <defunct>

The Linux command “ps” reports the server process as “defunct”. A look at its parent process (PID 1582) shows the RMAN client and its other server processes:

ora-vm1:/usr/oracle/admin/labo/udump$ ps -ef|grep 1582
oracle   1582 21578  0 Aug24 ?        00:00:02 rman oracle/product/10.2.0/bin/rman nocatalog
oracle   21663 1582  0 Aug24 ?        00:00:01 oraclelabo (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   21665 1582  0 Aug24 ?        00:00:03 oraclelabo (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   1681 1582   0 Aug24 ?        00:00:09 [oracle] <defunct>
oracle   21691 1582  0 Aug24 ?        00:01:36 oraclelabo (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   21695 1582  0 Aug24 ?        00:01:41 oraclelabo (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   21793 1582  0 Aug24 ?        00:01:30 oraclelabo (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

Next, I checked the log file /var/log/messages.1 and found that the kernel out-of-memory (OOM) killer had killed this PID because of low memory starvation.

/var/log/messages.1:
Aug  24 22:32:44 ora-vm1 kernel: Out of Memory: Killed process 1681 (oracle).
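
The same check can be scripted. The following is a minimal sketch, assuming the kernel logs OOM kills in the format shown above; the function name oom_kills is made up for this example and is not part of any tool.

```shell
# Print "<pid> <process name>" for every OOM kill found in syslog-style input.
# Assumption: the message looks like "Out of Memory: Killed process <pid> (<name>)."
oom_kills() {
  sed -n 's/.*[Oo]ut of [Mm]emory: Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p' "$@"
}

# Possible use on the server itself (reading the log usually requires root):
#   oom_kills /var/log/messages /var/log/messages.1
```

A PID reported here that also shows up as <defunct> in ps belongs to a session that will never finish on its own.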

Hugepages revisited II: Be aware of kernel bugs!

By Martin | January 7th, 2009 | Category: 10g, 11g, 9iR2, Linux, Linux Itanium, Oracle Database | No Comments »

It is well known that hugepages reduce the operating system's overhead of managing the memory pages of the Oracle SGA, which leads to lower system CPU utilization. I have written two blog entries regarding this topic already: Listener Coredumps on heavy load system and Hugepages revisited.

However, there is a potential risk with it: Certain kernels / platforms have bugs regarding hugepages which can lead to problems:

Bug 131295 – Hugepages configured on kernel boot line causes x86_64 kernel boot to fail with OOM: Fixed in RHEL3: kernel-2.4.21-40.EL

Bug 248954 – Oracle ASM DBWR process goes into 100% CPU spin when using hugepages on ia64 (Fixed in kernel-2.6.9-78.EL.ia64.rpm available as update for RHEL4U7)

RHSA-2008:1017-14: on the Itanium® architecture, setting the “vm.nr_hugepages” sysctl parameter caused a kernel stack overflow resulting in a kernel panic, and possibly stack corruption. With this fix, setting vm.nr_hugepages works correctly. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm

RHSA-2008:1017-14: hugepages allow the Linux kernel to utilize the multiple page size capabilities of modern hardware architectures. In certain configurations, systems with large amounts of memory could fail to allocate most of this memory for hugepages even if it was free. This could result, for example, in database restart failures. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm

Therefore, before enabling hugepages, I recommend checking your OS vendor's bug database, testing on a test system, and applying recent OS updates first.
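
When testing, it helps to confirm that the kernel really allocated the requested pages. A small sketch, assuming the usual HugePages_Total, HugePages_Free and Hugepagesize lines of /proc/meminfo; hugepage_summary is a hypothetical helper name:

```shell
# Summarize hugepage allocation from /proc/meminfo-style input.
# Fields that are missing from the input are reported as 0.
hugepage_summary() {
  awk '/^HugePages_Total:/ { total = $2 }
       /^HugePages_Free:/  { free  = $2 }
       /^Hugepagesize:/    { size  = $2 }
       END { printf "total=%d free=%d size_kb=%d\n", total, free, size }' "$@"
}

# Possible use on a live system:
#   uname -r                         # kernel release to compare against the fixed versions
#   hugepage_summary /proc/meminfo
```

If total stays well below the configured vm.nr_hugepages value, the allocation did not succeed and the SGA will not fit into hugepages.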

Installation Prerequisites for Oracle Databases

By Martin | December 23rd, 2008 | Category: 10g, 11g, 9iR2, Oracle Database | No Comments »

More often than not, I see installations where either the shared memory parameters or the ulimit settings are not set as required for an optimal Oracle database installation.

The tool RDA, commonly known as a support tool to gather diagnostic information for Oracle Support for service requests, can do these checks quite nicely.

Steps:

Download RDA from MetaLink Note 314422.1.
Unzip it in a working directory.
Run the Health Check Validation Engine (HCVE) of RDA:

$ ./rda.pl -T hcve
Processing HCVE tests ...
Available Pre-Installation Rule Sets:
   1. Oracle Database 10g (10.1.0) PreInstall(HP-UX)
   2. Oracle Database 10g R1 (10.1.0) PreInstall (HP-UX Itanium)
   3. Oracle Database 10g R2 (10.2.0) PreInstall (HPUX)
   4. Oracle Database 11g R1 (11.1.0) PreInstall (HPUX)
   5. Oracle Application Server 10g (9.0.4) PreInstall (HP-UX)
   6. Oracle Application Server 10g R2 (10.1.2) PreInstall (HP-UX)
   7. Oracle Portal PreInstall (Generic)
Available Post-Installation Rule Sets:
   8. Oracle Portal PostInstall (generic)
   9. Data Guard PostInstall (Generic)
Enter the HCVE rule set number
Hit 'Return' to accept the default (1)
> 3

Enter value for < Planned ORACLE_HOME location or if set >
Hit 'Return' to accept the default ($ORACLE_HOME)
>

Test "Oracle Database 10g R2 (10.2.0) PreInstall (HPUX)" executed at Tue Dec 23 15:26:31 2008

Test Results
~~~~~~~~~~~~

ID    NAME                 RESULT VALUE
===== ==================== ====== ========================================
   10 OS Certified?        PASSED Certified with 10g R2 RDBMS
   20 User in /etc/passwd? PASSED userOK
   25 Got EXTJOB User?     FAILED ExtjobNotFound
   30 Group in /etc/group? PASSED GroupOK
   40 Input ORACLE_HOME    RECORD $ORACLE_HOME
   50 ORACLE_HOME Valid?   PASSED OHexists
   60 O_H Permissions OK?  PASSED CorrectPerms
   70 Umask Set to 022?    PASSED UmaskOK
   80 LDLIBRARYPATH Unset? PASSED UnSet
   90 SHLIB_PATH Unset?    PASSED UnSet
  100 Other O_Hs in PATH?  PASSED NotFound
  110 oraInventory Permiss FAILED oraInventoryNotOK
  120 /tmp Adequate?       PASSED TempSpaceOK
  130 Swap (in MB)         RECORD 15144
  140 RAM (in MB)          PASSED 14334
  150 SwapToRAM OK?        PASSED SwapToRAMOK
  160 Disk Space OK?       FAILED OnlySpaceForOne
  170 Kernel Parameters OK FAILED [EXECUTABLE_STACK=1] too large [MAXUP..>
  175 Links and Libs OK?   PASSED AllExist
  180 Got ld,nm,ar,make?   PASSED ld_nm_ar_make_found
  190 ulimits OK?          PASSED ulimitOK
  200 Got OS Bundles?      PASSED GOLDAPPS11iandGOLDBASE11iAdequate
  210 Got OS Patches?      FAILED [PHNE_31097 or its successor PHNE_324..>
  220 Other OUI Up?        PASSED NoOtherOUI

The output file contains detailed information about why a specific check failed.

[PHNE_31097 or its successor PHNE_32477 or its successor PHNE_33498 or its successor PHNE_35418] not installed
[PHSS_31221 or its successor PHSS_33263 or its successor PHSS_33944] not installed
[PHSS_30970 or its successor PHSS_33033 or its successor PHSS_35379] not installed
[PHSS_32508 or its successor PHSS_34411 or its successor PHSS_35099] not installed
[PHSS_32509 or its successor PHSS_34412 or its successor PHSS_35098] not installed
[PHSS_32510 or its successor PHSS_34413 or its successor PHSS_35100] not installed
FAILED
[EXECUTABLE_STACK=1] too large
[MAXUPRC=1024] too small
[MSGMNI=50] too small
[MSGTQL=40] too small
[NCSIZE=9964] too small
[NFILE=5000] too small
[NINODE=4844] too small
[SHMMAX=1024000000] too small
[MAXSWAPCHUNKS=2048] too small
[MSGMAP=42] too small
== KernelOK
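
To pull only the failed checks out of a saved result listing, a simple filter is enough. This is a sketch under the assumption that the results are stored one check per line in the ID / NAME / RESULT / VALUE layout of the Test Results listing; hcve_failed is a made-up name and not an RDA command.

```shell
# Print only the result rows whose RESULT column is FAILED.
# Assumption: each row starts with a numeric ID and holds one check per line.
hcve_failed() {
  grep -E '^ *[0-9]+ .* FAILED ' "$@"
}

# Possible use (the file name is a placeholder for wherever the listing was saved):
#   hcve_failed hcve_results.txt
```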

Installing LTOM for RAC Hanganalyze

By Martin | December 17th, 2008 | Category: 10g, 11g, 9iR2, MetaLink, Oracle Database, Real Application Clusters | No Comments »

I recently wrote short installation instructions for a customer on using LTOM for RAC hanganalyze.

Download from MetaLink Note 352363.1:
https://metalink2.oracle.com/cgi-bin/cr/getfile.cgi?p_attid=352363.1:ltom

Unpack as user oracle in a directory with sufficient space; the subdirectory ltom will be created:

tar xfv ltom411.tar

Install statspack

Installation: db account ltom/ltom will be created

Check that the Java binary is in the path:
which java

If not, add Java to the path in ~/.bash_profile:
As user oracle, add $ORACLE_HOME/jre/1.4.2/bin to the PATH variable

e.g.:
export PATH=$PATH:$ORACLE_HOME/bin:$ORACLE_HOME/jre/1.4.2/bin

Put TOM_HOME in ~/.bash_profile:
export TOM_HOME=/home/oracle/ltom/tom_base/install/../tom

Source ~/.bash_profile so that TOM_HOME is set, then run the installer:

cd /home/oracle/ltom/tom_base/install
./autoinstall.sh

Sample Output:
[oracle@ora-vm1 install]$ ./autoinstall.sh
TOM_HOME=/home/oracle/ltom/tom_base/install/../tom
rm: cannot remove `autoinstall.log': No such file or directory

Preparing to install LTOM v4.0.0…

Checking your Oracle Environment…

ORACLE_HOME=/u01/app/oracle/product/10.2.0
ORACLE_SID=ORA102
TNS_ADMIN=
TOM_HOME=/home/oracle/ltom/tom_base/install/../tom

Is this the correct Oracle environment you wish to use for installing LTOM?
Hit Return to Accept Default:
============================> \c

Continuing the install with this environment

Checking for Java…

Java found.

Checking for top utility…
Found top utility.

Checking for vmstat utility…
Found vmstat utility.

Need to run some sql now to create TOM user
Hit Return to allow setup to do this for you, enter other to do this manually:
============================> \c

Tom user created successfully.
Warning…Please change tom password immediately!

Grant user Tom successful.
Create package dbmssupp successful.
Grant execute on dbmssupp successful.

Checking for statspack…

Cannot verify statspack is installed on your system.
LTOM requires statspack to be installed to work properly.

Enter your email id to get notification of hangs via email…
Hit Return to ignore hang notification via email:
============================> \c

Configuring tom_deploy.properties file…

TOM/LTOM successfully installed.

Please review any messages you received above.

REMEMBER: You must have TOM_HOME defined and java in your PATH
before you can run TOM/LTOM.
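
The two prerequisites from this reminder can be checked before every start. A minimal sketch; ltom_env_ok is a hypothetical helper and not part of LTOM:

```shell
# Return 0 only if TOM_HOME is set and a java binary is found in PATH.
ltom_env_ok() {
  [ -n "${TOM_HOME:-}" ] || { echo "TOM_HOME is not set" >&2; return 1; }
  command -v java >/dev/null 2>&1 || { echo "java not found in PATH" >&2; return 1; }
  return 0
}

# Possible use in a start script:
#   ltom_env_ok && "$TOM_HOME"/startltom.sh -s
```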

Configuration

Change ltom/tom_base/tom/init/hangDetect.properties

from:
#RAC_AWARE=TRUE
HANGANALYZE=A
SYSTEMSTATE=266
DELAY=30
HANGANALYZE=A
SYSTEMSTATE=266

to:
RAC_AWARE=TRUE
HANGANALYZE=A
SYSTEMSTATE=266
DELAY=30
HANGANALYZE=A
SYSTEMSTATE=266
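
The change amounts to removing the comment sign in front of the RAC setting. A sketch of doing that with sed, assuming the property is spelled RAC_AWARE as in the commented-out line; enable_rac_aware is a made-up function name:

```shell
# Uncomment the RAC_AWARE=TRUE line; every other line passes through unchanged.
enable_rac_aware() {
  sed 's/^#[[:space:]]*RAC_AWARE=TRUE/RAC_AWARE=TRUE/' "$@"
}

# Possible use (writes a new file so that the original stays available for comparison):
#   enable_rac_aware hangDetect.properties > hangDetect.properties.new
```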

Change the tom database password:

SQL> alter user tom identified by secret;

Add credentials to config file:

vi tom_deploy.properties

DB_USERID=tom
:wq

./startltom.sh -p
Enter same password as in sqlplus. Password will be saved encrypted in tom_deploy.properties

Edit tom_deploy.properties to decide which of the 3 modules should run during background operations: hangdetect, profiler, session recorder.

Start/Stop Scripts

vi start_ltom.sh
#!/bin/bash
. ~/.bash_profile
nohup $TOM_HOME/startltom.sh -s &

vi stop_ltom.sh
#!/bin/bash
touch $TOM_HOME/tmp/hang.stop
touch $TOM_HOME/tmp/profile.stop
touch $TOM_HOME/tmp/session.stop

chmod u+x start_ltom.sh
chmod u+x stop_ltom.sh

If you want to run LTOM only during a specific timeframe, use cron (as user oracle):

crontab -e
45 01 * * * /home/oracle/ltom/start_ltom.sh >/dev/null 2>/dev/null </dev/null
00 05 * * * /home/oracle/ltom/stop_ltom.sh >/dev/null 2>/dev/null </dev/null

Troubleshooting Trace Files of DIAG Background Process

By Martin | December 10th, 2008 | Category: 10g, 11g, 9iR2, MetaLink, Oracle Database, Real Application Clusters | No Comments »

In trace files of the background process DIAG, I saw an interesting section that I would like to share.

*** 2008-12-09 03:46:33.280
Dump requested by process [orapid=6]
REQUEST:custom dump [2] with parameters [6][27][4][2]
Dumping process info of pid[27.31384] requested by pid[6.31160]
ORA-00494: Message 494 not found; No message file for product=RDBMS, facility=ORA; arguments: [ [CF]] [ (more than 900 seconds)] [inst 2, osid 31384]
-------------------------------------------------------------------------------
ENQUEUE [CF] HELD FOR TOO LONG
enqueue holder: 'inst 2, osid 31384'
Process 'inst 2, osid 31384' is holding an enqueue for maximum allowed time.
The process will be terminated.
Oracle Support Services triaging information: to find the root-cause, look
at the call stack of process 'inst 2, osid 31384' located below. Ask the
developer that owns the first NON-service layer in the stack to investigate.
Common service layers are enqueues (ksq), latches (ksl), library cache
pins and locks (kgl), and row cache locks (kqr).

It is very nice of diag to tell me what to do with the output. 😉 Especially:

Ask the developer that owns the first NON-service layer in the stack to investigate.

System State Dump Evaluation with ass.awk

By Martin | December 10th, 2008 | Category: 10g, 11g, 9iR2, MetaLink, Oracle Database, Real Application Clusters | No Comments »

I have found out that there is an Oracle Support utility called ass.awk, which can be used to analyze and format a System State dump. It is contained in LTOM, which can be downloaded from MetaLink.

The AWK script is called ass109.awk and can be used in this manner:

[oracle@ora-vm1 ~]$ awk -f ass109.awk mddb1_diag_12345.trc
 
System State ASS:
Starting Systemstate 1
..............................................................................
.....
Ass.Awk Version 1.0.9 - Processing mddb1_diag_12345.trc
System State 1
~~~~~~~~~~~~~~~~
1:
2: waiting for 'pmon timer' wait
3: last wait for 'ksdxexeotherwait'
4: waiting for 'rdbms ipc message' wait
5: waiting for 'rdbms ipc message' wait
6: waiting for 'ges remote message' wait
7: waiting for 'gcs log flush sync' wait
8: waiting for 'gcs log flush sync' wait
9: waiting for 'rdbms ipc message' wait
10: waiting for 'rdbms ipc message' wait
11: waiting for 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
12: waiting for 'rdbms ipc message' wait
13: waiting for 'buffer busy waits' (2,a9,3b) wait
14: waiting for 'rdbms ipc message' wait
15: waiting for 'rdbms ipc message' wait
16: waiting for 'buffer busy waits' (2,a9,3b) wait
17: waiting for 'rdbms ipc message' wait
18:
19:
20: waiting for 'rdbms ipc message' wait
21: waiting for 'log file sync' wait
22: waiting for 'ASM background timer' wait
23: waiting for 'rdbms ipc message' wait
24: waiting for 'log file sync' wait
25: waiting for 'class slave wait' wait
26: waiting for 'SQL*Net message from client' wait
27: waiting for 'SQL*Net message from client' wait
28: waiting for 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
29: waiting for 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
30: waiting for 'Streams AQ: qmn coordinator idle wait' wait
31: waiting for 'SQL*Net message from client' wait
32: waiting for 'SQL*Net message from client' wait
Cmd: Select
33: waiting for 'log file sync' wait
34: waiting for 'Streams AQ: qmn slave idle wait' wait
35: for 'Streams AQ: waiting for time management or cleanup tasks' wait
36: waiting for 'log file sync' wait
37: waiting for 'SQL*Net message from client' wait
Cmd: Select
38: waiting for 'SQL*Net message from client' wait
39: waiting for 'SQL*Net message from client' wait
40: waiting for 'PX Deq: Execution Msg' wait
41: waiting for 'log file sync' wait
42: waiting for 'log file sync' wait
43: waiting for 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
Cmd: Select
44: waiting for 'SQL*Net message from client' wait
45: waiting for 'SQL*Net message from client' wait
46: waiting for 'SQL*Net message from client' wait
47: waiting for 'SQL*Net message from client' wait
48: waiting for 'SQL*Net message from client' wait
49: waiting for 'log file sync' wait
50: waiting for 'SQL*Net message from client' wait
51: waiting for 'SQL*Net message from client' wait
Cmd: Select
52: waiting for 'row cache lock' [Rcache object=0xa5414098,] wait
Cmd: Select
53: waiting for 'SQL*Net message from client' wait
54: waiting for 'SQL*Net message from client' wait
55: for 'Streams AQ: waiting for messages in the queue' wait
56: waiting for 'SQL*Net message from client' wait
57: waiting for 'SQL*Net message from client' wait
58: waiting for 'log file sync' wait
Cmd: Insert
59: waiting for 'jobq slave wait' wait
60: waiting for 'jobq slave wait' wait
61: waiting for 'SQL*Net message from client' wait
62: waiting for 'SQL*Net message from client' wait
63: waiting for 'SQL*Net message from client' wait
65: waiting for 'SQL*Net message from client' wait
Cmd: Select
66: waiting for 'SQL*Net message from client' wait
67: last wait for 'SQL*Net message from client'
68: waiting for 'SQL*Net message from client' wait
69: waiting for 'SQL*Net message from client' wait
70: waiting for 'SQL*Net message from client' wait
71: waiting for 'SQL*Net message from client' wait
72: waiting for 'SQL*Net message from client' wait
73: waiting for 'SQL*Net message from client' wait
74: waiting for 'SQL*Net message from client' wait
Cmd: Select
75: waiting for 'SQL*Net message from client' wait
Cmd: Select
76: waiting for 'SQL*Net message from client' wait
77: waiting for 'log file sync' wait
78: waiting for 'SQL*Net message from client' wait
80: waiting for 'SQL*Net message from client' wait
Cmd: Select
82: waiting for 'SQL*Net message from client' wait
84: waiting for 'log file sync' wait
86: waiting for 'SQL*Net message from client' wait
93: waiting for 'SQL*Net message from client' wait
94: waiting for 'SQL*Net message from client' wait
Blockers
~~~~~~~~
Above is a list of all the processes. If they are waiting for a resource
then it will be given in square brackets. Below is a summary of the
waited upon resources, together with the holder of that resource.
Notes:
~~~~~
o A process id of '???' implies that the holder was not found in the
systemstate.
Resource                        Holder State
Enqueue CF-00000000-00000000    12: waiting for 'rdbms ipc message'
Rcache object=0xa5414098,       ??? Blocker
Object Names
~~~~~~~~~~~~
Enqueue CF-00000000-00000000
Rcache object=0xa5414098,

With this information, you can dig further into the system state dump by looking at rcache object 0xa5414098. The “state objects” form a hierarchical structure: every state object has an address (e.g. “SO: 0xa6a5995c”) and an owner, which is its parent entry. From these, you can construct a tree of state objects and find dependencies.

 SO: 0xa6a5995c, type: 2, owner: (nil), flag: INIT/-/-/0x00
 (process) Oracle pid=52, calls cur/top: 0x9d62eb1c/0x9d5ccfcc, flag: (0) -
           int error: 0, call error: 0, sess error: 0, txn error 0
 (post info) last post received: 0 0 163
             last post received-location: kqrbtm
             last process to post me: a6a4e1dc 1 6
             last post sent: 0 0 24
             last post sent-location: ksasnd
             last process posted by me: a6a4e1dc 1 6
   (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0xa6baf79c
    O/S info: user: oracle, term: UNKNOWN, ospid: 29032
    OSD pid info: Unix process pid: 29032, image: oracle@ora-vm1
    Short stack dump: ksdxfstk()+19<-ksdxcb()+1321<-sspuser()+102<-__kernel_vsyscall()+16

    SO: 0xa4123d20, type: 4, owner: 0xa6a5995c, flag: INIT/-/-/0x00
    (session) sid: 955 trans: (nil), creator: 0xa6a5995c, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-0034-00002227, short-term DID: 0001-0034-00002228
              txn branch: (nil)
              oct: 3, prv: 0, sql: 0x8fee9d44, psql: 0x8fee9f00, user: 23/DBSNMP
     service name: SYS$USERS
     O/S info: user: oracle, term: , ospid: 28849, machine: ora-vm1
               program: emagent@ora-vm1(TNS V1-V3)
     application name: emagent@ora-vm1(TNS V1-V3), hash value=2420928497
     waiting for 'row cache lock' blocking sess=0x(nil) seq=127 wait_time=0 seconds since wait started=0
                 cache id=d, mode=0, request=5
     Dumping Session Wait History
      for 'row cache lock' count=1 wait_time=1337230
                 cache id=d, mode=0, request=5
      for 'row cache lock' count=1 wait_time=2930561
                 cache id=d, mode=0, request=5
      for 'row cache lock' count=1 wait_time=2929106
                 cache id=d, mode=0, request=5
      for 'row cache lock' count=1 wait_time=2933326

      SO: 0x9d5ccfcc, type: 3, owner: 0xa6a5995c, flag: INIT/-/-/0x00
     (call) sess: cur a4123d20, rec a4123d20, usr a4123d20; depth: 0
       ----------------------------------------
       SO: 0x9d51b1a0, type: 3, owner: 0x9d5ccfcc, flag: INIT/-/-/0x00
       (call) sess: cur a4123d20, rec a417f990, usr a4123d20; depth: 1
         ----------------------------------------
         SO: 0x9d69d7f0, type: 3, owner: 0x9d51b1a0, flag: INIT/-/-/0x00
         (call) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 2
         ----------------------------------------
           SO: 0x9d5e547c, type: 3, owner: 0x9d69d7f0, flag: INIT/-/-/0x00
           (call) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 3
           ----------------------------------------
             SO: 0x9d62eb1c, type: 3, owner: 0x9d5e547c, flag: INIT/-/-/0x00
             (call) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 4
             ----------------------------------------
               SO: 0x9fedf748, type: 40, owner: 0x9d62eb1c, flag: INIT/-/-/0x00
               (trans) flg = 0x21, flg2 = 0x80000, prx = 0x0, ros = 2147483647 bsn = 0x395 bndsn = 0x395 spn = 0x395
               efd = 34
               parent xid: 0x0000.000.00000000
               env: (scn: 0x0000.00000000 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: xid:
               0x0000.000.00000000 scn: 0x0000.00000000 0sch: scn: 0x0000.00000000)
               cev: (spc = 0 arsp = 0 ubk tsn: 0 rdba: 0x00000000 useg tsn: 0 rdba: 0x00000000
                     hwm uba: 0x00000000.0000.00 col uba: 0x00000000.0000.00
                     num bl: 0 bk list: 0x0)
                     cr opc: 0x0 spc: 0 uba: 0x00000000.0000.00
               (enqueue) TX-00010011-00D3B4CF DID: 0001-0051-00001323
               lv: 7c ac ad 4d 03 00 00 00 00 00 00 00 00 00 00 00 res_flag: 0x45
               res: 0xa447bb2c, lock_flag: 0x0
               own: 0xa4157e90, sess: 0xa4157e90, prv: 0x9fedf774
               slk: (nil)
                xga: 0x0, heap: UGA
               Trans IMU st: 0 Pool index 65535, Redo pool 0x9fedfb74, Undo pool 0x9fedfbf0
               Redo pool range [0x40ed57e8 0x40ed57e8 0x40ed7fe8]
               Undo pool range [0x40ed2fe8 0x40ed2fe8 0x40ed57e8]
               ----------------------------------------
                  SO: 0x9d42237c, type: 50, owner: 0x9fedf748, flag: INIT/-/-/0x00
                  row cache enqueue: count=1 session=0xa4123d20 object=0xa5414098, request=X
                  savepoint=0x395
                  row cache parent object: address=0xa5414098 cid=13(dc_sequences)
                  hash=9649e7e typ=9 transaction=(nil) flags=00000000
                  own=0xa5414104[0xa5414104,0xa5414104] wat=0xa541410c[0x9d422398,0x9d422398] mode=N
                  status=-/-/-/-/-/-/-/-/-
                  request=X release=FALSE flags=2
                  instance lock id=QN 010cdbb3 b3b24847

In his seminar, Tanel Poder showed how to use “grep -A 3 SO:” to find all lines containing state object addresses plus the 3 lines that follow each, which is enough to see the tree.
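
Going one step further, the owner relationships can be printed as parent/child pairs. This is only a sketch, assuming header lines of the form “SO: <address>, type: <n>, owner: <address>, ...” as in the dump above; so_edges is a name invented for this example.

```shell
# Print one "owner -> state object (type n)" line per state object header.
# Assumption: headers look like "SO: 0x..., type: N, owner: 0x... or (nil), flag: ..."
so_edges() {
  sed -n 's/.*SO: \(0x[0-9a-f]*\), [Tt][Yy][Pp][Ee]: \([0-9]*\), owner: \([^,]*\),.*/\3 -> \1 (type \2)/p' "$@"
}

# Possible use with the trace file from above:
#   grep -A 3 'SO:' mddb1_diag_12345.trc | less
#   so_edges mddb1_diag_12345.trc | grep 0xa6a5995c
```

Filtering the pairs for one address shows both its owner and the objects it owns.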

SQL Profiler TVD$XTAT update available

By Martin | November 27th, 2008 | Category: 10g, 11g, 9iR2, Oracle Database, Performance Tuning | No Comments »

I just saw that Christian Antognini has released an update of TVD$XTAT. I am looking forward to experimenting with it. If you used tkprof in the past, you should definitely take a look at this tool.

Formatted SQL Plan Display in Oracle 9i

By Martin | October 19th, 2008 | Category: 9iR2, Oracle Database, SQL | No Comments »

Tom Kyte has posted a way to display the SQL Explain Plan from v$sql_plan in a formatted way in Oracle 9i on AskTom. Whereas this is easily possible in 10g with DBMS_XPLAN, in 9i this poses a bit of a challenge:

Step 1: Create a view dynamic_plan_table

create or replace view dynamic_plan_table
as
select
rawtohex(address) || '_' || child_number statement_id,
sysdate timestamp, operation, options, object_node,
object_owner, object_name, 0 object_instance,
optimizer, search_columns, id, parent_id, position,
cost, cardinality, bytes, other_tag, partition_start,
partition_stop, partition_id, other, distribution,
cpu_cost, io_cost, temp_space, access_predicates,
filter_predicates
from v$sql_plan;

Step 2: Extract SQL child_number, rawtohex(address) from v$sql for a known SQL hash value:

SQL> select rawtohex(address), child_number from v$sql where hash_value = '4163478529';

Step 3: Query DBMS_XPLAN with created view

SQL> select plan_table_output
  from table( dbms_xplan.display( 'dynamic_plan_table',
         (select rawtohex(address)||'_'||child_number x
            from v$sql
           where hash_value = '3471874038'
             and rawtohex(address) = '00000003A924BBD0'
             and child_number = 1 ),
         'ALL' ) );
