Archive for November 2010

Presentation “Minimal Downtime Oracle 11g Upgrade” at DOAG Conference 2010

I have uploaded the presentation material from my DOAG Conference presentation on “Minimal Downtime Oracle 11g Upgrade”. The material contains a paper, presentation and an online demo which shows how Logical Standby SQL Apply can be used for minimal downtime 10g to 11g Upgrade. Additionally, for convenience, i provide all 3 components in a ZIP package.

DOAG Conference 2010: Minimal Downtime Oracle 11g Upgrades



ADR packages and Oracle Support

Today I decided to take advantage of adrci and incident package generation for upload to Oracle support. Guess what the first request from Oracle support was:

Please upload the text version of the ASM alert.log.

.

Very funny. 🙁



Grid Control 11g: Agent Metric Swap Utilization on HP-UX with Pseudo-Swap

If you are running Grid Control on HP-UX and you are using pseudo-swap, then you have to add the property:

NMUPM_USE_PSEUDO_MEM=TRUE

to the emd.properties. Otherwise, Swap Utilization is calculated wrongly (higher than acutal) and it might trigger false alarms.

How is the Grid Control Agent calculating Swap Utilization?

In file $AGENT_HOME/sysman/admin/metadata/host.xml line 310 you find some answers:

310
311
<ColumnDescriptor NAME="swapUtil" TYPE="NUMBER" IS_KEY="FALSE" 
COMPUTE_EXPR="(100.0 * usedSwap / (usedSwap + freeSwap))">

In other parts, you find the definition of “usedSwap” and “freeSwap”.

310
311
312
313
 <ColumnDescriptor NAME="usedSwap" TYPE="NUMBER" IS_KEY="FALSE" 
COMPUTE_EXPR="(usedSwapRaw / 1024.0)" HELP="NO_HELP">
 <ColumnDescriptor NAME="freeSwap" TYPE="NUMBER" IS_KEY="FALSE" 
COMPUTE_EXPR="(freeSwapRaw / 1024.0)" HELP="NO_HELP">

Now, we have to find “usedSwapRaw” and “freeSwapRaw”. In a different part (line 387) of this file, you see this section:

387
388
389
390
391
392
393
394
395
396
397
398
399
<GetView NAME="LoadInternalView" FROM_TABLE="_LoadInternal">
                        <Column NAME="cpuLoad_1min" COLUMN_NAME="cpuLoad_1min"/>
                        <Column NAME="cpuLoad" COLUMN_NAME="cpuLoad"/>
                        <Column NAME="cpuLoad_15min" COLUMN_NAME="cpuLoad_15min"/>
                        <Column NAME="pgScan" COLUMN_NAME="pgScan"/>
                        <Column NAME="noOfProcs" COLUMN_NAME="noOfProcs"/>
                        <Column NAME="transfers" COLUMN_NAME="transfers"/>
                        <Column NAME="pageSize" COLUMN_NAME="pageSize"/>
                        <Column NAME="realMem" COLUMN_NAME="realMem"/>
                        <Column NAME="freeMemRaw" COLUMN_NAME="freeMemRaw"/>
                        <Column NAME="usedSwapRaw" COLUMN_NAME="usedSwapRaw"/>
                        <Column NAME="freeSwapRaw" COLUMN_NAME="freeSwapRaw"/>
        </GetView>

In the “Metric” XML Tag for Metric NAME=”_LoadInternal”, you find the QueryDescriptor:

620
621
622
623
624
625
626
627
628
629
 <QueryDescriptor FETCHLET_ID="OSLineToken">
      <Property NAME="NMUPM_USE_PSEUDO_MEM" SCOPE="INSTANCE" OPTIONAL="TRUE">NMUPM_USE_PSEUDO_MEM</Property>
      <Property NAME="ENVNMUPM_USE_PSEUDO_MEM" SCOPE="GLOBAL">%NMUPM_USE_PSEUDO_MEM%</Property>
      <Property NAME="emdRoot" SCOPE="SYSTEMGLOBAL">emdRoot</Property>
      <Property NAME="command" SCOPE="GLOBAL"> %emdRoot%/bin/nmupm </Property>
      <Property NAME="args" SCOPE="GLOBAL">osLoad</Property>
      <Property NAME="startsWith" SCOPE="GLOBAL">em_result=</Property>
      <Property NAME="delimiter" SCOPE="GLOBAL">|</Property>
      <Property NAME="ENVNMUPM_TIMEOUT" OPTIONAL="TRUE" SCOPE="SYSTEMGLOBAL">NMUPM_TIMEOUT</Property>
    </QueryDescriptor>

So, now we know that the agent executes “nmupm osLoad”. Let´s try it.

1
2
3
$ nmupm osLoad
ncpus=8
em_result=4.10|2.43|4.46|500038168.000000|987|3|0.000000|4096.000000|40775544.000000|10464849920.000000|41094647808.000000|21731692544.000000

The result is pipe seperated and consists of the columns from above. The last two are used “usedSwapRaw” and “freeSwapRaw”.

Please note that you will get different results depending on the value of the environment variable “NMUPM_USE_PSEUDO_MEM”.

1
2
3
4
5
6
bash$ NMUPM_USE_PSEUDO_MEM=FALSE nmupm osLoad
ncpus=8
em_result=0.86|1.84|3.89|501596312.000000|986|3|0.000000|4096.000000|40775544.000000|10436485120.000000|21024821248.000000|47362048.000000
bash$ NMUPM_USE_PSEUDO_MEM=TRUE nmupm osLoad
ncpus=8
em_result=0.90|1.83|3.88|501596312.000000|989|3|0.000000|4096.000000|40775544.000000|10432708608.000000|41164206080.000000|21662134272.000000

Summary

If you are getting warning/critical alerts about Swap Utilization, check with glance or “swapinfo -t” to verify if pseudo-swap is used. If it is, set the property and it will fix the metric error.



Grid Control 11g: Diagnosing OMS High CPU Utilization

Beginning with 11g Oracle Enterprise Manager Grid Control is using Weblogic instead of Oracle Application Server. I am currently experiencing very sluggish Grid Control Console performance and high CPU Utilization of EMGC_OMS1 weblogic server. This article shows how you can diagnose the problem and “assist” Oracle support in helping to solve the issue.

First step to find out what´s going on is to use “top”.

1
2
3
4
5
6
7
8
top - 08:29:12 up 4 days, 20:56,  3 users,  load average: 1.47, 1.60, 1.61
Tasks: 532 total,   1 running, 531 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.3%us,  0.7%sy,  0.0%ni, 97.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132033652k total, 20255244k used, 111778408k free,   728860k buffers
Swap: 20971512k total,        0k used, 20971512k free, 15391704k cached
 
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30181 gc11g    15   0 1778m 1.1g  49m S 99.7  0.9 626:54.30 java

PID 30181 is using 100% (and has been using it for a very long time)

It´s the EMGC_OMS1 server:

1
2
gc11g   30181 30129  9 Nov06 ?        10:27:20 /oracle/jdk/jdk1.6.0_21/bin/java -client -Xms256m -Xmx512m -XX:CompileThreshold=8000 -XX:PermSize=128m -XX:MaxPermSize=512m -Dweblogic.Name=EMGC_OMS1 -Djava.security.policy=/oracle/grid/middleware/wlserver_10.3/server/lib/weblogic.policy -Dweblogic.system.BootIdentityFile=/oracle/grid/gc_inst/user_projects/domains/GCDomain/servers/EMGC_OMS1/data/nodemanager/boot.properties -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.ignoreHostnameVerification=true -Dweblogic.ReverseDNSAllowed=false -DINSTANCE_HOME=/oracle/grid/gc_inst/em/EMGC_OMS1 -DORACLE_HOME=/oracle/grid/middleware/oms11g -Ddomain.home=/oracle/grid/gc_inst/user_projects/domains/GCDomain -Djava.awt.headless=true -Ddomain.name=GCDomain -Docm.repeater.home=/oracle/grid/middleware/oms11g -Djava.security.egd=file:///dev/urandom -Xverify:none -da -Dplatform.home=/oracle/grid/middleware/wlserver_10.3 -Dwls.home=/oracle/grid/middleware/wlserver_10.3/server -Dweblogic.home=/oracle/grid/middleware/wlserver_10.3/server -Ddomain.home=/oracle/grid/gc_inst/user_projects/domains/GCDomain -Dcommon.components.home=/oracle/grid/middleware/oracle_common -Djrf.version=11.1.1 -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Djrockit.optfile=/oracle/grid/middleware/oracle_common/modules/oracle.jrf_11.1.1/jrocket_optfile.txt -Doracle.domain.config.dir=/oracle/grid/gc_inst/user_projects/domains/GCDomain/config/fmwconfig -Doracle.server.config.dir=/oracle/grid/gc_inst/user_projects/domains/GCDomain/config/fmwconfig/servers/EMGC_OMS1 -Doracle.security.jps.config=/oracle/grid/gc_inst/user_projects/domains/GCDomain/config/fmwconfig/jps-config.xml -Djava.protocol.handler.pkgs=oracle.mds.net.protocol -Digf.arisidbeans.carmlloc=/oracle/grid/gc_inst/user_projects/domains/GCDomain/config/fmwconfig/carml -Digf.arisidstack.home=/oracle/grid/gc_inst/user_projects/domains/GCDomain/config/fmwconfig/arisidprovider -Dweblogic.alternateTypesDirectory=/oracle/grid/middleware/oracle_common/modules/oracle.ossoiap_11.1.1,/oracle/grid/middleware/oracle_common/modules/oracle.oamprovider_11.1.1 -Dweblogic.jdbc.remoteEnabled=false -Dweblogic.management.discover=false -Dweblogic.management.server=https://t44grid.wwk-group.com:7101 -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=/oracle/grid/middleware/patch_wls1032/profiles/default/sysext_manifest_classpath weblogic.Server
oel5n1:/home/gc11g >

Next step is to find out which thread is consuming 100% of cpu.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30181   5
30181 gc11g   1821640 1167200 S 15 0 00:00:03  0.0 java            30182   8
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30183   0
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30184   8
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30185  11
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30186  10
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30187   3
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30188  15
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30189   9
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30190   2
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30191  13
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30192   1
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30193   4
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30194  14
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            30195   5
30181 gc11g   1821640 1167200 R 25 0 08:30:49  7.6 java            30196   2       <<<<<<
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30197  10
30181 gc11g   1821640 1167200 S 15 0 00:00:04  0.0 java            30198   4
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30199   5
30181 gc11g   1821640 1167200 S 15 0 00:00:58  0.0 java            30200  15
30181 gc11g   1821640 1167200 S 15 0 00:00:57  0.0 java            30201   2
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30202   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30203  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30204  10
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30496   4
30181 gc11g   1821640 1167200 S 15 0 00:18:50  0.2 java            30497  11
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30498  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30499   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30504   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30505  10
30181 gc11g   1821640 1167200 S 15 0 00:00:11  0.0 java            30510  15
30181 gc11g   1821640 1167200 S 15 0 00:00:38  0.0 java            30513  11
30181 gc11g   1821640 1167200 S 15 0 00:00:38  0.0 java            30514  13
30181 gc11g   1821640 1167200 S 15 0 00:00:37  0.0 java            30517  10
30181 gc11g   1821640 1167200 S 15 0 00:00:39  0.0 java            30518  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30529   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30533   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30588   8
30181 gc11g   1821640 1167200 S 15 0 00:00:01  0.0 java            30589   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30590  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30591   4
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30592   3
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30899   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30901   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30904   3
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30907  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            30910  12
30181 gc11g   1821640 1167200 S 15 0 00:00:02  0.0 java            31205   2
30181 gc11g   1821640 1167200 S 15 0 00:29:45  0.4 java            31526   4
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            31568  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            31593   5
30181 gc11g   1821640 1167200 S 15 0 00:12:02  0.1 java            31600  15
30181 gc11g   1821640 1167200 S 15 0 00:08:27  0.1 java            31601  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32157   4
30181 gc11g   1821640 1167200 S 15 0 00:00:01  0.0 java            32160   2
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32161  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32163   2
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32166  11
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32168   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32169  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32170  10
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32171  10
30181 gc11g   1821640 1167200 S 16 0 00:00:00  0.0 java            32172   3
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32173  10
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32174   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32175  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32176  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32177   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32178   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32179  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32180   9
30181 gc11g   1821640 1167200 S 15 0 00:00:01  0.0 java            32181   6
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32182   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32183   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32264  15
30181 gc11g   1821640 1167200 S 15 0 00:01:02  0.0 java            32265   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32269   8
30181 gc11g   1821640 1167200 S 15 0 00:00:12  0.0 java            32270   1
30181 gc11g   1821640 1167200 S 15 0 00:00:14  0.0 java            32271   6
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32275  10
30181 gc11g   1821640 1167200 S 15 0 00:00:15  0.0 java            32280  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32283  12
30181 gc11g   1821640 1167200 S 15 0 00:10:45  0.1 java            32286  13
30181 gc11g   1821640 1167200 S 15 0 00:00:47  0.0 java            32289  12
30181 gc11g   1821640 1167200 S 15 0 00:02:57  0.0 java            32295   0
30181 gc11g   1821640 1167200 S 15 0 00:00:03  0.0 java            32297  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32301   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32302  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32305   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32306  12
30181 gc11g   1821640 1167200 S 15 0 00:01:09  0.0 java            32307   7
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32308   2
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32309   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32310   4
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32311   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32326  13
30181 gc11g   1821640 1167200 S 15 0 00:00:14  0.0 java            32327  11
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32336  13
30181 gc11g   1821640 1167200 S 15 0 00:00:01  0.0 java            32337   8
30181 gc11g   1821640 1167200 S 15 0 00:00:29  0.0 java            32338   3
30181 gc11g   1821640 1167200 S 15 0 00:00:05  0.0 java            32339   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32340  10
30181 gc11g   1821640 1167200 S 15 0 00:00:18  0.0 java            32341  12
30181 gc11g   1821640 1167200 S 15 0 00:00:02  0.0 java            32342   7
30181 gc11g   1821640 1167200 S 15 0 00:00:29  0.0 java            32343   7
30181 gc11g   1821640 1167200 S 15 0 00:00:10  0.0 java            32344   0
30181 gc11g   1821640 1167200 S 15 0 00:00:05  0.0 java            32351   0
30181 gc11g   1821640 1167200 S 15 0 00:00:02  0.0 java             1943   9
30181 gc11g   1821640 1167200 S 15 0 00:01:19  0.0 java             1944   6
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java             1945  15
30181 gc11g   1821640 1167200 S 15 0 00:00:06  0.0 java             1946  11
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java             2890  15
30181 gc11g   1821640 1167200 S 15 0 00:02:56  0.0 java             7839  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            10208   0
30181 gc11g   1821640 1167200 S 15 0 00:00:02  0.0 java            16818  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            18310   2
30181 gc11g   1821640 1167200 S 15 0 00:00:09  0.0 java              642  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java              690   7
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            27517   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            27518   8
30181 gc11g   1821640 1167200 S 15 0 00:00:01  0.0 java            29228   3
30181 gc11g   1821640 1167200 S 15 0 00:02:46  0.1 java            12075  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            12160   1
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            12829  13
30181 gc11g   1821640 1167200 S 15 0 00:00:02  0.0 java             2773   7
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14748  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14749   6
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14750   1
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14751   0
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14752   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14753   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            14860  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            16706   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            17312   1
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            17313  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            17425  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            17426  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            19993  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            19994  15
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            19995  12
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java             3575   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            16593  14
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            18488   8
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            18489   5
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            21901   7
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java             5572   7
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java             7993   9
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            27543  13
30181 gc11g   1821640 1167200 S 15 0 00:00:00  0.0 java            32075  15

So, it is thread 30196 of PID 30181. Let´s remove the ” sleeping ” threads with grep.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 25 0 08:31:37  7.7 java            30196   6
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 25 0 08:31:39  7.7 java            30196   2
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 25 0 08:31:40  7.7 java            30196   2
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 18 0 08:31:40  7.7 java            30196   2
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 18 0 08:31:41  7.7 java            30196  15
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 19 0 08:31:41  7.7 java            30196  15
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 20 0 08:31:42  7.7 java            30196  13
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 20 0 08:31:42  7.7 java            30196   9
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 21 0 08:31:42  7.7 java            30196   9
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 21 0 08:31:43  7.7 java            30196   9
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 23 0 08:31:43  7.7 java            30196   9
oel5n1:/home/gc11g >  ps -Leo pid,ruser,vsz,rss,state,priority,nice,time,%cpu,comm,lwp,psr | grep 30181 | grep -v " S "
30181 gc11g   1821640 1167200 R 24 0 08:31:43  7.7 java            30196   9

It is always the same thread which is using 7.7% (~1/16 cores) of the all the CPU´s on the system.

let´s take a pstack:

1
oel5n1:/home/gc11g > pstack 30181

let´s grep for the LWP we know:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
 
Thread 135 (Thread 0x4024f940 (LWP 30196)):                           <<<<<<<<<<<<<<<<<<
#0  0x00002b7471e17d3f in instanceKlass::oop_follow_contents ()
#1  0x00002b747203e6c3 in objArrayKlass::oop_follow_contents ()
#2  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#3  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#4  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471e17d3f in instanceKlass::oop_follow_contents ()
#1  0x00002b747203e6c3 in objArrayKlass::oop_follow_contents ()
#2  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#3  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#4  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7472046d27 in objArrayKlass::oop_adjust_pointers ()
#1  0x00002b747209549e in PSMarkSweepDecorator::adjust_pointers ()
#2  0x00002b74720967f0 in PSOldGen::adjust_pointers ()
#3  0x00002b7472094c3a in PSMarkSweep::mark_sweep_phase3 ()
#4  0x00002b74720938f6 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471fee9d3 in MarkSweep::IsAliveClosure::do_object_b ()
#1  0x00002b7471dfad18 in Hashtable::unlink ()
#2  0x00002b74720949ef in PSMarkSweep::mark_sweep_phase1 ()
#3  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#4  0x00002b74720a0737 in PSScavenge::invoke ()
#5  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#6  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#7  0x00002b747219ccfa in VM_Operation::evaluate ()
#8  0x00002b747219c312 in VMThread::evaluate_operation ()
#9  0x00002b747219c583 in VMThread::loop ()
#10 0x00002b747219c08e in VMThread::run ()
#11 0x00002b74720560df in java_start ()
#12 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#13 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471fee706 in MarkSweep::AdjustPointerClosure::do_oop ()
#1  0x00002b7471dfadb1 in Hashtable::oops_do ()
#2  0x00002b7472094bf9 in PSMarkSweep::mark_sweep_phase3 ()
#3  0x00002b74720938f6 in PSMarkSweep::invoke_no_policy ()
#4  0x00002b74720a0737 in PSScavenge::invoke ()
#5  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#6  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#7  0x00002b747219ccfa in VM_Operation::evaluate ()
#8  0x00002b747219c312 in VMThread::evaluate_operation ()
#9  0x00002b747219c583 in VMThread::loop ()
#10 0x00002b747219c08e in VMThread::run ()
#11 0x00002b74720560df in java_start ()
#12 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#13 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471e179e5 in instanceKlass::adjust_static_fields ()
#1  0x00002b7471e234ad in instanceKlassKlass::oop_adjust_pointers ()
#2  0x00002b747209549e in PSMarkSweepDecorator::adjust_pointers ()
#3  0x00002b74720967f0 in PSOldGen::adjust_pointers ()
#4  0x00002b7472094c3a in PSMarkSweep::mark_sweep_phase3 ()
#5  0x00002b74720938f6 in PSMarkSweep::invoke_no_policy ()
#6  0x00002b74720a0737 in PSScavenge::invoke ()
#7  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#8  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#9  0x00002b747219ccfa in VM_Operation::evaluate ()
#10 0x00002b747219c312 in VMThread::evaluate_operation ()
#11 0x00002b747219c583 in VMThread::loop ()
#12 0x00002b747219c08e in VMThread::run ()
#13 0x00002b74720560df in java_start ()
#14 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#15 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471e17e54 in instanceKlass::oop_follow_contents ()
#1  0x00002b747203e6c3 in objArrayKlass::oop_follow_contents ()
#2  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#3  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#4  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b747203e640 in objArrayKlass::oop_follow_contents ()
#1  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#2  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#3  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#4  0x00002b74720a0737 in PSScavenge::invoke ()
#5  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#6  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#7  0x00002b747219ccfa in VM_Operation::evaluate ()
#8  0x00002b747219c312 in VMThread::evaluate_operation ()
#9  0x00002b747219c583 in VMThread::loop ()
#10 0x00002b747219c08e in VMThread::run ()
#11 0x00002b74720560df in java_start ()
#12 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#13 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
 
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471d50445 in ConstantPoolCacheEntry::follow_contents ()
#1  0x00002b7471d4f4f6 in constantPoolCacheKlass::oop_follow_contents ()
#2  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#3  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#4  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6
 
oel5n1:/home/gc11g > pstack 30181 | grep -A 20 30196
Thread 135 (Thread 0x4024f940 (LWP 30196)):
#0  0x00002b7471f8af28 in klassKlass::oop_follow_contents ()
#1  0x00002b7471e226cc in instanceKlassKlass::oop_follow_contents ()
#2  0x00002b7471fee5e3 in MarkSweep::follow_stack ()
#3  0x00002b7472094964 in PSMarkSweep::mark_sweep_phase1 ()
#4  0x00002b7472093859 in PSMarkSweep::invoke_no_policy ()
#5  0x00002b74720a0737 in PSScavenge::invoke ()
#6  0x00002b747206814e in ParallelScavengeHeap::failed_mem_allocate ()
#7  0x00002b747218fbc9 in VM_ParallelGCFailedAllocation::doit ()
#8  0x00002b747219ccfa in VM_Operation::evaluate ()
#9  0x00002b747219c312 in VMThread::evaluate_operation ()
#10 0x00002b747219c583 in VMThread::loop ()
#11 0x00002b747219c08e in VMThread::run ()
#12 0x00002b74720560df in java_start ()
#13 0x0000003130e064a7 in start_thread () from /lib64/libpthread.so.0
#14 0x00000031302d3c2d in clone () from /lib64/libc.so.6

I´m not much of a Java guy, but this tells me that the thread is doing ParallelGC. So, the issue might be related to a configuration of too small Heap size. (Xmx) I will talk with support to decide whether they support to have Xmx increased to 1G or 2G. In case the cpu consumption is not for Garbage Collection but some other OEM code, then there is a command “emctl dump omsthread”, which can be used:

1
2
3
4
5
oel5n1:/home/gc11g > emctl dump omsthread
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Thread dumped successfully for process '30181' to log file 
/oracle/grid/gc_inst/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out

If we calculate LWP 30196 from dec to hex, we get 75f5, this is NID from OMS thread dump.

1
2
3
4
"Reference Handler" daemon prio=10 tid=0x000000004fdff000 nid=0x75f5 waiting for monitor entry [0x0000000040725000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:108)
        - waiting to lock <0x00002aaacf4967a0> (a java.lang.ref.Reference$Lock)

Update 26.11.2010:

Oracle support has recommended to increase Xms/Xmx to 1024m and check whether that is sufficient. This has to be changed in file gc_inst/user_projects/domains/GCDomain/bin/setDomainEnv.sh and in those variables for Linux 64bit:

XMS_SUN_64BIT="1024"
XMX_SUN_64BIT="1024"


Advert: Presentation about Minimal Downtime Upgrade at DOAG Conference 2010

Just a short information:  I will be holding a presentation at DOAG Conference 2010 in NĂźrnberg about “Minimal Downtime Oracle Database 11g Upgrades“. The presentation will be on Tuesday, 16.11.2010 at 10:00. The presentation will be in german. This is a short preview in german:

Nachdem der Premier Support fßr das Datenbank-Release 10gR2 im Juli 2010 endet und nur das erste Jahr des Extended Supports kostenfrei ist, stehen viele Kunden vor dem Problem des Upgrades von Oracle 10g nach 11gR1 oder 11gR2 bis spätestens Juli 2011. Dieser Vortrag zeigt MÜglichkeiten, den Oracle-Upgrade mittels Logical Standby sowie mittels Transportable Tablespaces mit minimaler Downtime durchzufßhren. Es wird aufgezeigt, welche Voraussetzungen fßr die beiden Alternativen jeweils benÜtigt werden sowie die potentiellen Probleme bei der Durchfßhrung. Der Vortrag enthält zudem einen praktischen Teil, bei dem die beiden Varianten demonstriert werden.

You can find the presentation materials (including online demo) on the presentations page.