MetaLink

Diagnosing “cursor: pin S wait on X”

I came across a metalink note (Note 786507.1) which describes a very short way to identify the blocker of a session waiting on mutex “cursor: pin S wait on X”. Previously, I thought it could only be analyzed with system state dumps, but the note describes that parameter P2RAW of wait event “cursor: pin S wait on X” contains the blocking session id and the refcount. For 32bit environments the field contains 4 bytes (2 bytes for session id, 2 bytes for refcount). On 64 bit environments, the field contains 8 bytes (4 vs. 4).

select p2raw from v$session where event = ‘cursor: pin S wait on X’

The first 2 or 4 bytes from p2raw have to be converted to dec and point to the blocking Session ID. (SID)



Oracle RAC 10.2.0.4 bug when adding/dropping redo log groups

On the oracle-l list, there was a post about a nasty bug in Oracle RAC 10.2.0.4 which can cause LGWR errors, instance failures and redo log corruptions.

The bug occurs if you add/remove redo log groups on a cluster database, which has some of it´s instances shut down.

More information about this Bug, which is tracked with Bug ID 6786022 can be found on MetaLink. There is a patch for 10.2.0.4 and Linux x86-64 available.



ASSM Problem with too low PCTFREE

During the Hotsos Symposium 2009 Training Day, Jonathan Lewis presented a problem which appears even on current 10g/11g databases. What is especially interesting is how this issue can be diagnosed. I reproduced the problem in 11.1.0.7 and will provide the steps you can use to verify. The problem can be demonstrated best when comparing response time of an update statement for 8k blocksize and 16k blocksize.

  • 8k Blocksize
  • SQL> DROP TABLE t1_8k purge;
     
    SQL> CREATE TABLESPACE DEMO8K datafile SIZE 128M blocksize 8192;
     
    TABLESPACE created.
     
    Elapsed: 00:00:07.35
     
    SQL> CREATE TABLE mdecker.t1_8k
    (n1 NUMBER,
     n2 NUMBER)
     TABLESPACE DEMO8K;
     
    TABLE created.
     
    Elapsed: 00:00:00.02
     
    SQL> INSERT INTO t1_8k
    SELECT TRUNC(dbms_random.VALUE(10000000,100000000)) n1,
               TO_NUMBER(NULL) AS n2
        FROM dual
    CONNECT BY LEVEL <= 500000
    /
     
     
    500000 ROWS created.
     
    Elapsed: 00:00:06.89
     
    SQL> BEGIN dbms_stats.gather_table_stats(
            ownname => 'MDECKER',
            tabname => 'T1_8K',
            estimate_percent => 100);
    END;
    /
     
    PL/SQL PROCEDURE successfully completed.
     
    Elapsed: 00:00:02.13
     
    SQL> SELECT num_rows,blocks FROM dba_tables WHERE table_name = 'T1_8K';
     
      NUM_ROWS     BLOCKS
    ---------- ----------
        500000        874
     
    Elapsed: 00:00:00.15
    SQL> UPDATE t1_8k SET n2 = n1;
     
    500000 ROWS updated.
     
    Elapsed: 00:01:09.04
    SQL> COMMIT;

  • 16k Blocksize
  • SQL> CREATE TABLESPACE DEMO16K datafile SIZE 128M blocksize 16384;
     
    TABLESPACE created.
     
    Elapsed: 00:00:14.75
     
    SQL> CREATE TABLE mdecker.t1_16k
    (n1 NUMBER,
     n2 NUMBER)
     TABLESPACE DEMO16K;
     
    TABLE created.
     
    Elapsed: 00:00:00.03
     
    SQL> INSERT INTO t1_16k
    SELECT TRUNC(dbms_random.VALUE(10000000,100000000)) n1,
               TO_NUMBER(NULL) AS n2
        FROM dual
    CONNECT BY LEVEL <= 500000
    /  
     
     
    500000 ROWS created.
     
    Elapsed: 00:00:05.51
    SQL> BEGIN dbms_stats.gather_table_stats(
            ownname => 'MDECKER',
            tabname => 'T1_16K',
            estimate_percent => 100);
    END;
    /  
     
    PL/SQL PROCEDURE successfully completed.
     
    Elapsed: 00:00:01.78
    SQL> SELECT num_rows,blocks FROM dba_tables WHERE table_name = 'T1_16K';
     
      NUM_ROWS     BLOCKS
    ---------- ----------
        500000        436
     
    Elapsed: 00:00:00.07
     
    SQL>  UPDATE t1_16k SET n2 = n1;
     
    500000 ROWS updated.
     
    Elapsed: 00:20:20.89

    As you can see, the update statement for the 8k blocksize table took around 69 seconds whereas the same update for the table in the 16k tablespace took more than 20 minutes. When executing oradebug short_stack, you can see that for the 16k update, the stacktrace is similar for many executions. So, a lot of time is spent in the kernel functions ktspfsrch() and ktspscan_bmb().


    SQL> oradebug setospid 23668
    Oracle pid: 18, Unix process pid: 23668, image: oracle@ora-vm1.intra (TNS V1-V3)
    SQL> oradebug short_stack
    .....ktspfsrch()+559<-ktspscan_bmb()+315 .....
    SQL> oradebug short_stack
    .....ktspfsrch()+559<-ktspscan_bmb()+315 .....
    SQL> oradebug short_stack
    .....ktspfsrch()+559<-ktspscan_bmb()+315 .....

    It is important to understand that the problem is not necessarily related to the blocksize, but to the PCTFREE value. More information about this topic can be found here:

    http://structureddata.org/files/jl_test_case.html
    http://structureddata.org/2008/09/08/understanding-performance/
    http://www.oraclealchemist.com/oracle/hey-guys-does-size-matter/



Intelligent (aka adaptive) Cursor Sharing Problem

Today, at the last day of Hotsos Symposium 2009 Ric van Dyke presented on “Intelligent (aka adaptive) Cursor Sharing”, a new feature of Oracle Database 11g. The feature was created to fix the shortcomings of bind peeking on columns with histograms. Tom Kyte presented a few years ago on this topic when he talked about the impact of the wheather on Oracle database performance.

During the very interesting presentation Ric presented with a demo how this feature works but also showed a nasty side effect. He tested it on a million row table with a column “object type”, which was skewed. He bound several different values to the object_type bind variable causing 2 different execution plans to be used for the same sql_id. The problem with it was that he ended up with dozens of child cursors showing V$SQL.IS_SHARABLE => N.

It was not known whether this is a bug or whether the whole concept is flawed. I searched on metalink and found the appropriate document:

Bug 6644714 – High number of child cursors with adaptive cursor sharing
This issue is fixed in

* 11.2 (Future Release)
* 11.1.0.7 (Server Patch Set)
Lots of child cursors may be produced with adaptive cursor sharing enabled,
sometimes with overlapping or same bind value selectivity ranges (in v$sql_cs_selectivity).
This problem shows as LOAD_OPTIMIZER_STATS=’Y’ for many of the cursors in V$SQL_SHARED_CURSOR.

– and –

Bug 7213010 – Adaptive cursor sharing generates lots of child cursors

This issue is fixed in

* 11.2 (Future Release)
* 11.1.0.7 (Server Patch Set)

Lots of child cursors may be produced for a query with bind variables
appearing in at least two simple relational predicates
(eg. col :b).

v$sql_cs_selectivity shows that there are multiple equivalent / overlapping
ranges for a single cursor. The number of ranges gets bigger and bigger
as the query is run more times.

I then tried to verify if the problem indeed is gone in 11.1.0.7 and found out, that it isn´t.

Here is my testcase:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
SQL> @/tmp/setup.sql
Connected.
SQL> DROP TABLE tab1 purge;
 
TABLE dropped.
 
SQL> CREATE TABLE tab1 (a INT,
  2                     b varchar2(20),
  3                                     padding varchar2(200))
  4  tablespace USERS;
 
TABLE created.
 
SQL>
SQL> DROP SEQUENCE myid;
 
SEQUENCE dropped.
 
SQL> CREATE SEQUENCE myid cache 1000;
 
SEQUENCE created.
 
SQL>
SQL> BEGIN
  2  FOR i IN 1..100000
  3  loop
  4          INSERT INTO tab1 VALUES (myid.nextval,
  5                                                           'AAAAAAAAAA',
  6                                                           rpad('*',200,'*'));
  7  END loop;
  8
  9  FOR i IN 1..1000
 10  loop
 11          INSERT INTO tab1 VALUES (myid.nextval,
 12                                                           'BBBBBBBBBB',
 13                                                           rpad('*',200,'*'));
 14  END loop;
 15
 16  FOR i IN 1..1000
 17  loop
 18          INSERT INTO tab1 VALUES (myid.nextval,
 19                                                           'CCCCCCCCCC',
 20                                                           rpad('*',200,'*'));
 21  END loop;
 22
 23  FOR i IN 1..100
 24  loop
 25          INSERT INTO tab1 VALUES (myid.nextval,
 26                                                           'DDDDDDDDDD',
 27                                                           rpad('*',200,'*'));
 28  END loop;
 29
 30  FOR i IN 1..10
 31  loop
 32          INSERT INTO tab1 VALUES (myid.nextval,
 33                                                           'EEEEEEEEEE',
 34                                                           rpad('*',200,'*'));
 35  END loop;
 36
 37  INSERT INTO tab1 VALUES (myid.nextval,
 38                                                           'FFFFFFFFFF',
 39                                                           rpad('*',200,'*'));
 40
 41  commit;
 42  END;
 43  /
 
PL/SQL PROCEDURE successfully completed.
 
SQL>
SQL> CREATE INDEX tab1_idx1 ON tab1 (b);
 
INDEX created.
 
SQL>
SQL> BEGIN dbms_stats.gather_table_stats(ownname => 'MDECKER',
  2                                          tabname => 'TAB1',
  3                                          method_opt =>'FOR ALL COLUMNS SIZE 254',
  4                                          cascade=> TRUE,
  5                                          estimate_percent => 100,
  6                                          no_invalidate => FALSE);
  7  END;
  8  /
 
PL/SQL PROCEDURE successfully completed.
 
SQL>
SQL> SET LINES 300
SQL> col TABLE_NAME format a10
SQL> col owner format a8
SQL> SELECT owner, TABLE_NAME,num_rows, blocks, round((blocks*8192)/1024/1024) AS MB
  2  FROM dba_tables WHERE TABLE_NAME = 'TAB1' AND owner = 'MDECKER';
 
OWNER    TABLE_NAME   NUM_ROWS     BLOCKS         MB
-------- ---------- ---------- ---------- ----------
MDECKER  TAB1           102111       3268         26
 
SQL>
SQL> prompt "Press Enter to continue"
"Press Enter to continue"
SQL> pause
 
SQL>
SQL> /*
SQL> OWNER       TABLE_NAME   NUM_ROWS  BLOCKS         MB
SQL> -------- ---------- ---------- ---------- ----------
SQL> MDECKER  TAB1           102111       3268         26
SQL>
SQL> */
SQL>
SQL>
SQL>
SQL> SELECT b , COUNT(*) FROM mdecker.tab1 GROUP BY b ORDER BY 1 DESC;
 
B                      COUNT(*)
-------------------- ----------
FFFFFFFFFF                    1
EEEEEEEEEE                   10
DDDDDDDDDD                  100
CCCCCCCCCC                 1000
BBBBBBBBBB                 1000
AAAAAAAAAA               100000
 
6 ROWS selected.
 
SQL>
SQL> /*
SQL> B                      COUNT(*)
SQL> -------------------- ----------
SQL> FFFFFFFFFF                    1
SQL> EEEEEEEEEE                   10
SQL> DDDDDDDDDD                  100
SQL> CCCCCCCCCC                 1000
SQL> BBBBBBBBBB                 1000
SQL> AAAAAAAAAA               100000
SQL>
SQL> 6 rows selected.
SQL>
SQL> */
SQL>
SQL> prompt "Press Enter to continue"
"Press Enter to continue"
SQL> pause
 
SQL> SELECT num_buckets, histogram FROM dba_tab_col_statistics WHERE TABLE_NAME = 'TAB1' AND column_name = 'B';
 
NUM_BUCKETS HISTOGRAM
----------- ---------------
          6 FREQUENCY
 
SQL>
SQL> /*
SQL> NUM_BUCKETS HISTOGRAM
SQL> ----------- ---------------
SQL>           6 FREQUENCY
SQL> */
SQL>
SQL>
SQL>
SQL>
SQL>
SQL> SHOW USER
USER IS "MDECKER"
SQL> @/tmp/sql1.txt
SQL> SET LINES 300
SQL> SET echo ON
SQL> SET pages 10000
SQL> col bind format a120
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL>
SQL> variable myvar varchar2(20)
SQL> EXEC :myvar := 'AAAAAAAAAA';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 0
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 2211052296
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | ROWS  | Bytes | Cost (%CPU)| TIME     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   889 (100)|          |
|*  1 |  TABLE ACCESS FULL| TAB1 |   100K|    20M|   889   (1)| 00:00:11 |
--------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   1 - FILTER("B"=:MYVAR)
 
 
18 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
 
SQL>
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL> EXEC :myvar := 'BBBBBBBBBB';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 0
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 2211052296
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | ROWS  | Bytes | Cost (%CPU)| TIME     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   889 (100)|          |
|*  1 |  TABLE ACCESS FULL| TAB1 |   100K|    20M|   889   (1)| 00:00:11 |
--------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   1 - FILTER("B"=:MYVAR)
 
 
18 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
 
SQL>
SQL> EXEC :myvar := 'CCCCCCCCCC';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 1
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 1918680838
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | ROWS  | Bytes | Cost (%CPU)| TIME     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |    36 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB1      |  1000 |   210K|    36   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TAB1_IDX1 |  1000 |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   2 - access("B"=:MYVAR)
 
 
19 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'CCCCCCCCCC', ANYDATA()))           1918680838            1
 
SQL>
SQL> EXEC :myvar := 'DDDDDDDDDD';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 2
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 1918680838
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | ROWS  | Bytes | Cost (%CPU)| TIME     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     5 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB1      |   100 | 21600 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TAB1_IDX1 |   100 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   2 - access("B"=:MYVAR)
 
 
19 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'CCCCCCCCCC', ANYDATA()))           1918680838            1
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'DDDDDDDDDD', ANYDATA()))           1918680838            2
 
SQL>
SQL> EXEC :myvar := 'EEEEEEEEEE';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 3
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 1918680838
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | ROWS  | Bytes | Cost (%CPU)| TIME     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB1      |    10 |  2160 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TAB1_IDX1 |    10 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   2 - access("B"=:MYVAR)
 
 
19 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'CCCCCCCCCC', ANYDATA()))           1918680838            1
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'DDDDDDDDDD', ANYDATA()))           1918680838            2
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'EEEEEEEEEE', ANYDATA()))           1918680838            3
 
SQL>
SQL> EXEC :myvar := 'FFFFFFFFFF';
 
PL/SQL PROCEDURE successfully completed.
 
SQL> prompt "Press enter to run"
"Press enter to run"
SQL> pause
 
SQL> SET termout off
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   -----------------------------------------------------------------
SQL_ID  fhtgpcv2m2j4s, child NUMBER 4
-------------------------------------
SELECT /* MD_TESTCASE */ * FROM tab1 WHERE b = :myvar
 
Plan hash VALUE: 1918680838
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | ROWS  | Bytes | Cost (%CPU)| TIME     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB1      |     1 |   216 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TAB1_IDX1 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (IDENTIFIED BY operation id):
---------------------------------------------------
 
   2 - access("B"=:MYVAR)
 
 
19 ROWS selected.
 
SQL> SELECT sql_id, is_obsolete, is_bind_sensitive, is_bind_aware, is_shareable, dbms_sqltune.extract_binds(bind_data) bind, plan_hash_value, child_number FROM
  2   v$sql WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------
fhtgpcv2m2j4s N Y N Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'CCCCCCCCCC', ANYDATA()))           1918680838            1
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'DDDDDDDDDD', ANYDATA()))           1918680838            2
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'EEEEEEEEEE', ANYDATA()))           1918680838            3
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'FFFFFFFFFF', ANYDATA()))           1918680838            4

You can see that after parsing for bind CCCCCCCC, the execution plan changes from full table scan to index range scan. Moreover child cursor 1 is marked “is_bind_aware=Y”. When now bind DDDDDDDD is used, there is a new child (#2) and child 1 is set to is_sharable=N. After using bind EEEEEEEE, child #2 is set to sharable=N and a new child #3 is created. This repeats until there are 5 child cursors, 2 of which are set to is_sharable=Y. When the script is executed multiple times, there is even a sixth child cursor.

1
2
3
4
5
6
7
8
SQL_ID        I I I I BIND(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED PLAN_HASH_VALUE CHILD_NUMBER EXECUTIONS
------------- - - - - ------------------------------------------------------------------------------------------------------------------------ --------------- ------------ ----------
fhtgpcv2m2j4s N Y N N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            0          2
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'CCCCCCCCCC', ANYDATA()))           1918680838            1          1
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'DDDDDDDDDD', ANYDATA()))           1918680838            2          1
fhtgpcv2m2j4s N Y Y N SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'EEEEEEEEEE', ANYDATA()))           1918680838            3          1
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'FFFFFFFFFF', ANYDATA()))           1918680838            4         16
fhtgpcv2m2j4s N Y Y Y SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 1, 'VARCHAR2(32)', 178, NULL, NULL, 32, '12-MAR-09', 'AAAAAAAAAA', ANYDATA()))           2211052296            5          3

I wondered why the cursors are not sharable and queried v$sql_shared_cursor:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
SQL>   SELECT * FROM v$sql_shared_cursor WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        ADDRESS  CHILD_AD CHILD_NUMBER U S O O S L F E B P I S T A B D L T R I I R L I O E M U T N F A I T D L D B P C S C P T M B M R O P M F L P L A F L R L H
------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
fhtgpcv2m2j4s 23F4E438 23F4DAB0            0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N Y N N N N N N
fhtgpcv2m2j4s 23F4E438 23D924D8            1 N N N N N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
fhtgpcv2m2j4s 23F4E438 23D7E888            2 N N N N N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
fhtgpcv2m2j4s 23F4E438 23D6B8C8            3 N N N N N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
fhtgpcv2m2j4s 23F4E438 23D574D8            4 N N N N N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
fhtgpcv2m2j4s 23F4E438 23B983B0            5 N N N N N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
 
6 ROWS selected.
 
SQL> SELECT sql_id, address, child_number, LOAD_OPTIMIZER_STATS, row_level_sec_mismatch FROM v$sql_shared_cursor WHERE sql_id = 'fhtgpcv2m2j4s';
 
SQL_ID        ADDRESS  CHILD_NUMBER L R
------------- -------- ------------ - -
fhtgpcv2m2j4s 245CE53C            0 Y N
fhtgpcv2m2j4s 245CE53C            1 N Y
fhtgpcv2m2j4s 245CE53C            2 N Y
fhtgpcv2m2j4s 245CE53C            3 N Y
fhtgpcv2m2j4s 245CE53C            4 N Y
fhtgpcv2m2j4s 245CE53C            5 N Y
 
SQL> SELECT * FROM v$sql_cs_statistics WHERE sql_id = 'fhtgpcv2m2j4s';
 
ADDRESS  HASH_VALUE SQL_ID        CHILD_NUMBER BIND_SET_HASH_VALUE P EXECUTIONS ROWS_PROCESSED BUFFER_GETS   CPU_TIME
-------- ---------- ------------- ------------ ------------------- - ---------- -------------- ----------- ----------
245CE53C 3308340376 fhtgpcv2m2j4s            5          3946114705 N          1              0           2          0
245CE53C 3308340376 fhtgpcv2m2j4s            5          2785428864 Y          1              0           2          0
245CE53C 3308340376 fhtgpcv2m2j4s            4          1680726256 Y          1              2           3          0
245CE53C 3308340376 fhtgpcv2m2j4s            3          1438126164 Y          1             20           5          0
245CE53C 3308340376 fhtgpcv2m2j4s            2           451655377 Y          1            200          21          0
245CE53C 3308340376 fhtgpcv2m2j4s            1          4078512309 Y          1           2000         169          0
245CE53C 3308340376 fhtgpcv2m2j4s            0           479772785 Y          1         100000        9823          0
 
7 ROWS selected.
 
SQL> SELECT * FROM v$sql_cs_selectivity WHERE sql_id = 'fhtgpcv2m2j4s';
 
ADDRESS  HASH_VALUE SQL_ID        CHILD_NUMBER PREDICATE                                  RANGE_ID LOW        HIGH
-------- ---------- ------------- ------------ ---------------------------------------- ---------- ---------- ----------
245CE53C 3308340376 fhtgpcv2m2j4s            5 =MYVAR                                            0 0.000004   0.010773
245CE53C 3308340376 fhtgpcv2m2j4s            4 =MYVAR                                            0 0.000009   0.010773
245CE53C 3308340376 fhtgpcv2m2j4s            3 =MYVAR                                            0 0.000088   0.010773
245CE53C 3308340376 fhtgpcv2m2j4s            2 =MYVAR                                            0 0.000881   0.010773
245CE53C 3308340376 fhtgpcv2m2j4s            1 =MYVAR                                            0 0.008814   0.010773

It is not clear, why the cursors are not shared and why this is expressed in the flag “row_level_sec_mismatch”. From these tests we can see that the number of child cursors is still quite high.



Session waiting for “enq: RO – fast object reuse” – DBWR Process spinning on CPU

I have encountered the following problem on a 10.2.0.4 database on Linux x86_64 today:
A user session has been waiting for “enq: RO – fast object reuse” for almost 60 minutes while executing a “truncate table” SQL statement.

SQL> select username, event, sql_id, taddr, last_call_et from v$session where sid = 234;

USERNAME EVENT SQL_ID TADDR LAST_CALL_ET
———- —————————– ————- —————- ————
MD enq: RO – fast object reuse ljk299jlkj003 0000000153264570 3542

SQL> select sql_text from v$sqlstats where sql_id = ‘ljk299jlkj003’;

SQL_TEXT
————————————-
truncate table tab1

The Session was blocked by the CKPT process:

SQL> select * from dba_waiters;

WAITING_SESSION HOLDING_SESSION LOCK_TYPE MODE_HELD MODE_REQUESTED LOCK_ID1 LOCK_ID2
————— ————— ————————– —————————————- —————————————- ———- ———-
234 423 RO Row-S (SS) Exclusive 65573 1

SQL> select sid, serial#, sql_id, last_call_et, machine, program, username from v$session where sid = 423;

SID SERIAL# SQL_ID LAST_CALL_ET MACHINE PROGRAM
———- ———- ————- ———— —————- ——————————–
423 1 4133636 ora-vm1.intra oracle@ora-vm1.intra (CKPT)

The checkpoint process was waiting for database writer DBWR process, which was spinning on one cpu:

top

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10712 oracle 25 0 2201m 1.7g 1.7g R 99.5 21.7 108:18.03 oracle

PID 10712 maps to DBW0:

[oracle@ora-vm1 ]$ ps -ef|grep 10712
oracle 10712 1 0 2008 ? 03:23:05 ora_dbw0_MDDB01

mpstat

Linux 2.6.9-78.ELsmp (ora-vm1.intra) 01/20/2009

02:21:56 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s
02:21:57 PM all 49.75 0.00 0.00 0.00 0.00 0.00 50.25 1055.00
02:21:57 PM 0 0.00 0.00 0.00 0.00 0.00 0.00 100.00 1006.00
02:21:57 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 49.00

02:21:57 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s
02:21:58 PM all 50.75 0.00 0.00 0.50 0.00 0.00 48.76 1161.00
02:21:58 PM 0 1.00 0.00 0.00 1.00 0.00 0.00 98.00 1087.00
02:21:58 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 74.00

The stack of dbw0 during the time showed these signatures:

[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x000000000074b7fb in kslfre ()
#1 0x00000000010ccc3b in kcbo_exam_buf ()
#2 0x00000000010d0d62 in kcbo_service_ockpt ()
#3 0x0000000001080cd7 in kcbbdrv ()
#4 0x00000000007ddcc2 in ksbabs ()
#5 0x00000000007e4b32 in ksbrdp ()
#6 0x0000000002efcb50 in opirip ()
#7 0x00000000012da326 in opidrv ()
#8 0x0000000001e62456 in sou2o ()
#9 0x00000000006d2555 in opimai_real ()
#10 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x000000000074b36d in kslfre ()
#1 0x00000000010cc203 in kcbo_write_process ()
#2 0x00000000010ce608 in kcbo_write_q ()
#3 0x0000000001080a6d in kcbbdrv ()
#4 0x00000000007ddcc2 in ksbabs ()
#5 0x00000000007e4b32 in ksbrdp ()
#6 0x0000000002efcb50 in opirip ()
#7 0x00000000012da326 in opidrv ()
#8 0x0000000001e62456 in sou2o ()
#9 0x00000000006d2555 in opimai_real ()
#10 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x00000000010ccb60 in kcbo_exam_buf ()
#1 0x00000000010d0d62 in kcbo_service_ockpt ()
#2 0x0000000001080cd7 in kcbbdrv ()
#3 0x00000000007ddcc2 in ksbabs ()
#4 0x00000000007e4b32 in ksbrdp ()
#5 0x0000000002efcb50 in opirip ()
#6 0x00000000012da326 in opidrv ()
#7 0x0000000001e62456 in sou2o ()
#8 0x00000000006d2555 in opimai_real ()
#9 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x00000000010d0da5 in kcbo_service_ockpt ()
#1 0x0000000001080cd7 in kcbbdrv ()
#2 0x00000000007ddcc2 in ksbabs ()
#3 0x00000000007e4b32 in ksbrdp ()
#4 0x0000000002efcb50 in opirip ()
#5 0x00000000012da326 in opidrv ()
#6 0x0000000001e62456 in sou2o ()
#7 0x00000000006d2555 in opimai_real ()
#8 0x00000000006d240c in main ()

A MetaLink Research for the term “kcbo_service_ockpt” leads to Bug 7376934, which is a duplicate of Bug 7385253 – DBWR IS CONSUMING HIGH CPU.

Patch 7385253 is available for Linux x86_64, HP-UX, Solaris, AIX.
Reference:
MetaLink Note 762085.1 – Subject: ‘enq: RO – fast object reuse’ contention when gathering schema/table statistics in parallel



Huge Space Consumption by $ORACLE_HOME/.patch_storage

If you are keeping your system up to date with Patchsets, Patch Bundles, Merge Label Requests (MLR) or Critical Patch Updates (CPU), you will most likely suffer from a huge .patch_storage Subdirectory in your $ORACLE_HOME.

On one of my databases it looked like this:

Space used by $ORACLE_HOME: 7 GB
Space used by $ORACLE_HOME/.patch_storage: 4.3 GB

Can data in this directory be removed?

MetaLink Note 550522.1 (Subject: How To Avoid Disk Full Issues Because OPatch Backups Take Big Amount Of Disk Space.) has the answer and tells you: it depends. Normally, this data is used in order to be able to rollback a patch. However, if you have installed a patchset (eg. 10.2.0.4), then the patches for the previous patchset (10.2.0.3) which are located in the .patch_storage directory are not needed anymore and can be removed. However, I would not recommend that you delete the directories manually yourself, as this would not be supported. Instead let Oracle do it for you:

Recent versions of opatch (current is 10.2.0.4.5 as of January 2009) have a utility included, which removes patches not needed anymore from the .patch_storage directory. Moreover, the opatch utility creates these .patch_storage backup directories more intelligently which should result in less space wasted.

[oracle@vmhost1 ora10]$./OPatch/opatch util Cleanup
Invoking OPatch 10.2.0.4.5

Oracle Interim Patch Installer version 10.2.0.4.5
Copyright (c) 2008, Oracle Corporation. All rights reserved.

UTIL session

Oracle Home : /oracle/ora10
Central Inventory : /oracle/oraInventory
from : /var/opt/oracle/oraInst.loc
OPatch version : 10.2.0.4.5
OUI version : 10.2.0.4.0
OUI location : /oracle/ora10/oui
Log file location : /oracle/ora10/cfgtoollogs/opatch/opatch2009-01-15_17-00-51PM.log

Patch history file: /oracle/ora10/cfgtoollogs/opatch/opatch_history.txt

Invoking utility “cleanup”
OPatch will clean up ‘restore.sh,make.txt’ files and ‘rac,scratch,backup’ directories.
You will be still able to rollback patches after this cleanup.
Do you want to proceed? [y|n]
y
User Responded with: Y
Size of directory “/oracle/ora10/.patch_storage” before cleanup is 4575330012 bytes.
Size of directory “/oracle/ora10/.patch_storage” after cleanup is 188326505 bytes.

UtilSession: Backup area for restore has been cleaned up. For a complete list of files/directories
deleted, Please refer log file.

OPatch succeeded.

180 MB instead of 4 GB. I like that.



NUMA enabled in 10.2.0.4

When upgrading from pre 10.2.0.4 to 10.2.0.4, Oracle enables NUMA support. This has the effect that there can be multiple shared memory segments (MetaLink Note: 429872.1) although shmmax/shmall are set to high values.

I have read MetaLink Notes (7171446.8, 6730567.8, 6689903.8) and this blog entry, where a customer had problems on HP-UX with the default NUMA settings.

Better than that, it can also lead to instance crashes in 10.2.0.4 as reported in MetaLink Note 743191.1. Good news is that there is a patch available for Linux x86_64/10.2.0.4.

I have asked Oracle Support whether it is safe to leave NUMA enabled for Linux Itanium, but they would not comment on it. Instead they asked me to check with the OS vendor. Great. ;-(



Installing LTOM for RAC Hanganalyze

I have recently written a short installation instruction for a customer to use LTOM for RAC hanganalyze.

  • Download von MetaLink Note 352363.1:
    https://metalink2.oracle.com/cgi-bin/cr/getfile.cgi?p_attid=352363.1:ltom
  • unzip as User Oracle in directory with sufficient space. subdir ltom will be created

tar xfv ltom411.tar

  • Install statspack
  • Installation: db account ltom/ltom will be created

Java Binary in path
which java

if not, place java in path in ~/.bash_profile:
als User oracle: add $ORACLE_HOME/jre/1.4.2/bin to PATH Variable

e.g.:
export PATH=$PATH:$ORACLE_HOME/bin:$ORACLE_HOME/jre/1.4.2/bin

Put TOM_HOME in .~/.bash_profile:
export TOM_HOME=/home/oracle/ltom/tom_base/install/../tom

Sourcen of TOM_HOME

cd /home/oracle/ltom/tom_base/install
./autoinstall.sh

Sample Output:
[oracle@ora-vm1 install]$ ./autoinstall.sh
TOM_HOME=/home/oracle/ltom/tom_base/install/../tom
rm: cannot remove `autoinstall.log’: No such file or directory

Preparing to install LTOM v4.0.0…

Checking your Oracle Environment…

ORACLE_HOME=/u01/app/oracle/product/10.2.0
ORACLE_SID=ORA102
TNS_ADMIN=
TOM_HOME=/home/oracle/ltom/tom_base/install/../tom

Is this the correct Oracle environment you wish to use for installing LTOM?
Hit Return to Accept Default:
============================> \c

Continuing the install with this environment

Checking for Java…

Java found.

Checking for top utility…
Found top utility.

Checking for vmstat utility…
Found vmstat utility.

Need to run some sql now to create TOM user
it Return to allow setup to do this for you, enter other to do this manually:
============================> \c

Tom user created successfully.
Warning…Please change tom password immediately!

Grant user Tom successful.
Create package dbmssupp successful.
Grant execute on dbmssupp successful.

Checking for statspack…

Cannot verify statspack is installed on your system.
LTOM requires statspack to be installed to work properly.

Enter your email id to get notification of hangs via email…
Hit Return to ignore hang notification via email:
============================> \c

Configuring tom_deploy.properties file…

TOM/LTOM successfully installed.

Please review any messages you received above.

REMEMBER: You must have TOM_HOME defined and java in your PATH
before you can run TOM/LTOM.

  • Configuration

Change ltom/tom_base/tom/init/hangDetect.properties

from:
#RAC_AWARE=TRUE
HANGANALYZE=A
SYSTEMSTATE=266
DELAY=30
HANGANALYZE=A
SYSTEMSTATE=266

to:
RAW_AWARE=TRUE
HANGANALYZE=A
SYSTEMSTATE=266
DELAY=30
HANGANALYZE=A
SYSTEMSTATE=266

Change tom db passwort:

SQL> alter user tom identified by secret;

Add credentials to config file:

vi tom_deploy.properties

DB_USERID=tom
:q!

./startltom.sh -p
Enter same password as in sqlplus. Password will be saved encrypted in tom_deploy.properties

  • edit tom_deploy.properties to decide which of the 3 modules should be running during background operations: hangdetect, profiler, session recorder
  • Start/Stop Scripts

vi start_ltom.sh
#!/bin/bash
. ~/.bash_profile
nohup $TOM_HOME/startltom.sh -s &

vi stop_ltom.sh
#!/bin/bash
touch $TOM_HOME/tmp/hang.stop
touch $TOM_HOME/tmp/profile.stop
touch $TOM_HOME/tmp/session.stop

chmod u+x start_ltom.sh
chmod u+x stop_ltom.sh

  • If you want  to run ltom during a specific timeframe, use cron (as user oracle)

crontab -e
45 01 * * *     /home/oracle/ltom/start_ltom.sh >/dev/null 2>/dev/null </dev/null
00 05 * * *     /home/oracle/ltom/start_ltom.sh >/dev/null 2>/dev/null </dev/null



Troublshooting Trace Files of DIAG Background Process

In trace files of the background process DIAG, I saw an interesting section that I would like to share.


*** 2008-12-09 03:46:33.280
Dump requested by process [orapid=6]
REQUEST:custom dump [2] with parameters [6][27][4][2]
Dumping process info of pid[27.31384] requested by pid[6.31160]
ORA-00494: Message 494 not found; No message file for product=RDBMS, facility=ORA; arguments: [ [CF]] [ (more than 900 seconds)] [inst
2, osid 31384]
-------------------------------------------------------------------------------
ENQUEUE [CF] HELD FOR TOO LONG
enqueue holder: 'inst 2, osid 31384'
Process 'inst 2, osid 31384' is holding an enqueue for maximum allowed time.
The process will be terminated.
Oracle Support Services triaging information: to find the root-cause, look
at the call stack of process 'inst 2, osid 31384' located below. Ask the
developer that owns the first NON-service layer in the stack to investigate.
Common service layers are enqueues (ksq), latches (ksl), library cache
pins and locks (kgl), and row cache locks (kqr).

It is very nice of diag to tell me what to do with the output. 😉 Especially:

Ask the developer that owns the first NON-service layer in the stack to investigate.



System State Dump Evaluation with ass.awk

I have found out that there is an Oracle Support utility called ass.awk, which can be used to analyze and format a System State dump. It is contained in LTOM, which can be downloaded from MetaLink.

The AWK script is called ass109.awk and can be used in this manner:

[oracle@ora-vm1 ~]$ awk -f ass109.awk mddb1_diag_12345.trc
 
System State ASS:
Starting Systemstate 1
..............................................................................
.....
Ass.Awk Version 1.0.9 - Processing mddb1_diag_12345.trc
System State 1
~~~~~~~~~~~~~~~~
1:
2: waiting FOR 'pmon timer' wait
3: LAST wait FOR 'ksdxexeotherwait'
4: waiting FOR 'rdbms ipc message' wait
5: waiting FOR 'rdbms ipc message' wait
6: waiting FOR 'ges remote message' wait
7: waiting FOR 'gcs log flush sync' wait
8: waiting FOR 'gcs log flush sync' wait
9: waiting FOR 'rdbms ipc message' wait
10: waiting FOR 'rdbms ipc message' wait
11: waiting FOR 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
12: waiting FOR 'rdbms ipc message' wait
13: waiting FOR 'buffer busy waits' (2,a9,3b) wait
14: waiting FOR 'rdbms ipc message' wait
15: waiting FOR 'rdbms ipc message' wait
16: waiting FOR 'buffer busy waits' (2,a9,3b) wait
17: waiting FOR 'rdbms ipc message' wait
18:
19:
20: waiting FOR 'rdbms ipc message' wait
21: waiting FOR 'log file sync' wait
22: waiting FOR 'ASM background timer' wait
23: waiting FOR 'rdbms ipc message' wait
24: waiting FOR 'log file sync' wait
25: waiting FOR 'class slave wait' wait
26: waiting FOR 'SQL*Net message from client' wait
27: waiting FOR 'SQL*Net message from client' wait
28: waiting FOR 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
29: waiting FOR 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
30: waiting FOR 'Streams AQ: qmn coordinator idle wait' wait
31: waiting FOR 'SQL*Net message from client' wait
32: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
33: waiting FOR 'log file sync' wait
34: waiting FOR 'Streams AQ: qmn slave idle wait' wait
35: FOR 'Streams AQ: waiting for time management or cleanup tasks' wait
36: waiting FOR 'log file sync' wait
37: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
38: waiting FOR 'SQL*Net message from client' wait
39: waiting FOR 'SQL*Net message from client' wait
40: waiting FOR 'PX Deq: Execution Msg' wait
41: waiting FOR 'log file sync' wait
42: waiting FOR 'log file sync' wait
43: waiting FOR 'enq: CF - contention' [Enqueue CF-00000000-00000000] wait
Cmd: SELECT
44: waiting FOR 'SQL*Net message from client' wait
45: waiting FOR 'SQL*Net message from client' wait
46: waiting FOR 'SQL*Net message from client' wait
47: waiting FOR 'SQL*Net message from client' wait
48: waiting FOR 'SQL*Net message from client' wait
49: waiting FOR 'log file sync' wait
50: waiting FOR 'SQL*Net message from client' wait
51: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
52: waiting FOR 'row cache lock' [Rcache object=0xa5414098,] wait
Cmd: SELECT
53: waiting FOR 'SQL*Net message from client' wait
54: waiting FOR 'SQL*Net message from client' wait
55: FOR 'Streams AQ: waiting for messages in the queue' wait
56: waiting FOR 'SQL*Net message from client' wait
57: waiting FOR 'SQL*Net message from client' wait
58: waiting FOR 'log file sync' wait
Cmd: INSERT
59: waiting FOR 'jobq slave wait' wait
60: waiting FOR 'jobq slave wait' wait
61: waiting FOR 'SQL*Net message from client' wait
62: waiting FOR 'SQL*Net message from client' wait
63: waiting FOR 'SQL*Net message from client' wait
65: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
66: waiting FOR 'SQL*Net message from client' wait
67: LAST wait FOR 'SQL*Net message from client'
68: waiting FOR 'SQL*Net message from client' wait
69: waiting FOR 'SQL*Net message from client' wait
70: waiting FOR 'SQL*Net message from client' wait
71: waiting FOR 'SQL*Net message from client' wait
72: waiting FOR 'SQL*Net message from client' wait
73: waiting FOR 'SQL*Net message from client' wait
74: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
75: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
76: waiting FOR 'SQL*Net message from client' wait
77: waiting FOR 'log file sync' wait
78: waiting FOR 'SQL*Net message from client' wait
80: waiting FOR 'SQL*Net message from client' wait
Cmd: SELECT
82: waiting FOR 'SQL*Net message from client' wait
84: waiting FOR 'log file sync' wait
86: waiting FOR 'SQL*Net message from client' wait
93: waiting FOR 'SQL*Net message from client' wait
94: waiting FOR 'SQL*Net message from client' wait
Blockers
~~~~~~~~
Above IS a list OF ALL the processes. IF they are waiting FOR a resource
THEN it will be given IN square brackets. Below IS a summary OF the
waited upon resources, together WITH the holder OF that resource.
Notes:
~~~~~
o A process id OF '???' implies that the holder was NOT found IN the
systemstate.
Resource Holder State
Enqueue CF-00000000-00000000 12: waiting FOR 'rdbms ipc message'
Rcache object=0xa5414098, ??? Blocker
Object Names
~~~~~~~~~~~~
Enqueue CF-00000000-00000000
Rcache object=0xa5414098,

With this information, you can investigate further into the System state dump by looking at rcache object 0xa5414098. The “State Objects” are a hierarchical structure. Every state object has an adress “SO: 0xa6a5955c” and an owner, which is the parent entry. In this case, you can construct a hierarchical tree of state objects and find dependencies.

 SO: 0xa6a5995c, TYPE: 2, owner: (nil), flag: INIT/-/-/0x00
 (process) Oracle pid=52, calls cur/top: 0x9d62eb1c/0x9d5ccfcc, flag: (0) -
		   INT error: 0, CALL error: 0, sess error: 0, txn error 0
 (post info) LAST post received: 0 0 163
			 LAST post received-location: kqrbtm
			 LAST process TO post me: a6a4e1dc 1 6
			 LAST post sent: 0 0 24
			 LAST post sent-location: ksasnd
			 LAST process posted BY me: a6a4e1dc 1 6
   (latch info) wait_event=0 bits=0
	Process GROUP: DEFAULT, pseudo proc: 0xa6baf79c
	O/S info: USER: oracle, term: UNKNOWN, ospid: 29032
    OSD pid info: Unix process pid: 29032, image: oracle@ora-vm1
    Short stack dump: ksdxfstk()+19<-ksdxcb()+1321<-sspuser()+102<-__kernel_vsyscall()+16
 
    SO: 0xa4123d20, TYPE: 4, owner: 0xa6a5995c, flag: INIT/-/-/0x00
    (SESSION) sid: 955 trans: (nil), creator: 0xa6a5995c, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-0034-00002227, short-term DID: 0001-0034-00002228
              txn branch: (nil)
              oct: 3, prv: 0, SQL: 0x8fee9d44, psql: 0x8fee9f00, USER: 23/DBSNMP
     service name: SYS$USERS
     O/S info: USER: oracle, term: , ospid: 28849, machine: ora-vm1
               program: emagent@ora-vm1(TNS V1-V3)
     application name: emagent@ora-vm1(TNS V1-V3), hash VALUE=2420928497
     waiting FOR 'row cache lock' blocking sess=0x(nil) seq=127 wait_time=0 seconds since wait started=0
                 cache id=d, mode=0, request=5
     Dumping SESSION Wait History
      FOR 'row cache lock' COUNT=1 wait_time=1337230
		         cache id=d, mode=0, request=5
      FOR 'row cache lock' COUNT=1 wait_time=2930561
                 cache id=d, mode=0, request=5
      FOR 'row cache lock' COUNT=1 wait_time=2929106
                 cache id=d, mode=0, request=5
      FOR 'row cache lock' COUNT=1 wait_time=2933326
 
      SO: 0x9d5ccfcc, TYPE: 3, owner: 0xa6a5995c, flag: INIT/-/-/0x00
     (CALL) sess: cur a4123d20, rec a4123d20, usr a4123d20; depth: 0
       ----------------------------------------
       SO: 0x9d51b1a0, TYPE: 3, owner: 0x9d5ccfcc, flag: INIT/-/-/0x00
       (CALL) sess: cur a4123d20, rec a417f990, usr a4123d20; depth: 1
         ----------------------------------------
         SO: 0x9d69d7f0, TYPE: 3, owner: 0x9d51b1a0, flag: INIT/-/-/0x00
         (CALL) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 2
         ----------------------------------------
           SO: 0x9d5e547c, TYPE: 3, owner: 0x9d69d7f0, flag: INIT/-/-/0x00
           (CALL) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 3
           ----------------------------------------
             SO: 0x9d62eb1c, TYPE: 3, owner: 0x9d5e547c, flag: INIT/-/-/0x00
             (CALL) sess: cur a417f990, rec a417f990, usr a4123d20; depth: 4
             ----------------------------------------
               SO: 0x9fedf748, TYPE: 40, owner: 0x9d62eb1c, flag: INIT/-/-/0x00
               (trans) flg = 0x21, flg2 = 0x80000, prx = 0x0, ros = 2147483647 bsn = 0x395 bndsn = 0x395 spn = 0x395
               efd = 34
               parent xid: 0x0000.000.00000000
               env: (scn: 0x0000.00000000 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: xid:
               0x0000.000.00000000 scn: 0x0000.00000000 0sch: scn: 0x0000.00000000)
               cev: (spc = 0 arsp = 0 ubk tsn: 0 rdba: 0x00000000 useg tsn: 0 rdba: 0x00000000
                     hwm uba: 0x00000000.0000.00 col uba: 0x00000000.0000.00
                     num bl: 0 bk list: 0x0)
                     cr opc: 0x0 spc: 0 uba: 0x00000000.0000.00
               (enqueue) TX-00010011-00D3B4CF DID: 0001-0051-00001323
               lv: 7c ac ad 4d 03 00 00 00 00 00 00 00 00 00 00 00 res_flag: 0x45
               res: 0xa447bb2c, lock_flag: 0x0
               own: 0xa4157e90, sess: 0xa4157e90, prv: 0x9fedf774
               slk: (nil)
                xga: 0x0, heap: UGA
               Trans IMU st: 0 Pool INDEX 65535, Redo pool 0x9fedfb74, Undo pool 0x9fedfbf0
               Redo pool range [0x40ed57e8 0x40ed57e8 0x40ed7fe8]
               Undo pool range [0x40ed2fe8 0x40ed2fe8 0x40ed57e8]
               ----------------------------------------
                  SO: 0x9d42237c, TYPE: 50, owner: 0x9fedf748, flag: INIT/-/-/0x00
                  ROW cache enqueue: COUNT=1 SESSION=0xa4123d20 object=0xa5414098, request=X
                  SAVEPOINT=0x395
                  ROW cache parent object: address=0xa5414098 cid=13(dc_sequences)
                  hash=9649e7e typ=9 TRANSACTION=(nil) flags=00000000
                  own=0xa5414104[0xa5414104,0xa5414104] wat=0xa541410c[0x9d422398,0x9d422398] mode=N
                  STATUS=-/-/-/-/-/-/-/-/-
                  request=X release=FALSE flags=2
                  instance LOCK id=QN 010cdbb3 b3b24847

Tanel Poder showed in his seminar to use “grep -A 3 SO:” to find all lines containing the State Object Adresses and the following 3 lines to get the tree.