Linux Itanium

Multipathing Configuration issue waiting to happen

Quite some time ago, I came across a hard-to-find issue during a consulting engagement which I find worth mentioning. A 2-node RAC cluster running on RHEL4 x86-64 was relocated to a different data center. Apart from making sure that the switch ports and Fibre Channel ports are available at the new location, there is not much to worry about.

After the relocation, the multipathing configuration, implemented with device-mapper-multipath, would not work on one node. The command "multipath -ll" would just not return any output. After more than an hour, we pinned the issue down to this error message:

# multipath -v 3
#
# all paths in cache :
#
path sdh not found in pathvec

When we checked what device sdh was, we realized that it was a KVM device plugged in by the sysadmins.

May 27 14:46:26 host1 kernel: Attached scsi removable disk sdh at scsi10, channel 0, id 0, lun 0
May 27 14:46:26 host1 kernel: Type: Direct-Access ANSI SCSI revision: 02
May 27 14:46:26 host1 kernel: Vendor: KVM Model: vmDisk Rev: 0.01
May 27 14:46:26 host1 kernel: scsi10 : SCSI emulation for USB Mass Storage devices
May 27 14:46:26 host1 kernel: sr1: scsi3-mmc drive: 0x/0x caddy
May 27 14:46:26 host1 kernel: Type: CD-ROM ANSI SCSI revision: 02
May 27 14:46:26 host1 kernel: Vendor: KVM Model: vmDisk-CD Rev: 0.01
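Had the kernel log not been available, the same vendor/model information could also have been read from sysfs (paths as on RHEL4's 2.6.9 kernel; the output is reconstructed from the log above):

# cat /sys/block/sdh/device/vendor
KVM
# cat /sys/block/sdh/device/model
vmDisk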

BTW: What is a KVM device?
Wikipedia states: A KVM switch (with KVM being an abbreviation for Keyboard, Video or Visual Display Unit, Mouse) is a hardware device that allows a user to control multiple computers from a single keyboard, video monitor and mouse.

We then added the device sdh to the multipath blacklist section in /etc/multipath.conf, and the problem was solved:

devnode_blacklist {
    devnode "^sdh$"
}
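After changing /etc/multipath.conf, the configuration has to be re-read. A minimal sketch of the commands we would use on RHEL4 (flush first, then rebuild and verify):

# flush the existing multipath maps
multipath -F
# rebuild the maps from the new configuration
multipath -v2
# restart the daemon so it also honors the blacklist
service multipathd restart
# verify: sdh must no longer show up as a path
multipath -ll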



Your experience with RAC Dynamic Remastering (DRM) in 10gR2?

One of my customers is having severe RAC performance issues, which have appeared about a dozen times so far. Each time, the performance impact lasted around 10 minutes and essentially hung the application. ASH investigation revealed that the time frame of the performance issues exactly matches a DRM operation on the biggest segment of the database. During the problematic time period, there are 20-50 instead of 5-10 active sessions, and they are mostly waiting for gc-related events: "gc buffer busy", "gc cr block busy", "gc cr block 2-way", "gc current block 2-way", "gc current request", "gc current grant busy", etc.

In addition, there is a single session on instance 1 with the wait event "kjbdrmcvtq lmon drm quiesce: ping completion" and 1-3 sessions on instance 2 with the wait event "gc remaster". The cur_obj# of the session waiting on "gc remaster" points to the segment being remastered.

Does anybody have any experience with DRM problems with 10.2.0.4 on Linux Itanium?

I know that it is possible to deactivate DRM, but usually it should be beneficial to have it enabled. I could not find any reports of performance impact during DRM operations on MetaLink. Support is involved but clueless so far.
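For reference, the commonly cited way to deactivate DRM in 10gR2 is via two hidden parameters; this is a sketch only, and underscore parameters should of course not be set without Oracle Support's blessing:

# init.ora / spfile entries commonly quoted to disable DRM in 10gR2
# (hidden parameters; confirm with Oracle Support first)
_gc_affinity_time=0
_gc_undo_affinity=FALSE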

Regards,
Martin

http://forums.oracle.com/forums/message.jspa?messageID=3447436#3447436



RAC Deadlock Detection not working on 10.2.0.4 / Linux Itanium

We recently experienced a serious issue where the production database hung. It turned out that the database had deadlocked, but the GES (Global Enqueue Service) did not detect the deadlock situation, so all the sessions kept waiting on "enq: TX - row lock contention".

We could provide a reproducible testcase, and it turned out to be Bug 7014855. The bug is specific to the Linux Itanium port, and a patch is available.
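For illustration, a deadlock of the classic cross-update kind, which a healthy GES should resolve with an ORA-00060 within seconds (table and column names are made up):

-- setup
create table t (id number primary key, val number);
insert into t values (1, 0);
insert into t values (2, 0);
commit;

-- session 1 (instance 1)
update t set val = 1 where id = 1;
-- session 2 (instance 2)
update t set val = 2 where id = 2;
-- session 1: blocks, waiting for session 2's TX lock
update t set val = 1 where id = 2;
-- session 2: closes the cycle; the GES should now raise ORA-00060
update t set val = 2 where id = 1;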



Session waiting for "enq: RO - fast object reuse" - DBWR process spinning on CPU

I have encountered the following problem on a 10.2.0.4 database on Linux x86_64 today:
A user session has been waiting for "enq: RO - fast object reuse" for almost 60 minutes while executing a "truncate table" SQL statement.

SQL> select username, event, sql_id, taddr, last_call_et from v$session where sid = 234;

USERNAME   EVENT                        SQL_ID        TADDR            LAST_CALL_ET
---------- ---------------------------- ------------- ---------------- ------------
MD         enq: RO - fast object reuse  ljk299jlkj003 0000000153264570         3542

SQL> select sql_text from v$sqlstats where sql_id = 'ljk299jlkj003';

SQL_TEXT
————————————-
truncate table tab1

The Session was blocked by the CKPT process:

SQL> select * from dba_waiters;

WAITING_SESSION HOLDING_SESSION LOCK_TYPE MODE_HELD  MODE_REQUESTED LOCK_ID1 LOCK_ID2
--------------- --------------- --------- ---------- -------------- -------- --------
            234             423 RO        Row-S (SS) Exclusive         65573        1

SQL> select sid, serial#, sql_id, last_call_et, machine, program, username from v$session where sid = 423;

SID SERIAL# SQL_ID        LAST_CALL_ET MACHINE       PROGRAM
--- ------- ------------- ------------ ------------- --------------------------------
423       1                    4133636 ora-vm1.intra oracle@ora-vm1.intra (CKPT)

The checkpoint process was in turn waiting for the database writer (DBWR) process, which was spinning on one CPU:

top

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10712 oracle 25 0 2201m 1.7g 1.7g R 99.5 21.7 108:18.03 oracle

PID 10712 maps to DBW0:

[oracle@ora-vm1 ]$ ps -ef|grep 10712
oracle 10712 1 0 2008 ? 03:23:05 ora_dbw0_MDDB01
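The same mapping can also be done from within the database via v$process, where SPID is the OS process id (output reconstructed):

SQL> select spid, program from v$process where spid = '10712';

SPID         PROGRAM
------------ --------------------------------
10712        oracle@ora-vm1.intra (DBW0)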

mpstat

Linux 2.6.9-78.ELsmp (ora-vm1.intra) 01/20/2009

02:21:56 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s
02:21:57 PM all 49.75 0.00 0.00 0.00 0.00 0.00 50.25 1055.00
02:21:57 PM 0 0.00 0.00 0.00 0.00 0.00 0.00 100.00 1006.00
02:21:57 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 49.00

02:21:57 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s
02:21:58 PM all 50.75 0.00 0.00 0.50 0.00 0.00 48.76 1161.00
02:21:58 PM 0 1.00 0.00 0.00 1.00 0.00 0.00 98.00 1087.00
02:21:58 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 74.00

Repeated pstack samples of dbw0 during that time showed these signatures:

[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x000000000074b7fb in kslfre ()
#1 0x00000000010ccc3b in kcbo_exam_buf ()
#2 0x00000000010d0d62 in kcbo_service_ockpt ()
#3 0x0000000001080cd7 in kcbbdrv ()
#4 0x00000000007ddcc2 in ksbabs ()
#5 0x00000000007e4b32 in ksbrdp ()
#6 0x0000000002efcb50 in opirip ()
#7 0x00000000012da326 in opidrv ()
#8 0x0000000001e62456 in sou2o ()
#9 0x00000000006d2555 in opimai_real ()
#10 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x000000000074b36d in kslfre ()
#1 0x00000000010cc203 in kcbo_write_process ()
#2 0x00000000010ce608 in kcbo_write_q ()
#3 0x0000000001080a6d in kcbbdrv ()
#4 0x00000000007ddcc2 in ksbabs ()
#5 0x00000000007e4b32 in ksbrdp ()
#6 0x0000000002efcb50 in opirip ()
#7 0x00000000012da326 in opidrv ()
#8 0x0000000001e62456 in sou2o ()
#9 0x00000000006d2555 in opimai_real ()
#10 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x00000000010ccb60 in kcbo_exam_buf ()
#1 0x00000000010d0d62 in kcbo_service_ockpt ()
#2 0x0000000001080cd7 in kcbbdrv ()
#3 0x00000000007ddcc2 in ksbabs ()
#4 0x00000000007e4b32 in ksbrdp ()
#5 0x0000000002efcb50 in opirip ()
#6 0x00000000012da326 in opidrv ()
#7 0x0000000001e62456 in sou2o ()
#8 0x00000000006d2555 in opimai_real ()
#9 0x00000000006d240c in main ()
[oracle@ora-vm1 oracle]$ pstack 10712
#0 0x00000000010d0da5 in kcbo_service_ockpt ()
#1 0x0000000001080cd7 in kcbbdrv ()
#2 0x00000000007ddcc2 in ksbabs ()
#3 0x00000000007e4b32 in ksbrdp ()
#4 0x0000000002efcb50 in opirip ()
#5 0x00000000012da326 in opidrv ()
#6 0x0000000001e62456 in sou2o ()
#7 0x00000000006d2555 in opimai_real ()
#8 0x00000000006d240c in main ()

A MetaLink search for the term "kcbo_service_ockpt" leads to Bug 7376934, which is a duplicate of Bug 7385253 - DBWR IS CONSUMING HIGH CPU.

Patch 7385253 is available for Linux x86_64, HP-UX, Solaris, AIX.
Reference:
MetaLink Note 762085.1 - Subject: 'enq: RO - fast object reuse' contention when gathering schema/table statistics in parallel



Hugepages revisited II: Be aware of kernel bugs!

It is well known that hugepages can reduce the operating system's overhead of managing the memory pages of the Oracle SGA, thus leading to lower system CPU utilization. I have written two blog entries regarding this topic already: Listener Coredumps on heavy load system and Hugepages revisited.

However, there is a potential risk with it: Certain kernels / platforms have bugs regarding hugepages which can lead to problems:

  • Bug 131295 – Hugepages configured on kernel boot line causes x86_64 kernel boot to fail with OOM: Fixed in RHEL3: kernel-2.4.21-40.EL
  • Bug 248954 – Oracle ASM DBWR process goes into 100% CPU spin when using hugepages on ia64 (Fixed in kernel-2.6.9-78.EL.ia64.rpm available as update for RHEL4U7)
  • RHSA-2008:1017-14: on the Itanium® architecture, setting the "vm.nr_hugepages" sysctl parameter caused a kernel stack overflow resulting in a kernel panic, and possibly stack corruption. With this fix, setting vm.nr_hugepages works correctly. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm
  • RHSA-2008:1017-14: hugepages allow the Linux kernel to utilize the multiple page size capabilities of modern hardware architectures. In certain configurations, systems with large amounts of memory could fail to allocate most of this memory for hugepages even if it was free. This could result, for example, in database restart failures. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm

Therefore, before enabling hugepages, I recommend checking your OS vendor's bug database, testing on a test system, and applying recent OS updates first.
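A quick way to check whether hugepages are configured and actually used (note that the hugepage size is platform dependent, e.g. larger on IA64 than the 2048 kB typical for x86_64):

# configured, free and reserved hugepages plus the page size
grep Huge /proc/meminfo
# the corresponding kernel parameter
sysctl vm.nr_hugepages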



NUMA enabled in 10.2.0.4

When upgrading from a pre-10.2.0.4 release to 10.2.0.4, Oracle enables NUMA support. This has the effect that there can be multiple shared memory segments (MetaLink Note 429872.1), although shmmax/shmall are set to high values.
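The multiple segments are easy to spot with ipcs; with NUMA enabled, a single SGA can show up as several shared memory segments instead of one (keys and sizes below are illustrative):

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes       nattch
0xb1aa1b58 32768      oracle     640        1073741824  25
0xb1aa1b59 65537      oracle     640        1073741824  25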

I have read MetaLink Notes (7171446.8, 6730567.8, 6689903.8) and this blog entry, where a customer had problems on HP-UX with the default NUMA settings.

Worse still, it can also lead to instance crashes in 10.2.0.4, as reported in MetaLink Note 743191.1. The good news is that a patch is available for Linux x86_64 / 10.2.0.4.
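If the NUMA optimization turns out to be problematic, the commonly cited way to revert to the pre-10.2.0.4 behaviour is another hidden parameter; again, such a change should be confirmed with Oracle Support first:

-- disable Oracle's NUMA optimization (hidden parameter, instance restart required)
alter system set "_enable_NUMA_optimization" = FALSE scope=spfile;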

I have asked Oracle Support whether it is safe to leave NUMA enabled for Linux Itanium, but they would not comment on it. Instead they asked me to check with the OS vendor. Great. ;-(



Installing 10gR2 RAC on Linux Itanium (Montecito)

Recently, I had to install 10gR2 on Linux Itanium (Montecito CPUs) and found out that the Java version that ships with the binaries does not work on this platform. Therefore, you have to download Patch 5390722 and perform the following steps for a RAC installation:

  1. Install Patch 5390722: Install JDK into new 10.2 CRS Home, then install JRE into new 10.2 CRS Home.
  2. Take a tar backup of the CRS Home containing these two components. You will need it.
  3. Install 10.2.0.1 Clusterware by running from 10.2.0.1 binaries: ./runInstaller -jreLoc $CRS_HOME/jre/1.4.2
  4. Install Patch 5390722 with the option CLUSTER_NODES={"node1", "node2", ...}: Install JDK into the new 10.2 RDBMS Home, then install JRE into the new 10.2 RDBMS Home.
  5. Install the 10.2.0.1 RDBMS binaries into the new 10.2 RDBMS Home: ./runInstaller -jreLoc $ORACLE_HOME/jre/1.4.2
  6. If you want to install the 10.2.0.4 patchset, you will have to follow these steps:
    for CRS: ./runInstaller -jreLoc $ORA_CRS_HOME/jdk/jre
    for RDBMS: ./runInstaller -jreLoc $ORACLE_HOME/jdk/jre
  7. After that, you have to repair the JRE, because the 10.2.0.4 patchset has overwritten the patched JRE with the defective version (Bug 7448301); see the verification sketch after this list:
    % cd $ORACLE_HOME/jre
    % rm -rf 1.4.2
    % tar -xvf $ORACLE_HOME/jre/1.4.2-5390722.tar
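To verify that the repaired JRE is actually the patched one, it can be invoked directly; a simple sanity check (the exact version string depends on the patch level):

    % $ORACLE_HOME/jre/1.4.2/bin/java -version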

Sources:

  • Note: 404248.1 – How To Install Oracle CRS And RAC Software On Itanium Servers With Montecito Processors
  • Note: 400227.1 – How To Install Oracle RDBMS Software On Itanium Servers With Montecito Processors
  • Bug 7448301 – Linux Itanium: 10.2.0.4 Patchset for Linux Itanium (Montecito) has wrong Java runtime


Listener Coredumps on heavy load system

Recently, I came across a system that experiences listener crashes (core dumps) every couple of days. The listener logfile shows these errors shortly before the core dump:

29-SEP-2008 03:49:07 * (CONNECT_DATA=(SID=MDDB1)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.1)(PORT=1398)) * establish * MDDB1 * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12571: TNS:packet writer failure
TNS-12560: TNS:protocol adapter error
TNS-00530: Protocol adapter error
Linux IA64 Error: 104: Connection reset by peer

Analysing the core dump with gdb shows that the stack points to malloc() calls, which means that the listener was requesting memory from the OS when it crashed:

gdb /oracle/ora10/bin/tnslsnr core.311
GNU gdb Red Hat Linux (6.3.0.0-1.153.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".

Reading symbols from shared object read from target memory...(no debugging symbols found)...done.
Loaded system supplied DSO at 0xa000000000000000
Core was generated by `/oracle/ora10/bin/tnslsnr listenerv -inherit'.
Program terminated with signal 11, Segmentation fault.

#0 0x20000000027ee220 in malloc_consolidate () from /lib/tls/libc.so.6.1
(gdb) bt
#0 0x20000000027ee220 in malloc_consolidate () from /lib/tls/libc.so.6.1
#1 0x20000000027f0e30 in _int_malloc () from /lib/tls/libc.so.6.1
#2 0x20000000027f4b50 in malloc () from /lib/tls/libc.so.6.1
#3 0x40000000000079f0 in nsglconcrt ()
#4 0x4000000000011a00 in nsglhc ()
#5 0x4000000000019690 in nsglhe ()
#6 0x400000000001b980 in nsglma ()
#7 0x4000000000007770 in main ()
(gdb) quit

After contacting Oracle Support with this stack, they confirmed it to be Bug 6752308, which was closed as a duplicate of Bug 6139856. A patch for 10.2.0.3 is available, and they also recommend implementing hugepages. By the way, there is an interesting article on the effect of utilizing (or not utilizing) hugepages here.

Bug 6139856 - TT11.1VALGRIND: FMR (FREE MEMORY READ/WRITE) IN NSEV.C
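For completeness, a minimal sketch of what implementing hugepages involves; the values are illustrative and must be sized to cover the SGA, and the hugepage size on IA64 differs from the 2 MB pages on x86_64:

# /etc/sysctl.conf: number of hugepages to reserve
vm.nr_hugepages = 1024
# /etc/security/limits.conf: allow the oracle user to lock that memory (values in kB)
oracle soft memlock 2097152
oracle hard memlock 2097152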