
Multipathing Configuration issue waiting to happen

Quite some time ago, during a consulting engagement, I came across an issue that was hard to track down and that I find worth mentioning. A two-node RAC cluster running on RHEL4 x86-64 was relocated to a different data center. Apart from making sure that the switch ports and Fibre Channel ports are available at the new location, there should not be much to worry about with such a move.

After the relocation, the multipathing configuration, implemented with device-mapper-multipath, would not work on one node. The command “multipath -ll” simply returned no output. After more than an hour, we pinned the issue down to this error message in the verbose output:

# multipath -v 3
#
# all paths in cache :
#



path sdh not found in pathvec

When we checked what sdh was, we realized that it was a USB disk presented by a KVM device, which the sysadmins had plugged in.

May 27 14:46:26 host1 kernel: Attached scsi removable disk sdh at scsi10, channel 0, id 0, lun 0
May 27 14:46:26 host1 kernel: Type: Direct-Access ANSI SCSI revision: 02
May 27 14:46:26 host1 kernel: Vendor: KVM Model: vmDisk Rev: 0.01
May 27 14:46:26 host1 kernel: scsi10 : SCSI emulation for USB Mass Storage devices
May 27 14:46:26 host1 kernel: sr1: scsi3-mmc drive: 0x/0x caddy
May 27 14:46:26 host1 kernel: Type: CD-ROM ANSI SCSI revision: 02
May 27 14:46:26 host1 kernel: Vendor: KVM Model: vmDisk-CD Rev: 0.01

BTW: What is a KVM device?
Wikipedia states: A KVM switch (with KVM being an abbreviation for Keyboard, Video or Visual Display Unit, Mouse) is a hardware device that allows a user to control multiple computers from a single keyboard, video monitor and mouse.

We then added the device sdh to the multipath blacklist section in /etc/multipath.conf, and the problem was solved:

devnode_blacklist {
    devnode "^sdh$"
}
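To avoid hunting for such devices by hand, the kernel log can be scanned for removable disks and a matching blacklist section generated. The following is only a minimal sketch: the function names are made up for illustration, and the regular expression assumes the exact RHEL4-era log wording shown above. Keep in mind that sd names depend on discovery order, so an entry such as ^sdh$ may point at a different device after a reboot.

```python
import re

# Assumption: the kernel message has the wording seen in the log excerpt above.
ATTACHED_REMOVABLE = re.compile(r"Attached scsi removable disk (sd[a-z]+) at ")


def removable_disks(log_lines):
    """Return the sorted, de-duplicated sd names reported as removable disks."""
    found = set()
    for line in log_lines:
        match = ATTACHED_REMOVABLE.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


def blacklist_stanza(devices):
    """Render a devnode_blacklist section with one anchored regex per device."""
    body = "".join(f'    devnode "^{dev}$"\n' for dev in devices)
    return "devnode_blacklist {\n" + body + "}\n"


if __name__ == "__main__":
    sample = [
        "May 27 14:46:26 host1 kernel: Attached scsi removable disk sdh "
        "at scsi10, channel 0, id 0, lun 0",
        "May 27 14:46:26 host1 kernel: Vendor: KVM Model: vmDisk Rev: 0.01",
    ]
    print(blacklist_stanza(removable_disks(sample)), end="")
```

Review the generated section before pasting it into /etc/multipath.conf, since a removable disk is not necessarily one you want to exclude.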



Is your database secure enough? Check out Metasploit …

I came across a short post on Pete Finnigan's Oracle Security Weblog announcing the release of new Metasploit modules for penetration testing of Oracle databases.

What is Metasploit?

Metasploit is a framework that automates the use of all kinds of exploits in order to test the security of a system. Among other things, it includes an Oracle module.

To get an idea of what is possible, watch this: Attacking Oracle with the Metasploit Framework Shmoocon Firetalk Demo Video. In this very impressive 5-minute video, the presenter identifies the Oracle listener version, brute-forces the SID, tries well-known username/password combinations (e.g. scott/tiger), gains access as scott, escalates privileges to DBA, plants a Java class to execute OS commands, and so on. You get the idea.

This is something to watch out for, because it enables script kiddies to attack badly secured databases connected to the internet, and well-trained rogue employees to attack internal databases that lack the critical patch updates for well-known security vulnerabilities.

A Reuters report about this new release can be found here.

Update 2009-08-13: The Metasploit developer has uploaded new demo videos showing how to hack an Oracle database with Metasploit.



Oracle introduces Patch Set Updates (PSU) for 10.2.0.4 Database

On July 14th Oracle announced on MetaLink the release of a new patching strategy for the Oracle Database.

The new Patch Set Updates (PSU) are cumulative patches that contain recommended bugfixes. They are released on the same quarterly schedule as the Critical Patch Update (CPU), so the release months are January, April, July and October. The PSU level is reflected in the fifth digit of the release version: 10.2.0.4.1 is the first Patch Set Update (PSU), 10.2.0.4.2 the second, and so on.

As described in the release information, the 10.2.0.4.1 PSU (Patch 8576156) contains all the recommended patch bundles up to July 2009 (Generic, CRS, RAC, Services, DataGuard) as well as the Critical Patch Update July 2009. In addition, 5 further critical bugfixes are included. OPatch version 10.2.0.4.7 is required to install PSU 10.2.0.4.1, and the PSU can be installed in a rolling fashion on RAC environments without downtime.

Later PSUs can be installed either on the base release or on top of any previous PSU. For example, PSU 10.2.0.4.3 can be installed on top of base release 10.2.0.4.0, PSU 10.2.0.4.1 or PSU 10.2.0.4.2.
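The version rule above can be written down as a small check. This is only a sketch of the rule as stated in this post (same four-digit base, higher fifth digit); the function names are made up for illustration, and the real applicability of a patch is decided by OPatch and the patch README, not by this logic.

```python
def parse_version(text):
    """Split a five-part version such as '10.2.0.4.1' into a tuple of ints."""
    parts = tuple(int(part) for part in text.split("."))
    if len(parts) != 5:
        raise ValueError(f"expected five version components, got {text!r}")
    return parts


def can_install_psu(installed, psu):
    """True if psu shares the base release of installed and has a higher PSU level."""
    current = parse_version(installed)
    target = parse_version(psu)
    same_base = current[:4] == target[:4]
    return same_base and target[4] > current[4]


if __name__ == "__main__":
    for installed in ("10.2.0.4.0", "10.2.0.4.1", "10.2.0.4.2"):
        print(installed, "->", "10.2.0.4.3", can_install_psu(installed, "10.2.0.4.3"))
```

Comparing the components as integers rather than as strings keeps the check correct once a component reaches two digits.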

The customer can choose between installing security patches only, by applying the quarterly Critical Patch Update, or installing security plus non-security bugfixes, by applying the Patch Set Update (PSU). As PSU 10.2.0.4.1 already contains the Critical Patch Update July 2009, the documentation recommends installing future security patches through PSUs rather than CPUs.

Further information can be found in these MetaLink Notes:

854428.1 – Intro to Patch Set Updates (PSU)
850471.1 – Oracle Announces First Patch Set Update For Oracle Database Release 10.2
8576156.8 – Bug 8576156 10.2.0.4.1 Patch Set Update (PSU)
854473.1 – Known Issues with this Patch Set Update 10.2.0.4.1



Book review: HOWTO Secure and Audit Oracle 10g and 11g

I have added a new book review to my bookshelf: HOWTO Secure and Audit Oracle 10g and 11g – Ron Ben Natan



Book Review: Expert Oracle JDBC – J.R. Menon

I have added a new book review to my bookshelf: Expert Oracle JDBC – J.R. Menon.



Reclaimable Space Report – Segment Advisor

Today I tried to get a nice, clean report of objects with reclaimable space from Segment Advisor. Displaying the list in Enterprise Manager (Grid Control or DB Control) is no problem, but it is not so easy in SQL*Plus.

This is what I ended up with:

-- Segment Advisor findings for tables in one tablespace, largest savings first
SELECT segment_owner,
       segment_name,
       ROUND(allocated_space/1024/1024)   AS alloc_mb,
       ROUND(used_space/1024/1024)        AS used_mb,
       ROUND(reclaimable_space/1024/1024) AS reclaim_mb,
       (1 - ROUND(used_space/allocated_space, 2)) * 100 AS reclaim_pct
  FROM TABLE(dbms_space.asa_recommendations('TRUE', 'TRUE', 'FALSE'))
 WHERE tablespace_name IN ('TS_DATA')
   AND segment_type = 'TABLE'
   AND segment_owner LIKE '%'   -- narrow down to a schema if needed
   AND segment_name LIKE '%'    -- narrow down to a table name pattern if needed
   -- at least 1,000,000 bytes reclaimable, or more than 30% of the allocation unused
   AND (reclaimable_space >= 1000000
        OR (1 - ROUND(used_space/allocated_space, 2)) * 100 > 30)
 ORDER BY reclaimable_space DESC;
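
To make the filter condition easier to follow, here is the same arithmetic as a small Python sketch. The function names and sample figures are made up for illustration. Python rounds binary floats slightly differently from Oracle's NUMBER type, so values sitting exactly on a rounding boundary may come out differently than in the database.

```python
def reclaim_pct(allocated_space, used_space):
    """Mirror (1 - ROUND(used_space/allocated_space, 2)) * 100 from the query."""
    if allocated_space <= 0:
        # The SQL version would fail with a division by zero here.
        raise ValueError("allocated_space must be positive")
    return (1 - round(used_space / allocated_space, 2)) * 100


def is_reported(allocated_space, used_space, reclaimable_space,
                min_bytes=1_000_000, min_pct=30):
    """Apply the query's filter: enough bytes reclaimable OR enough space unused."""
    return (reclaimable_space >= min_bytes
            or reclaim_pct(allocated_space, used_space) > min_pct)


if __name__ == "__main__":
    MB = 1024 * 1024
    # Small table, half empty: reported because of the percentage.
    print(is_reported(10 * MB, 5 * MB, 0))
    # Large table, nearly full, but 2,000,000 bytes reclaimable: reported.
    print(is_reported(1000 * MB, 990 * MB, 2_000_000))
    # Small table, 90% full, little to reclaim: filtered out.
    print(is_reported(10 * MB, 9 * MB, 500_000))
```

The two conditions complement each other: the byte threshold catches large segments with a small unused percentage, while the percentage threshold catches small but sparsely filled segments.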


Default 10gR2 RAC TNS Configuration can cause TCP Timeouts for application

A default RAC installation normally does not set the “local_listener” init.ora parameter. If the listener is running on port 1521, the database does not need the parameter in order to find and register with the local TNS listener process. However, if you have *not* set local_listener, the database registers with the listener using the physical IP address instead of the virtual IP (VIP) address.

You can check whether this is the case by looking at the “lsnrctl serv” output on your RAC nodes:

Service "S_APP" has 1 instance(s).
  Instance "MDDB1", status READY, has 2 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=ora-vm1.intra)(PORT=1521))

Instead of ora-vm1.intra, this should be ora-vm1-vip.intra.
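
With many services and nodes, this check is tedious to do by eye. The following sketch scans captured “lsnrctl serv” output for handler addresses that do not look like a VIP. It assumes that VIP hostnames contain “-vip”, as they do in this example; the function name and the vip_marker parameter are made up for illustration, so adapt them to your own naming convention.

```python
import re

# Patterns for the two kinds of line we care about in the listener output.
INSTANCE_PATTERN = re.compile(r'Instance "([^"]+)"')
HOST_PATTERN = re.compile(r"\(HOST\s*=\s*([^)\s]+)\s*\)")


def non_vip_handlers(lsnrctl_output, vip_marker="-vip"):
    """Return (instance, host) pairs whose handler address lacks the VIP marker."""
    suspicious = []
    current_instance = None
    for line in lsnrctl_output.splitlines():
        instance = INSTANCE_PATTERN.search(line)
        if instance:
            current_instance = instance.group(1)
        host = HOST_PATTERN.search(line)
        if host and vip_marker not in host.group(1):
            suspicious.append((current_instance, host.group(1)))
    return suspicious


if __name__ == "__main__":
    sample = '''Service "S_APP" has 1 instance(s).
  Instance "MDDB1", status READY, has 2 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=ora-vm1.intra)(PORT=1521))
'''
    for instance, host in non_vip_handlers(sample):
        print(f"{instance} registered with non-VIP address {host}")
```

The output of “lsnrctl serv” can be saved to a file on each node and passed to the function as a string.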

Why should I care?

If you use the default configuration, you are using the “REMOTE_LISTENER” parameter and therefore server-side connect-time load balancing. This means that the listeners on all nodes receive load information from all instances on all nodes and can redirect connections to the least loaded instance, even if that instance runs on another node. But the address they then send back to the client contains the physical IP address instead of the virtual one.

If such a node crashes, suffers a kernel panic, etc., its physical IP address simply stops responding, and the client has to wait for the TCP timeout before it detects the failure. A virtual IP address, by contrast, fails over to a surviving node, so the connection attempt is refused right away.

Solution

tnsnames.ora:

LISTENER_MDDB1 =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora-vm1-vip.intra)(PORT = 1521))
  )

LISTENER_MDDB2 =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora-vm2-vip.intra)(PORT = 1521))
  )


init.ora parameter, set with ALTER SYSTEM:
alter system set local_listener = 'LISTENER_MDDB1' sid='MDDB1';
alter system set local_listener = 'LISTENER_MDDB2' sid='MDDB2';



Your experience with RAC Dynamic Remastering (DRM) in 10gR2?

One of my customers is experiencing severe RAC performance issues, which have occurred a dozen times so far. Each time, the performance impact lasted around 10 minutes and basically caused the application to hang. ASH investigation revealed that the time frame of each incident exactly matches a DRM operation on the biggest segment of the database. During the problematic period, there are 20-50 active sessions instead of 5-10, and they are mostly waiting for gc-related events: “gc buffer busy”, “gc cr block busy”, “gc cr block 2-way”, “gc current block 2-way”, “gc current request”, “gc current grant busy”, etc.

In addition, there is one single session with wait event “kjbdrmcvtq lmon drm quiesce: ping completion” (on instance 1) and 1-3 sessions with wait event “gc remaster” (on instance 2). The cur_obj# of the sessions waiting on “gc remaster” points to the segment being remastered.

Does anybody have any experience with DRM problems with 10.2.0.4 on Linux Itanium?

I know that it is possible to deactivate DRM, but usually it should be beneficial to have it enabled. I could not find any reports of performance impact during DRM operations on MetaLink. Support is involved but clueless so far.

Regards,
Martin

http://forums.oracle.com/forums/message.jspa?messageID=3447436#3447436



RAC Deadlock Detection not working on 10.2.0.4 / Linux Itanium

We recently experienced a major issue when the production database hung. It turned out that the database had deadlocked, but the Global Enqueue Service (GES) did not detect the deadlock, so all the sessions kept waiting on “enq: TX - row lock contention”.

We were able to provide a reproducible test case, and it turned out to be bug 7014855. The bug is specific to the Linux Itanium port, and a patch is available.



Oracle Enterprise Manager – Grid Control 10gR5 (10.2.0.5)

A while ago, I attended a one-day seminar held by DOAG, the German Oracle User Group, which covered Oracle Enterprise Manager – Grid Control 10gR5.

There were presentations about:

  • Bare-Metal Provisioning: Provisioning of Linux on top of new hardware
  • Provisioning of RAC Nodes (Clusterware, ASM, RAC)
  • Provisioning of RAC Databases
  • Command Line Interface EMCLI
  • Managing Oracle VM with Oracle Enterprise Manager
  • OEM Grid Control 10gR5 – New Features

The presentation on the new features is available (although in German) here. The other presentations are available to DOAG members on the DOAG site. Moreover, one of the presenters is going to set up a website for OEM topics at insight-oem.com.