Linux

Hugepages revisited II: Be aware of kernel bugs!

It is well known that hugepages reduce the operating system's overhead of managing the memory pages of the Oracle SGA, which lowers system CPU utilization. I have already written two blog entries on this topic: Listener Coredumps on heavy load system and Hugepages revisited.

However, there is a risk: certain kernels and platforms have hugepage-related bugs that can cause problems:

  • Bug 131295 – Hugepages configured on kernel boot line causes x86_64 kernel boot to fail with OOM (Fixed in RHEL3: kernel-2.4.21-40.EL)
  • Bug 248954 – Oracle ASM DBWR process goes into 100% CPU spin when using hugepages on ia64 (Fixed in kernel-2.6.9-78.EL.ia64.rpm available as update for RHEL4U7)
  • RHSA-2008:1017-14: On the Itanium® architecture, setting the “vm.nr_hugepages” sysctl parameter caused a kernel stack overflow resulting in a kernel panic, and possibly stack corruption. With this fix, setting vm.nr_hugepages works correctly. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm
  • RHSA-2008:1017-14: hugepages allow the Linux kernel to utilize the multiple page size capabilities of modern hardware architectures. In certain configurations, systems with large amounts of memory could fail to allocate most of this memory for hugepages even if it was free. This could result, for example, in database restart failures. Fixed with RHEL5 kernel-2.6.18-92.1.22.el5.ia64.rpm

Therefore, before enabling hugepages, I recommend checking your OS vendor's bug database, testing on a test system, and applying recent OS updates first.
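As a rough aid for the check recommended above, here is a minimal Python sketch that compares a kernel release string with the "fixed in" releases quoted in the advisories. The helper names (release_key, has_fix) and the FIXED_IN table are my own illustration, not a vendor tool. The comparison is only meaningful within the same base kernel version, and it is no substitute for reading the vendor errata.

```python
import re

# "Fixed in" releases copied from the advisories listed above.
FIXED_IN = {
    "Bug 248954 (RHEL4, ia64)": "2.6.9-78.EL",
    "RHSA-2008:1017-14 (RHEL5, ia64)": "2.6.18-92.1.22.el5",
}


def release_key(release):
    """Split a release such as '2.6.18-92.1.22.el5' into numeric tuples.

    Returns (base_version, build), e.g. ((2, 6, 18), (92, 1, 22)).
    Non-numeric suffixes such as '.el5' or '.EL' are ignored.
    """
    match = re.match(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    base = tuple(int(part) for part in match.group(1).split("."))
    build = tuple(int(part) for part in match.group(2).split("."))
    return base, build


def has_fix(running, fixed):
    """Return True if 'running' is at least the 'fixed' build.

    Assumption: both releases share the same base version (same
    distribution major release); otherwise a ValueError is raised.
    """
    running_base, running_build = release_key(running)
    fixed_base, fixed_build = release_key(fixed)
    if running_base != fixed_base:
        raise ValueError("different base versions, cannot compare")
    return running_build >= fixed_build
```

On a live system the running release could be taken from platform.release() (or uname -r) and passed to has_fix together with the matching entry from FIXED_IN.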



NUMA enabled in 10.2.0.4

When you upgrade from a pre-10.2.0.4 release to 10.2.0.4, Oracle enables NUMA support. As a result, the instance can allocate multiple shared memory segments (MetaLink Note 429872.1) even though shmmax/shmall are set to high values.
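To see whether an instance really ends up with several segments, the output of ipcs -m can be summarised per owner. The following Python sketch does that on captured text. The function name and the sample output are hypothetical, and it assumes the usual Linux column order (key, shmid, owner, perms, bytes, nattch, status).

```python
from collections import defaultdict

# Hypothetical ipcs -m output, for illustration only.
SAMPLE_IPCS = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 32768      root       644        65536      2
0x00000000 65537      oracle     640        4096       0
0x00000000 98306      oracle     640        2147483648 25
0x7a5c3e10 131075     oracle     640        1073741824 25
"""


def segments_by_owner(ipcs_output):
    """Return {owner: (segment_count, total_bytes)} from ipcs -m text."""
    totals = defaultdict(lambda: [0, 0])
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hexadecimal key; skip headers and blanks.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        owner = fields[2]
        size_bytes = int(fields[4])
        totals[owner][0] += 1
        totals[owner][1] += size_bytes
    return {owner: (count, size) for owner, (count, size) in totals.items()}
```

More than one large segment for the oracle user after the upgrade would be consistent with the behaviour described in the MetaLink note.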

I have read MetaLink Notes (7171446.8, 6730567.8, 6689903.8) and this blog entry, where a customer had problems on HP-UX with the default NUMA settings.

Worse, NUMA support can also lead to instance crashes in 10.2.0.4, as reported in MetaLink Note 743191.1. The good news is that a patch is available for Linux x86_64/10.2.0.4.

I have asked Oracle Support whether it is safe to leave NUMA enabled for Linux Itanium, but they would not comment on it. Instead they asked me to check with the OS vendor. Great. ;-(



Hugepages revisited

A while ago I wrote a post about a specific listener coredump issue that could be solved by 1) installing an Oracle patch and 2) implementing hugepages. I also linked to an article by Pythian Group Oracle expert Riyaj Shamsudeen, who demonstrated the memory page management overhead of big SGAs without hugepages.

Today, after doing some research, I want to document the steps necessary to implement hugepages:

  • Check your /etc/sysctl.conf shmall and shmmax values. I recommend that you set shmmax bigger or
  • Check your current total shared memory segment size, either by summing the bytes column of ipcs -m (depending on your “bc -l” skills) or by running the script from MetaLink Note 401749.1, which does exactly that. To calculate how many hugepages you need, divide this total by the hugepage size of your platform as shown by “cat /proc/meminfo” (Linux IA64 has 256MB pages, for example). I recommend adding 1 extra page for safety.
  • Say you need 200 hugepages. Multiply that number by the page size and enter the result in
    /etc/security/limits.conf (values in KB):

oracle soft memlock 2097152
oracle hard memlock 2097152

  • Set the parameters in /etc/sysctl.conf:

vm.nr_hugepages=200
vm.hugetlb_shm_group=<group id of dba group from /etc/group>
e.g. vm.hugetlb_shm_group=201
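The arithmetic in the steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, the total shared memory size is taken as already summed from ipcs -m, and the page size is read from the Hugepagesize line of /proc/meminfo.

```python
import re


def hugepage_size_kb(meminfo_text):
    """Extract the hugepage size in KB from /proc/meminfo content."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Hugepagesize line found")
    return int(match.group(1))


def hugepage_plan(total_shm_bytes, page_kb, extra_pages=1):
    """Return (nr_hugepages, memlock_kb) for a given shared memory total.

    The page count is rounded up to whole pages, then 'extra_pages' is
    added for safety. memlock_kb is the page count times the page size.
    """
    page_bytes = page_kb * 1024
    # Integer ceiling division avoids float rounding on large sizes.
    pages = -(-total_shm_bytes // page_bytes) + extra_pages
    return pages, pages * page_kb


def config_lines(pages, memlock_kb, user="oracle"):
    """Format the vm.nr_hugepages and memlock entries as text lines."""
    return [
        "vm.nr_hugepages=%d" % pages,
        "%s soft memlock %d" % (user, memlock_kb),
        "%s hard memlock %d" % (user, memlock_kb),
    ]
```

For example, a total of 3221229568 bytes with 256MB pages (Hugepagesize: 262144 kB) rounds up to 13 pages, plus 1 extra, giving 14 pages and a memlock value of 3670016 KB.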

I am going to implement hugepages on Linux Itanium for a Real Application Clusters system. I have read posts describing different issues depending on whether the instance is started with srvctl or sqlplus, and by oracle or root. I will investigate and write more soon.