Listener Coredumps on heavy load system

By Martin | November 10th, 2008 | Category: 10g, Linux Itanium, Oracle Database, Unix | 2 comments

Recently I came across a system on which the listener crashed (core dumped) every couple of days. Shortly before each core dump, the listener log file shows errors:

29-SEP-2008 03:49:07 * (CONNECT_DATA=(SID=MDDB1)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.1)(PORT=1398)) * establish * MDDB1 * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12571: TNS:packet writer failure
TNS-12560: TNS:protocol adapter error
TNS-00530: Protocol adapter error
Linux IA64 Error: 104: Connection reset by peer
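To see how often such hand-off failures pile up before a crash, the listener log can be scanned for entries ending in return code 12518. The following is only a minimal sketch: it assumes the entry layout shown above (timestamp first, fields separated by " * ", return code last), and the function name `handoff_failures_per_hour` is made up for this illustration and is not part of any Oracle tool.

```python
import re
from collections import Counter

# Month abbreviations as they appear in the listener log timestamp.
MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

# Assumed entry layout: "DD-MON-YYYY HH:MM:SS * ... * <return code>"
ENTRY = re.compile(
    r"^(\d{2})-([A-Z]{3})-(\d{4}) (\d{2}):\d{2}:\d{2} \* .* \* (\d+)\b"
)

def handoff_failures_per_hour(lines):
    """Count log entries with return code 12518, grouped by hour."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if not match:
            continue  # continuation lines such as "TNS-12571: ..." are skipped
        day, month, year, hour, code = match.groups()
        if code == "12518" and month in MONTHS:
            counts[f"{year}-{MONTHS[month]:02d}-{day} {hour}:00"] += 1
    return counts
```

It could be fed the open log file directly, for example `handoff_failures_per_hour(open(path))`, where `path` points to the listener log of the affected listener.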

Analysing the core dump with gdb shows that the stack ends in malloc() calls, which means that the listener crashed while it was requesting memory.

gdb /oracle/ora10/bin/tnslsnr core.311
GNU gdb Red Hat Linux (6.3.0.0-1.153.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".

Reading symbols from shared object read from target memory...(no debugging symbols found)...done.
Loaded system supplied DSO at 0xa000000000000000
Core was generated by `/oracle/ora10/bin/tnslsnr listenerv -inherit'.
Program terminated with signal 11, Segmentation fault.
#0 0x20000000027ee220 in malloc_consolidate () from /lib/tls/libc.so.6.1
(gdb) bt
#0 0x20000000027ee220 in malloc_consolidate () from /lib/tls/libc.so.6.1
#1 0x20000000027f0e30 in _int_malloc () from /lib/tls/libc.so.6.1
#2 0x20000000027f4b50 in malloc () from /lib/tls/libc.so.6.1
#3 0x40000000000079f0 in nsglconcrt ()
#4 0x4000000000011a00 in nsglhc ()
#5 0x4000000000019690 in nsglhe ()
#6 0x400000000001b980 in nsglma ()
#7 0x4000000000007770 in main ()
(gdb) quit
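When several core files have accumulated, it can help to reduce each backtrace to its function names and check whether the stacks are identical. This is a rough sketch only: it assumes the `bt` output has been saved as text with one frame per line, as gdb prints it, and `frame_functions` is a hypothetical helper name.

```python
import re

# A frame line looks like "#3 0x40000000000079f0 in nsglconcrt ()";
# the address part is optional because gdb omits it for some frames.
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")

def frame_functions(bt_text):
    """Return the function names of a gdb backtrace, innermost frame first."""
    functions = []
    for line in bt_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            functions.append(match.group(2))
    return functions
```

Two core dumps caused by the same problem should then yield the same list, which is easier to compare than the raw output with its addresses.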

After I sent this stack to Oracle Support, they confirmed it to be Bug 6752308, which was closed as a duplicate of Bug 6139856. A patch is available for 10.2.0.3, and they also recommend implementing hugepages. By the way, there is an interesting article on the effect of using, or not using, hugepages here.
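Whether hugepages are configured on a Linux host at all can be read from /proc/meminfo. The sketch below assumes the usual `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields; the helper name `hugepage_status` is made up for this illustration.

```python
def hugepage_status(meminfo_text):
    """Extract the hugepage counters from the text of /proc/meminfo."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    status = {}
    for line in meminfo_text.splitlines():
        key, separator, rest = line.partition(":")
        fields = rest.split()
        # Keep only the numeric value; Hugepagesize is reported in kB.
        if separator and key in wanted and fields and fields[0].isdigit():
            status[key] = int(fields[0])
    return status
```

On the database host it would be called as `hugepage_status(open("/proc/meminfo").read())`. A `HugePages_Total` of 0 means that no hugepages are reserved.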

6139856 - TT11.1VALGRIND: FMR (FREE MEMORY READ/WRITE) IN NSEV.C

2 comments

Hugepages revisited | ora-solutions.net - Martin Decker November 30th, 2008 21:32 :
[…] while ago I wrote a post about a specific listener coredump issue which could be solved by 1) installing an oracle patch and […]
Hugepages revisited II: Be aware of kernel bugs! | ora-solutions.net - Martin Decker January 7th, 2009 13:31 :
[…] to lower system cpu utilization. I have written two blog entries regarding this topic already: Listener Coredumps on heavy load system and Hugepages […]
