help-cfengine

RE: Coredumping problem on Mandrake 10/2.6 kernel


From: Brian Thomas
Subject: RE: Coredumping problem on Mandrake 10/2.6 kernel
Date: Mon, 8 Nov 2004 08:59:14 -0800

I'm sure you have to answer this question all the time, and I apologize
for my ignorance, but... Where do I go from here? :)

I mean, I understand that there can be (and were!) multiple libdb's on a
system, and how the libraries from one and the headers from another can
cause problems. But what I don't understand is how to get around it. Nor
do I understand why I have the specific problem I do: cfservd runs
just fine for hours, but then decides to crash. What in the
header mismatch might result in that?

For what it's worth, I've made sure there's only one version of libdb
on the system (4.x; I removed the 3.x), recompiled everything, and the
problems still persist. In fact, the version I compiled against my
own 3.x build on Friday ran OK for only about 8 hours before crashing
sometime during the night, so my theory about a libdb4 weirdism doesn't
seem to hold water.

I believe this is more than just a libdb version collision problem,
although I wholeheartedly believe libdb is still at fault in some way. I
just don't know how to A) troubleshoot or B) work around it. Short of
removing all libdb packages from the system entirely, how can I make
sure I get a standalone cfengine compile that won't go hunting through
/usr/include for header files when I've specified a particular
--with-berkeleydb path, so that a mismatch can be ruled out entirely
as the culprit?
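
One way to at least see which Berkeley DB headers are lying around is to scan the candidate include directories for db.h and read the version macros each one declares. The sketch below assumes the headers define DB_VERSION_MAJOR, DB_VERSION_MINOR and DB_VERSION_PATCH, as Berkeley DB's db.h does; the function names header_version and scan_include_dirs are made up for illustration. It only reports what exists on disk. It does not show which header the cfengine build actually picked up.

```python
import re
from pathlib import Path

# Matches lines such as "#define DB_VERSION_MAJOR 4" (tabs or spaces).
_VERSION_RE = re.compile(
    r"^\s*#\s*define\s+DB_VERSION_(MAJOR|MINOR|PATCH)\s+(\d+)", re.MULTILINE
)


def header_version(text):
    """Return (major, minor, patch) parsed from db.h text, or None if incomplete."""
    found = {name: int(value) for name, value in _VERSION_RE.findall(text)}
    try:
        return (found["MAJOR"], found["MINOR"], found["PATCH"])
    except KeyError:
        return None


def scan_include_dirs(dirs):
    """Map every db.h found under the given directories to its version tuple."""
    results = {}
    for d in dirs:
        root = Path(d)
        if not root.is_dir():
            continue  # silently skip directories that do not exist
        for header in sorted(root.rglob("db.h")):
            results[str(header)] = header_version(header.read_text(errors="replace"))
    return results
```

Calling scan_include_dirs(["/usr/include", "/var/tmp/db-4.2.52"]) would list every db.h under those trees with its declared version. More than one distinct version in the result means a mismatch is at least possible.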

Again I apologize for what I know this list gets pinged about regularly,
but I've yet to find a solid answer on this, just a lot of folks
stumbling around trying to figure out what to do.

Brian 

-----Original Message-----
From: Mark Burgess [mailto:Mark.Burgess@iu.hio.no] 
Sent: Friday, November 05, 2004 9:10 PM
To: Brian Thomas
Cc: help-cfengine@gnu.org
Subject: Re: Coredumping problem on Mandrake 10/2.6 kernel


A likely explanation is that you have multiple versions of Berkeley
DB on the system and you are mixing old header files with newer
libraries. This will cause core dumps, just as mixing
regex libraries will...

Mark

On Fri, Nov 05, 2004 at 03:11:08PM -0800, Brian Thomas wrote:
> Well... Progress, at least on the cfagent front:
> 
> I can actually run cfagent without a problem now, statically linked
> against a libdb-3.3.11 compile. I don't know if cfservd will stay up
> until the next run of the clients today, since that's the only time
> there's enough load, but whatever the problem, it seems (unsurprisingly)
> linked to libdb. But it is NOT an issue with dynamic loading weirdness,
> since as far as I can tell, it's all static.
> 
> Brian
> 
> -----Original Message-----
> From: Brian Thomas 
> Sent: Friday, November 05, 2004 2:46 PM
> To: help-cfengine@gnu.org
> Subject: Coredumping problem on Mandrake 10/2.6 kernel
> 
> So I'd thought originally I'd solved my coredump problems
> on Mandrake 10.x, but my excitement was premature. Furthermore, in
> testing I realized my locally-compiled version is not just having a
> problem with cfservd; it looks like cfagent is crashing as well.
> 
> I originally was, and still am, having problems with 'cfservd'
> coredumping after running for a while, usually under heavy-ish load. At
> my half-hour intervals it would crap out, and appeared to be related
> to libdb.
> 
> So to solve this, I set out to compile statically against libdb. I'll
> skip the intervening frustration; suffice to say I decided during the
> ordeal that compiling my own libdb and openssl static libraries and
> building against them was probably better anyway than using the system
> static libdb.a. No problem with the compile process itself once I did
> that, and I can verify (with ldd) that I am relying on neither a
> dynamic libdb nor a dynamic libcrypto.
> 
> The problem is, I have twice the problems! Why? Because now cfagent is
> coredumping, and much more spectacularly (Read: Immediately) than
> cfservd, although cfservd is still crashing under load.
> 
> Included below is lots of relevant, maybe too much, information. I'm
> not sure what to do at this point; originally I thought this was an
> issue with the tls (/lib/tls) versions of the libraries, and tried
> compiling/executing against each individually, with the same results
> either way.
> 
> So first, the software versions. Bear in mind I have the exact same
> problems when compiling against the Mandrake-installed versions of
> openssl and berkeleydb:
> 
> Openssl 0.9.7e
> BerkeleyDB 4.2.52
> Cfengine 2.1.11
> 
> Next, configure line (After this it's just a 'make'):
> 
> ./configure --with-berkeleydb=/var/tmp/db-4.2.52
> --with-openssl=/var/tmp/openssl-0.9.7e
> 
> Next, OS config:
> 
> # uname -a
> Linux amd-usa 2.6.3-7mdk-p3-smp-64GB #1 SMP Wed Mar 17 15:34:39 CET 2004
> i686 unknown unknown GNU/Linux
> # cat /etc/issue
> Mandrake Linux release 10.0 (Official) for i586
> Kernel 2.6.3-7mdk-p3-smp-64GB on a 4-processor i686 / \l
> 
> Next, gdb output. This first one is from the cfservd crash:
> 
> # gdb -c ./core.32076 cfservd
> GNU gdb 6.0-2mdk (Mandrake Linux)
> [warranty deletia]
> This GDB was configured as "i586-mandrake-linux-gnu"...Using host
> libthread_db library "/lib/libthread_db.so.1".
> 
> Core was generated by `./cfservd -m'.
> Program terminated with signal 11, Segmentation fault.
> 
> warning: current_sos: Can't read pathname for load map: Input/output
> error
> 
> Reading symbols from /lib/libnss_nis.so.2...done.
> Loaded symbols for /lib/libnss_nis.so.2
> Reading symbols from /lib/tls/libpthread.so.0...done.
> Loaded symbols for /lib/tls/libpthread.so.0
> Reading symbols from /lib/tls/libm.so.6...done.
> Loaded symbols for /lib/tls/libm.so.6
> Reading symbols from /lib/tls/libc.so.6...done.
> Loaded symbols for /lib/tls/libc.so.6
> Reading symbols from /lib/libnsl.so.1...done.
> Loaded symbols for /lib/libnsl.so.1
> Reading symbols from /lib/libnss_files.so.2...done.
> Loaded symbols for /lib/libnss_files.so.2
> Reading symbols from /lib/ld-linux.so.2...done.
> Loaded symbols for /lib/ld-linux.so.2
> Reading symbols from /lib/libnss_nisplus.so.2...done.
> Loaded symbols for /lib/libnss_nisplus.so.2
> Reading symbols from /lib/libnss_dns.so.2...done.
> Loaded symbols for /lib/libnss_dns.so.2
> Reading symbols from /lib/libresolv.so.2...done.
> Loaded symbols for /lib/libresolv.so.2
> #0  0x080b0a6f in __bam_pinsert ()
> (gdb) backtrace
> #0  0x080b0a6f in __bam_pinsert ()
> #1  0x080af683 in __bam_page ()
> #2  0x080af070 in __bam_split ()
> #3  0x080f3bf9 in __bam_c_put ()
> #4  0x080dc06b in __db_c_put ()
> #5  0x080d588f in __db_put ()
> #6  0x080e250e in __db_put_pp ()
> #7  0x08063d97 in LastSeen (hostname=0x40427900 "hostfoo.shopping.com",
> role=cf_accept) at ip.c:443
> #8  0x0804e130 in VerifyConnection (conn=0x8254e68, buf=0x4042e966
> "10.20.3.50 hostfoo.shopping.com root 0")
>     at cfservd.c:1777
> #9  0x0804d06c in BusyWithConnection (conn=0x8254e68) at cfservd.c:1234
> #10 0x0804cbc1 in HandleConnection (conn=0x8254e68) at cfservd.c:1133
> #11 0x4002c7d3 in start_thread () from /lib/tls/libpthread.so.0
> #12 0x40144b4a in clone () from /lib/tls/libc.so.6
> 
> Next is the output from the cfagent crash. Note, these two crashes DO
> NOT happen at the same time! Usually I can crank up a cfservd, and as
> long as there's no significant load it will run fine, while cfagent
> will crash every time. Similarly, cfservd will always eventually crash,
> whether or not I run the locally-compiled cfagent against it. I am
> still guessing the two crashes have similar root causes, but they do
> not trigger each other!
> 
> # gdb -c ./core.32098 cfagent
> GNU gdb 6.0-2mdk (Mandrake Linux)
> [warranty deletia]
> This GDB was configured as "i586-mandrake-linux-gnu"...Using host
> libthread_db library "/lib/libthread_db.so.1".
> 
> Core was generated by `./cfagent --debug'.
> Program terminated with signal 11, Segmentation fault.
> 
> warning: current_sos: Can't read pathname for load map: Input/output
> error
> 
> Reading symbols from /lib/libnss_nis.so.2...done.
> Loaded symbols for /lib/libnss_nis.so.2
> Reading symbols from /lib/libpthread.so.0...done.
> Loaded symbols for /lib/libpthread.so.0
> Reading symbols from /lib/libm.so.6...done.
> Loaded symbols for /lib/libm.so.6
> Reading symbols from /lib/libc.so.6...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /lib/libnsl.so.1...done.
> Loaded symbols for /lib/libnsl.so.1
> Reading symbols from /lib/libnss_files.so.2...done.
> Loaded symbols for /lib/libnss_files.so.2
> Reading symbols from /lib/ld-linux.so.2...done.
> Loaded symbols for /lib/ld-linux.so.2
> Reading symbols from /lib/libnss_nisplus.so.2...done.
> Loaded symbols for /lib/libnss_nisplus.so.2
> Reading symbols from /lib/libnss_dns.so.2...done.
> Loaded symbols for /lib/libnss_dns.so.2
> Reading symbols from /lib/libresolv.so.2...done.
> Loaded symbols for /lib/libresolv.so.2
> #0  0x40115b47 in memcpy () from /lib/libc.so.6
> (gdb) backtrace
> #0  0x40115b47 in memcpy () from /lib/libc.so.6
> #1  0x080cf5ec in __bam_copy ()
> #2  0x080cf01e in __bam_psplit ()
> #3  0x080cd86c in __bam_page ()
> #4  0x080cd280 in __bam_split ()
> #5  0x08111e09 in __bam_c_put ()
> #6  0x080fa27b in __db_c_put ()
> #7  0x080f3a9f in __db_put ()
> #8  0x0810071e in __db_put_pp ()
> #9  0x0805ba27 in LastSeen (hostname=0xbfff4650
> "serverfoo.shopping.com", role=cf_connect) at ip.c:443
> #10 0x0805b265 in RemoteConnect (host=0xbfff4650
> "serverfoo.shopping.com", forceipv4=110 'n') at ip.c:192
> #11 0x080590c7 in OpenServerConnection (ip=0x8290c40) at client.c:57
> #12 0x08054308 in MakeImages () at do.c:2435
> #13 0x0804d70e in DoTree (passes=1, info=0x81cdf00 "Update") at
> cfagent.c:1274
> #14 0x0804b435 in main (argc=2, argv=0xbfffe7a4) at cfagent.c:107
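
The two backtraces above are long, but what they have in common can be pulled out mechanically: find the first frame that carries a source location, i.e. the last piece of cfengine code that ran before control went into libdb. The sketch below assumes gdb's frame layout as shown in the backtraces; parse_frames and first_source_frame are hypothetical helper names, and the sample text is an excerpt of the frames quoted above. It says nothing about the root cause, only where the two crashes converge.

```python
import re

# One gdb frame: "#N [0xADDR in ]function (args)[ at file:line]"
_FRAME_RE = re.compile(
    r"#(\d+) (?:0x[0-9a-fA-F]+ in )?(\w+) \((.*?)\)(?: at (\S+:\d+))?"
)


def parse_frames(backtrace):
    """Return (index, function, source_location_or_None) for each frame."""
    # Collapse all whitespace so frames wrapped across lines become one string.
    joined = " ".join(backtrace.split())
    return [
        (int(m.group(1)), m.group(2), m.group(4))
        for m in _FRAME_RE.finditer(joined)
    ]


def first_source_frame(frames):
    """Return the innermost frame that has file:line information, if any."""
    for frame in frames:
        if frame[2] is not None:
            return frame
    return None


# Excerpts copied from the cfservd and cfagent backtraces in this message.
CFSERVD_BT = """
#5  0x080d588f in __db_put ()
#6  0x080e250e in __db_put_pp ()
#7  0x08063d97 in LastSeen (hostname=0x40427900 "hostfoo.shopping.com",
role=cf_accept) at ip.c:443
#8  0x0804e130 in VerifyConnection (conn=0x8254e68, buf=0x4042e966
"10.20.3.50 hostfoo.shopping.com root 0")
    at cfservd.c:1777
"""

CFAGENT_BT = """
#7  0x080f3a9f in __db_put ()
#8  0x0810071e in __db_put_pp ()
#9  0x0805ba27 in LastSeen (hostname=0xbfff4650
"serverfoo.shopping.com", role=cf_connect) at ip.c:443
#10 0x0805b265 in RemoteConnect (host=0xbfff4650
"serverfoo.shopping.com", forceipv4=110 'n') at ip.c:192
"""

server_frame = first_source_frame(parse_frames(CFSERVD_BT))
agent_frame = first_source_frame(parse_frames(CFAGENT_BT))
print(server_frame[1:], agent_frame[1:])
```

For these excerpts both calls report LastSeen at ip.c:443, which matches the observation that cfservd and cfagent die on the same database write path even though they crash at different times.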
> 
> 
> 
> 
> _______________________________________________
> Help-cfengine mailing list
> Help-cfengine@gnu.org
> http://lists.gnu.org/mailman/listinfo/help-cfengine

-- 


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work: +47 22453272            Email:  Mark.Burgess@iu.hio.no
Fax : +47 22453205            WWW  :  http://www.iu.hio.no/~mark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~






