Re: [Freeipmi-users] bmc-watchdog 0.7.15-2 exiting under Ubuntu 10.04


From: Robert Hardy
Subject: Re: [Freeipmi-users] bmc-watchdog 0.7.15-2 exiting under Ubuntu 10.04
Date: Mon, 31 Jan 2011 18:11:47 -0500
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7

That would be /var/log/freeipmi/bmc-watchdog.log here, and nothing is logged there at startup (or after the unexpected exit) during bootup.

I've put all sorts of debugging lines in my init script for bmc-watchdog.

I finally ended up doing this as root:
mv /usr/sbin/bmc-watchdog /usr/sbin/bmc-watchdog.real

and then putting this in /usr/sbin/bmc-watchdog:
#!/bin/bash
strace -fFv -o /tmp/bmcstrace.log -- /usr/sbin/bmc-watchdog.real "$@"

At bootup the bmc-watchdog initscript does launch a process with a new PID but it does NOT log the regular "starting bmc-watchdog daemon". It in fact logs nothing at all to /var/log/freeipmi/bmc-watchdog.log DURING BOOT UP.

The strace above captured bmc-watchdog running at bootup and the same process exiting here at the last few lines:

1584  semop(229383, {{0, 1, SEM_UNDO}}, 1) = 0
1584  nanosleep({0, 1000}, NULL)        = 0
1584  write(2, "bmc-watchdog.real: watchdog time"..., 72) = -1 EBADF (Bad file descriptor)
1584  exit_group(1)                     = ?

I've posted the entire strace here:
http://webcon.ca/~rhardy/bmcdrop/

Can you parse that and make any suggestions as to why it would exit uncleanly and only on boot up?

I'm not quite sure what is going on, but it seems to be trying to write to a bad file descriptor, getting an error and then exiting. From the strace, file descriptor 2 is in fact closed, so that error makes sense to me. The real question is why it is trying to write to FD 2.
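
For anyone following along, the EBADF failure can be reproduced in isolation. This is only an illustrative sketch, not bmc-watchdog's code: it starts a child shell with FD 2 closed and has it try to write there. The helper name closed_stderr_status is made up for this example.

```shell
#!/bin/bash
# Illustration only: reproduce "write to a closed FD 2" outside bmc-watchdog.
# closed_stderr_status is a hypothetical helper name.
closed_stderr_status() {
    local status=0
    # "2>&-" closes stderr for the child; the child then tries to use FD 2.
    bash -c 'echo "message for stderr" >&2' 2>&- || status=$?
    # Print the child's exit status; it is non-zero because FD 2 is not open.
    echo "$status"
}

closed_stderr_status
```

The child has nowhere to report its own error, so the exit status is the only visible sign of the failure. That matches what the trace shows: a failed write followed by exit_group(1) and nothing in the log.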

When I restart bmc-watchdog and it reaches the same point, it correctly writes the startup message to file descriptor 0, which is the log file that was opened earlier:

2466  write(0, "[Jan 31 18:03:23]: starting bmc-"..., 48) = 48
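
One way to double-check which file a descriptor refers to, without reading through the whole strace, is to look under /proc on Linux. A minimal sketch, assuming a Linux /proc filesystem; fd_target is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: show what a file descriptor of a running process points to (Linux /proc).
# fd_target is a hypothetical helper name; run as root to inspect other users' processes.
fd_target() {
    local pid=$1 fd=$2
    # /proc/<pid>/fd/<n> is a symlink to whatever the descriptor refers to;
    # readlink fails (non-zero status) if the descriptor is not open.
    readlink "/proc/$pid/fd/$fd"
}

# Example: inspect FDs 0-2 of the current shell.
for fd in 0 1 2; do
    echo "fd $fd -> $(fd_target $$ "$fd" || echo closed)"
done
```

Run against the daemon, for example with fd_target "$(cat /var/run/bmc-watchdog.pid)" 0, this should print the log file path if FD 0 really is the log, and fail for FD 2 if it is closed.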

I'm open to debugging suggestions too... Ideas?

Thanks for your help,
Rob

On 2011-01-28 5:37 PM, Albert Chu wrote:
Hey Robert,

That is indeed strange.  Does the bmc-watchdog log say anything? (I
can't remember the exact location, but I think it's /var/log/freeipmi/
something).

Al

On Thu, 2011-01-27 at 13:14 -0800, Robert Hardy wrote:
I'm running bmc-watchdog 0.7.15-2 under a current 64-bit Ubuntu 10.04 on
several fairly new, unloaded Supermicro servers.

On only one of four servers (always the same one), the bmc-watchdog
process quietly exits shortly after startup, leaving the system set up for a
hard reset shortly after bootup.

The options and builds are identical on all of the servers. These are my
options: OPTIONS="-d -u 2 -p 0 -a 1 -F -P -L -S -O -i 300 -e 60"

Through debugging I've confirmed on boot up:

- The init script gets run

- It launches bmc-watchdog and saves a new PID correctly in
/var/run/bmc-watchdog.pid.

- Checking for a bmc-watchdog process in rc.local shows it isn't running and
    the timer is counting down.

- There is no shutdown message logged when the process disappears during bootup.

- There are no messages suggesting the process was killed
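
The rc.local check described in the list above could be sketched roughly like this. It assumes the pid file holds the daemon's PID on its first line; pidfile_alive is a hypothetical helper name, not part of FreeIPMI.

```shell
#!/bin/bash
# Sketch: check whether the PID recorded in a pid file is still alive.
# pidfile_alive is a hypothetical helper name, not part of FreeIPMI.
pidfile_alive() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1      # no pid file: treat as not running
    pid=$(head -n 1 "$pidfile")
    [ -n "$pid" ] || return 1          # empty pid file
    # kill -0 sends no signal; it only tests that the process exists
    # (it needs permission to signal the process, so run as root).
    kill -0 "$pid" 2>/dev/null
}

if pidfile_alive /var/run/bmc-watchdog.pid; then
    echo "bmc-watchdog is running"
else
    echo "bmc-watchdog is NOT running"
fi
```

A stale pid file with no live process behind it is the situation described here: the PID was saved, but the process has already gone away.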

On shutdown the init script gets as far as removing
/var/run/bmc-watchdog.pid and seems to work fine.

If I stuff this in rc.local the bmc-watchdog starts up properly and never
seems to die again until the next reboot:
/usr/sbin/service bmc-watchdog stop
/usr/sbin/service bmc-watchdog start

All in all this is very weird behaviour. Is it possible a newer version of
bmc-watchdog would address this? i.e. is this a known bug?

Any other ideas why this is happening (or how I can debug further)?

Regards,
Rob

_______________________________________________
Freeipmi-users mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/freeipmi-users



